////
Copyright 2011-2016 Beman Dawes
Distributed under the Boost Software License, Version 1.0.
(http://www.boost.org/LICENSE_1_0.txt)
////
[#buffers]
# Endian Buffer Types
:idprefix: buffers_
## Introduction
The internal byte order of arithmetic types is traditionally called
*endianness*. See http://en.wikipedia.org/wiki/Endian[Wikipedia] for a full
exploration of *endianness*, including definitions of *big endian* and *little
endian*.
Header `boost/endian/buffers.hpp` provides `endian_buffer`, a portable endian
integer binary buffer class template with control over byte order, value type,
size, and alignment independent of the platform's native endianness. Typedefs
provide easy-to-use names for common configurations.
Use cases primarily involve data portability, either via files or network
connections, but these byte-holders may also be used to reduce memory use, file
size, or network activity since they provide binary numeric sizes not otherwise
available.
Class `endian_buffer` is aimed at users who wish explicit control over when
endianness conversions occur. It also serves as the base class for the
<<arithmetic,endian_arithmetic>> class template, which is aimed at users who
wish fully automatic endianness conversion and direct support for all normal
arithmetic operations.
## Example
The `example/endian_example.cpp` program writes a binary file containing
four-byte, big-endian and little-endian integers:
```
#include <iostream>
#include <cstdio>
#include <boost/endian/buffers.hpp>  // see Synopsis below
#include <boost/static_assert.hpp>

using namespace boost::endian;

namespace
{
  // This is an extract from a very widely used GIS file format.
  // Why the designer decided to mix big and little endians in
  // the same file is not known. But this is a real-world format
  // and users wishing to write low level code manipulating these
  // files have to deal with the mixed endianness.

  struct header
  {
    big_int32_buf_t     file_code;
    big_int32_buf_t     file_length;
    little_int32_buf_t  version;
    little_int32_buf_t  shape_type;
  };

  const char* filename = "test.dat";
}

int main(int, char* [])
{
  header h;

  BOOST_STATIC_ASSERT(sizeof(h) == 16U);  // reality check

  h.file_code   = 0x01020304;
  h.file_length = sizeof(header);
  h.version     = 1;
  h.shape_type  = 0x01020304;

  // Low-level I/O such as POSIX read/write or <cstdio>
  // fread/fwrite is sometimes used for binary file operations
  // when ultimate efficiency is important. Such I/O is often
  // performed in some C++ wrapper class, but to drive home the
  // point that endian integers are often used in fairly
  // low-level code that does bulk I/O operations, <cstdio>
  // fopen/fwrite is used for I/O in this example.

  std::FILE* fi = std::fopen(filename, "wb");  // MUST BE BINARY

  if (!fi)
  {
    std::cout << "could not open " << filename << '\n';
    return 1;
  }

  if (std::fwrite(&h, sizeof(header), 1, fi) != 1)
  {
    std::cout << "write failure for " << filename << '\n';
    return 1;
  }

  std::fclose(fi);

  std::cout << "created file " << filename << '\n';

  return 0;
}
```
After compiling and executing `example/endian_example.cpp`, a hex dump of
`test.dat` shows:
```
01020304 00000010 01000000 04030201
```
Notice that the first two 32-bit integers are big endian while the second two
are little endian, even though the machine this was compiled and run on was
little endian.
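To make the byte layout in the dump concrete, the following sketch builds the same 16 bytes by hand using shifts, without Boost.Endian. The helpers `put_big32`, `put_little32`, and `make_header_bytes` are hypothetical names introduced only for this illustration; they are not part of the library and do not show how the library is implemented.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper (not part of Boost.Endian):
// write v most-significant byte first.
inline void put_big32(unsigned char* p, std::uint32_t v)
{
    p[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
    p[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
    p[2] = static_cast<unsigned char>((v >> 8) & 0xFF);
    p[3] = static_cast<unsigned char>(v & 0xFF);
}

// Hypothetical helper (not part of Boost.Endian):
// write v least-significant byte first.
inline void put_little32(unsigned char* p, std::uint32_t v)
{
    p[0] = static_cast<unsigned char>(v & 0xFF);
    p[1] = static_cast<unsigned char>((v >> 8) & 0xFF);
    p[2] = static_cast<unsigned char>((v >> 16) & 0xFF);
    p[3] = static_cast<unsigned char>((v >> 24) & 0xFF);
}

// Produce the same bytes as the header in the example above.
inline std::array<unsigned char, 16> make_header_bytes()
{
    std::array<unsigned char, 16> out{};
    put_big32(out.data() + 0, 0x01020304);      // file_code
    put_big32(out.data() + 4, 16);              // file_length (size of the header)
    put_little32(out.data() + 8, 1);            // version
    put_little32(out.data() + 12, 0x01020304);  // shape_type
    return out;
}
```

Because the shifts operate on values rather than on memory, the result does not depend on the endianness of the machine running the code, which is the same property the buffer types provide.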
## Limitations
Requires `<climits>`, `CHAR_BIT == 8`. If `CHAR_BIT` is some other value,
compilation will result in an `#error`. This restriction is in place because the
design, implementation, testing, and documentation have only considered issues
related to 8-bit bytes, and there have been no real-world use cases presented
for other sizes.
In {cpp}03, `endian_buffer` does not meet the requirements for POD types because
it has constructors and a private data member. This means that
common use cases are relying on unspecified behavior in that the {cpp} Standard
does not guarantee memory layout for non-POD types. This has not been a problem
in practice since all known {cpp} compilers lay out memory as if `endian_buffer` were
a POD type. In {cpp}11, it is possible to specify the default constructor as
trivial, and private data members and base classes no longer disqualify a type
from being a POD type. Thus under {cpp}11, `endian_buffer` will no longer be
relying on unspecified behavior.
## Feature set
* Big endian | little endian | native endian byte ordering
* Signed | unsigned
* Unaligned | aligned
* 1-8 byte (unaligned) | 1, 2, 4, 8 byte (aligned)
* Choice of value type
## Enums and typedefs
Two scoped enums are provided:
```
enum class order { big, little, native };
enum class align { no, yes };
```
One class template is provided:
```
template <order Order, typename T, std::size_t Nbits,
align Align = align::no>
class endian_buffer;
```
Typedefs, such as `big_int32_buf_t`, provide convenient naming conventions for
common use cases:
[%header,cols=5*]
|===
|Name |Alignment |Endianness |Sign |Sizes in bits (N)
|`big_intN_buf_t` |no |big |signed |8,16,24,32,40,48,56,64
|`big_uintN_buf_t` |no |big |unsigned |8,16,24,32,40,48,56,64
|`little_intN_buf_t` |no |little |signed |8,16,24,32,40,48,56,64
|`little_uintN_buf_t` |no |little |unsigned |8,16,24,32,40,48,56,64
|`native_intN_buf_t` |no |native |signed |8,16,24,32,40,48,56,64
|`native_uintN_buf_t` |no |native |unsigned |8,16,24,32,40,48,56,64
|`big_intN_buf_at` |yes |big |signed |8,16,32,64
|`big_uintN_buf_at` |yes |big |unsigned |8,16,32,64
|`little_intN_buf_at` |yes |little |signed |8,16,32,64
|`little_uintN_buf_at` |yes |little |unsigned |8,16,32,64
|===
The unaligned types do not cause compilers to insert padding bytes in classes
and structs. This is an important characteristic that can be exploited to
minimize wasted space in memory, files, and network transmissions.
CAUTION: Code that uses aligned types is possibly non-portable because alignment
requirements vary between hardware architectures and because alignment may be
affected by compiler switches or pragmas. For example, alignment of a 64-bit
integer may be to a 32-bit boundary on a 32-bit machine and to a 64-bit boundary
on a 64-bit machine. Furthermore, aligned types are only available on
architectures with 8, 16, 32, and 64-bit integer types.
TIP: Prefer unaligned buffer types.
TIP: Protect yourself against alignment ills. For example:
[none]
{blank}::
+
```
static_assert(sizeof(containing_struct) == 12, "sizeof(containing_struct) is wrong");
```
NOTE: One-byte big and little buffer types have identical layout on all
platforms, so they never actually reverse endianness. They are provided to
enable generic code, and to improve code readability and searchability.
## Class template `endian_buffer`
An `endian_buffer` is a byte-holder for arithmetic types with
user-specified endianness, value type, size, and alignment.
### Synopsis
```
namespace boost
{
namespace endian
{
// C++11 features emulated if not available
enum class align { no, yes };
template <order Order, class T, std::size_t Nbits,
align Align = align::no>
class endian_buffer
{
public:
typedef T value_type;
endian_buffer() noexcept = default;
explicit endian_buffer(T v) noexcept;
endian_buffer& operator=(T v) noexcept;
value_type value() const noexcept;
unsigned char* data() noexcept;
unsigned char const* data() const noexcept;
private:
unsigned char value_[Nbits / CHAR_BIT]; // exposition only
};
// stream inserter
template <class charT, class traits, order Order, class T,
std::size_t n_bits, align Align>
std::basic_ostream<charT, traits>&
operator<<(std::basic_ostream<charT, traits>& os,
const endian_buffer<Order, T, n_bits, Align>& x);
// stream extractor
template <class charT, class traits, order Order, class T,
std::size_t n_bits, align Align>
std::basic_istream<charT, traits>&
operator>>(std::basic_istream<charT, traits>& is,
endian_buffer<Order, T, n_bits, Align>& x);
// typedefs
// unaligned big endian signed integer buffers
typedef endian_buffer<order::big, int_least8_t, 8> big_int8_buf_t;
typedef endian_buffer<order::big, int_least16_t, 16> big_int16_buf_t;
typedef endian_buffer<order::big, int_least32_t, 24> big_int24_buf_t;
typedef endian_buffer<order::big, int_least32_t, 32> big_int32_buf_t;
typedef endian_buffer<order::big, int_least64_t, 40> big_int40_buf_t;
typedef endian_buffer<order::big, int_least64_t, 48> big_int48_buf_t;
typedef endian_buffer<order::big, int_least64_t, 56> big_int56_buf_t;
typedef endian_buffer<order::big, int_least64_t, 64> big_int64_buf_t;
// unaligned big endian unsigned integer buffers
typedef endian_buffer<order::big, uint_least8_t, 8> big_uint8_buf_t;
typedef endian_buffer<order::big, uint_least16_t, 16> big_uint16_buf_t;
typedef endian_buffer<order::big, uint_least32_t, 24> big_uint24_buf_t;
typedef endian_buffer<order::big, uint_least32_t, 32> big_uint32_buf_t;
typedef endian_buffer<order::big, uint_least64_t, 40> big_uint40_buf_t;
typedef endian_buffer<order::big, uint_least64_t, 48> big_uint48_buf_t;
typedef endian_buffer<order::big, uint_least64_t, 56> big_uint56_buf_t;
typedef endian_buffer<order::big, uint_least64_t, 64> big_uint64_buf_t;
// unaligned big endian floating point buffers
typedef endian_buffer<order::big, float, 32> big_float32_buf_t;
typedef endian_buffer<order::big, double, 64> big_float64_buf_t;
// unaligned little endian signed integer buffers
typedef endian_buffer<order::little, int_least8_t, 8> little_int8_buf_t;
typedef endian_buffer<order::little, int_least16_t, 16> little_int16_buf_t;
typedef endian_buffer<order::little, int_least32_t, 24> little_int24_buf_t;
typedef endian_buffer<order::little, int_least32_t, 32> little_int32_buf_t;
typedef endian_buffer<order::little, int_least64_t, 40> little_int40_buf_t;
typedef endian_buffer<order::little, int_least64_t, 48> little_int48_buf_t;
typedef endian_buffer<order::little, int_least64_t, 56> little_int56_buf_t;
typedef endian_buffer<order::little, int_least64_t, 64> little_int64_buf_t;
// unaligned little endian unsigned integer buffers
typedef endian_buffer<order::little, uint_least8_t, 8> little_uint8_buf_t;
typedef endian_buffer<order::little, uint_least16_t, 16> little_uint16_buf_t;
typedef endian_buffer<order::little, uint_least32_t, 24> little_uint24_buf_t;
typedef endian_buffer<order::little, uint_least32_t, 32> little_uint32_buf_t;
typedef endian_buffer<order::little, uint_least64_t, 40> little_uint40_buf_t;
typedef endian_buffer<order::little, uint_least64_t, 48> little_uint48_buf_t;
typedef endian_buffer<order::little, uint_least64_t, 56> little_uint56_buf_t;
typedef endian_buffer<order::little, uint_least64_t, 64> little_uint64_buf_t;
// unaligned little endian floating point buffers
typedef endian_buffer<order::little, float, 32> little_float32_buf_t;
typedef endian_buffer<order::little, double, 64> little_float64_buf_t;
// unaligned native endian signed integer buffers
typedef endian_buffer<order::native, int_least8_t, 8> native_int8_buf_t;
typedef endian_buffer<order::native, int_least16_t, 16> native_int16_buf_t;
typedef endian_buffer<order::native, int_least32_t, 24> native_int24_buf_t;
typedef endian_buffer<order::native, int_least32_t, 32> native_int32_buf_t;
typedef endian_buffer<order::native, int_least64_t, 40> native_int40_buf_t;
typedef endian_buffer<order::native, int_least64_t, 48> native_int48_buf_t;
typedef endian_buffer<order::native, int_least64_t, 56> native_int56_buf_t;
typedef endian_buffer<order::native, int_least64_t, 64> native_int64_buf_t;
// unaligned native endian unsigned integer buffers
typedef endian_buffer<order::native, uint_least8_t, 8> native_uint8_buf_t;
typedef endian_buffer<order::native, uint_least16_t, 16> native_uint16_buf_t;
typedef endian_buffer<order::native, uint_least32_t, 24> native_uint24_buf_t;
typedef endian_buffer<order::native, uint_least32_t, 32> native_uint32_buf_t;
typedef endian_buffer<order::native, uint_least64_t, 40> native_uint40_buf_t;
typedef endian_buffer<order::native, uint_least64_t, 48> native_uint48_buf_t;
typedef endian_buffer<order::native, uint_least64_t, 56> native_uint56_buf_t;
typedef endian_buffer<order::native, uint_least64_t, 64> native_uint64_buf_t;
// unaligned native endian floating point buffers
typedef endian_buffer<order::native, float, 32> native_float32_buf_t;
typedef endian_buffer<order::native, double, 64> native_float64_buf_t;
// aligned big endian signed integer buffers
typedef endian_buffer<order::big, int8_t, 8, align::yes> big_int8_buf_at;
typedef endian_buffer<order::big, int16_t, 16, align::yes> big_int16_buf_at;
typedef endian_buffer<order::big, int32_t, 32, align::yes> big_int32_buf_at;
typedef endian_buffer<order::big, int64_t, 64, align::yes> big_int64_buf_at;
// aligned big endian unsigned integer buffers
typedef endian_buffer<order::big, uint8_t, 8, align::yes> big_uint8_buf_at;
typedef endian_buffer<order::big, uint16_t, 16, align::yes> big_uint16_buf_at;
typedef endian_buffer<order::big, uint32_t, 32, align::yes> big_uint32_buf_at;
typedef endian_buffer<order::big, uint64_t, 64, align::yes> big_uint64_buf_at;
// aligned big endian floating point buffers
typedef endian_buffer<order::big, float, 32, align::yes> big_float32_buf_at;
typedef endian_buffer<order::big, double, 64, align::yes> big_float64_buf_at;
// aligned little endian signed integer buffers
typedef endian_buffer<order::little, int8_t, 8, align::yes> little_int8_buf_at;
typedef endian_buffer<order::little, int16_t, 16, align::yes> little_int16_buf_at;
typedef endian_buffer<order::little, int32_t, 32, align::yes> little_int32_buf_at;
typedef endian_buffer<order::little, int64_t, 64, align::yes> little_int64_buf_at;
// aligned little endian unsigned integer buffers
typedef endian_buffer<order::little, uint8_t, 8, align::yes> little_uint8_buf_at;
typedef endian_buffer<order::little, uint16_t, 16, align::yes> little_uint16_buf_at;
typedef endian_buffer<order::little, uint32_t, 32, align::yes> little_uint32_buf_at;
typedef endian_buffer<order::little, uint64_t, 64, align::yes> little_uint64_buf_at;
// aligned little endian floating point buffers
typedef endian_buffer<order::little, float, 32, align::yes> little_float32_buf_at;
typedef endian_buffer<order::little, double, 64, align::yes> little_float64_buf_at;
// aligned native endian typedefs are not provided because
// <cstdint> types are superior for this use case
} // namespace endian
} // namespace boost
```
The expository data member `value_` stores the current value of the
`endian_buffer` object as a sequence of bytes ordered as specified by the
`Order` template parameter. The `CHAR_BIT` macro is defined in `<climits>`.
The only supported value of `CHAR_BIT` is 8.
The valid values of `Nbits` are as follows:
* When `sizeof(T)` is 1, `Nbits` shall be 8;
* When `sizeof(T)` is 2, `Nbits` shall be 16;
* When `sizeof(T)` is 4, `Nbits` shall be 24 or 32;
* When `sizeof(T)` is 8, `Nbits` shall be 40, 48, 56, or 64.
Other values of `sizeof(T)` are not supported.
When `Nbits` is equal to `sizeof(T)*8`, `T` must be a trivially copyable type
(such as `float`) that is assumed to have the same endianness as `uintNbits_t`.
When `Nbits` is less than `sizeof(T)*8`, `T` must be either a standard integral
type ({cpp}std, [basic.fundamental]) or an `enum`.
### Members
```
endian_buffer() noexcept = default;
```
[none]
* {blank}
+
Effects:: Constructs an uninitialized object.
```
explicit endian_buffer(T v) noexcept;
```
[none]
* {blank}
+
Effects:: `endian_store<T, Nbits/8, Order>( value_, v )`.
```
endian_buffer& operator=(T v) noexcept;
```
[none]
* {blank}
+
Effects:: `endian_store<T, Nbits/8, Order>( value_, v )`.
Returns:: `*this`.
```
value_type value() const noexcept;
```
[none]
* {blank}
+
Returns:: `endian_load<T, Nbits/8, Order>( value_ )`.
```
unsigned char* data() noexcept;
```
```
unsigned char const* data() const noexcept;
```
[none]
* {blank}
+
Returns::
A pointer to the first byte of `value_`.
### Non-member functions
```
template <class charT, class traits, order Order, class T,
std::size_t n_bits, align Align>
std::basic_ostream<charT, traits>& operator<<(std::basic_ostream<charT, traits>& os,
const endian_buffer<Order, T, n_bits, Align>& x);
```
[none]
* {blank}
+
Returns:: `os << x.value()`.
```
template <class charT, class traits, order Order, class T,
std::size_t n_bits, align Align>
std::basic_istream<charT, traits>& operator>>(std::basic_istream<charT, traits>& is,
endian_buffer<Order, T, n_bits, Align>& x);
```
[none]
* {blank}
+
Effects:: As if:
+
```
T i;
if (is >> i)
x = i;
```
Returns:: `is`.
## FAQ
See the <<overview_faq,Overview FAQ>> for a library-wide FAQ.
Why not just use Boost.Serialization?::
Serialization involves a conversion for every object involved in I/O. Endian
integers require no conversion or copying. They are already in the desired
format for binary I/O. Thus they can be read or written in bulk.
Are endian types PODs?::
Yes for {cpp}11. No for {cpp}03, although several
<<buffers_compilation,macros>> are available to force PODness in all cases.
What are the implications of endian integer types not being PODs with {cpp}03 compilers?::
They can't be used in unions. Also, compilers aren't required to align or lay
out storage in portable ways, although this potential problem hasn't prevented
use of Boost.Endian with real compilers.
What good is native endianness?::
It provides alignment and size guarantees not available from the built-in
types. It eases generic programming.
Why bother with the aligned endian types?::
Aligned integer operations may be faster (as much as 10 to 20 times faster) if
the endianness and alignment of the type matches the endianness and alignment
requirements of the machine. The code, however, is likely to be somewhat less
portable than with the unaligned types.
## Design considerations for Boost.Endian buffers
* Must be suitable for I/O - in other words, must be memcpyable.
* Must provide exactly the size and internal byte ordering specified.
* Must work correctly when the internal integer representation has more bits
than the sum of the bits in the external byte representation. Sign extension
must work correctly when the internal integer representation type has more
bits than the sum of the bits in the external bytes. For example, using
a 64-bit integer internally to represent 40-bit (5 byte) numbers must work for
both positive and negative values.
* Must work correctly (including using the same defined external
representation) regardless of whether a compiler treats char as signed or
unsigned.
* Unaligned types must not cause compilers to insert padding bytes.
* The implementation should supply optimizations with great care. Experience
has shown that optimizations of endian integers often become pessimizations
when changing machines or compilers. Pessimizations can also happen when
changing compiler switches, compiler versions, or CPU models of the same
architecture.
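The sign extension requirement can be illustrated with a hand-written 40-bit big-endian store and load that use a 64-bit integer internally. This is a sketch in standard {cpp} under the assumption of two's complement integers. The functions `store_big40` and `load_big40` are hypothetical and do not show the library's actual implementation.

```cpp
#include <cstdint>

// Hypothetical helper: write the low 40 bits of v, most-significant byte first.
inline void store_big40(unsigned char (&out)[5], std::int64_t v)
{
    const std::uint64_t u = static_cast<std::uint64_t>(v);
    for (int i = 0; i < 5; ++i)
        out[i] = static_cast<unsigned char>((u >> (8 * (4 - i))) & 0xFF);
}

// Hypothetical helper: read 5 big-endian bytes and sign-extend to 64 bits.
inline std::int64_t load_big40(const unsigned char (&in)[5])
{
    std::uint64_t u = 0;
    for (int i = 0; i < 5; ++i)
        u = (u << 8) | in[i];

    // Bit 39 is the sign bit of the 40-bit value. Flipping it and then
    // subtracting it (in unsigned arithmetic, which wraps) copies it
    // into bits 40 to 63.
    const std::uint64_t sign = std::uint64_t(1) << 39;
    u = (u ^ sign) - sign;

    return static_cast<std::int64_t>(u);
}
```

Working on `unsigned char` and unsigned 64-bit values throughout also keeps the result independent of whether plain `char` is signed.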
## {cpp}11
The availability of the {cpp}11
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted
Functions] feature is detected automatically, and will be used if present to
ensure that objects of `class endian_buffer` are trivial, and thus
PODs.
## Compilation
Boost.Endian is implemented entirely within headers, with no need to link to
any Boost object libraries.
Several macros allow user control over features:
* `BOOST_ENDIAN_NO_CTORS` causes `class endian_buffer` to have no
constructors. The intended use is for compiling user code that must be
portable between compilers regardless of {cpp}11
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted
Functions] support. Use of constructors will always fail.
* `BOOST_ENDIAN_FORCE_PODNESS` causes `BOOST_ENDIAN_NO_CTORS` to be defined if
the compiler does not support {cpp}11
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted
Functions]. This ensures that objects of `class endian_buffer` are PODs, and
so can be used in {cpp}03 unions. In {cpp}11, `class endian_buffer` objects are
PODs, even though they have constructors, so can always be used in unions.