| //// |
| Copyright 2011-2016 Beman Dawes |
| |
| Distributed under the Boost Software License, Version 1.0. |
| (http://www.boost.org/LICENSE_1_0.txt) |
| //// |
| |
| [#buffers] |
| # Endian Buffer Types |
| :idprefix: buffers_ |
| |
| ## Introduction |
| |
| The internal byte order of arithmetic types is traditionally called |
| *endianness*. See http://en.wikipedia.org/wiki/Endian[Wikipedia] for a full |
| exploration of *endianness*, including definitions of *big endian* and *little |
| endian*. |
| |
| Header `boost/endian/buffers.hpp` provides `endian_buffer`, a portable binary |
| buffer class template that gives control over byte order, value type, size, and |
| alignment, independent of the platform's native endianness. Typedefs provide |
| easy-to-use names for common configurations. |
| |
| Use cases primarily involve data portability, either via files or network |
| connections, but these byte-holders may also be used to reduce memory use, file |
| size, or network activity since they provide binary numeric sizes not otherwise |
| available. |
| |
| Class `endian_buffer` is aimed at users who want explicit control over when |
| endianness conversions occur. It also serves as the base class for the |
| <<arithmetic,endian_arithmetic>> class template, which is aimed at users who |
| want fully automatic endianness conversion and direct support for all normal |
| arithmetic operations. |
| |
| ## Example |
| |
| The `example/endian_example.cpp` program writes a binary file containing |
| four-byte, big-endian and little-endian integers: |
| |
| ``` |
| #include <iostream> |
| #include <cstdio> |
| #include <boost/endian/buffers.hpp> // see Synopsis below |
| #include <boost/static_assert.hpp> |
| |
| using namespace boost::endian; |
| |
| namespace |
| { |
| // This is an extract from a very widely used GIS file format. |
| // Why the designer decided to mix big and little endians in |
| // the same file is not known. But this is a real-world format |
| // and users wishing to write low level code manipulating these |
| // files have to deal with the mixed endianness. |
| |
| struct header |
| { |
| big_int32_buf_t file_code; |
| big_int32_buf_t file_length; |
| little_int32_buf_t version; |
| little_int32_buf_t shape_type; |
| }; |
| |
| const char* filename = "test.dat"; |
| } |
| |
| int main(int, char* []) |
| { |
| header h; |
| |
| BOOST_STATIC_ASSERT(sizeof(h) == 16U); // reality check |
| |
| h.file_code = 0x01020304; |
| h.file_length = sizeof(header); |
| h.version = 1; |
| h.shape_type = 0x01020304; |
| |
| // Low-level I/O such as POSIX read/write or <cstdio> |
| // fread/fwrite is sometimes used for binary file operations |
| // when ultimate efficiency is important. Such I/O is often |
| // performed in some C++ wrapper class, but to drive home the |
| // point that endian integers are often used in fairly |
| // low-level code that does bulk I/O operations, <cstdio> |
| // fopen/fwrite is used for I/O in this example. |
| |
| std::FILE* fi = std::fopen(filename, "wb"); // MUST BE BINARY |
| |
| if (!fi) |
| { |
| std::cout << "could not open " << filename << '\n'; |
| return 1; |
| } |
| |
| if (std::fwrite(&h, sizeof(header), 1, fi) != 1) |
| { |
| std::cout << "write failure for " << filename << '\n'; |
| return 1; |
| } |
| |
| std::fclose(fi); |
| |
| std::cout << "created file " << filename << '\n'; |
| |
| return 0; |
| } |
| ``` |
| |
| After compiling and executing `example/endian_example.cpp`, a hex dump of |
| `test.dat` shows: |
| |
| ``` |
| 01020304 00000010 01000000 04030201 |
| ``` |
| |
| Notice that the first two 32-bit integers are big endian while the second two |
| are little endian, even though the machine this was compiled and run on was |
| little endian. |
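
The byte layout in the dump can be reproduced without relying on the host's byte order. The following is a minimal sketch, not the library's implementation: `store_big32`, `store_little32`, and `check_dump_layout` are hypothetical helpers that build each byte with shifts, so the result is the same on any machine.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of Boost.Endian. Shifts operate on the
// value, not on memory, so the output does not depend on the host.
inline std::array<unsigned char, 4> store_big32(std::uint32_t v)
{
    // Most significant byte first.
    return {{ static_cast<unsigned char>(v >> 24),
              static_cast<unsigned char>(v >> 16),
              static_cast<unsigned char>(v >> 8),
              static_cast<unsigned char>(v) }};
}

inline std::array<unsigned char, 4> store_little32(std::uint32_t v)
{
    // Least significant byte first.
    return {{ static_cast<unsigned char>(v),
              static_cast<unsigned char>(v >> 8),
              static_cast<unsigned char>(v >> 16),
              static_cast<unsigned char>(v >> 24) }};
}

// Checks the first and last fields of the dump shown above.
inline void check_dump_layout()
{
    const std::array<unsigned char, 4> big = store_big32(0x01020304u);
    const std::array<unsigned char, 4> little = store_little32(0x01020304u);

    assert(big[0] == 0x01 && big[1] == 0x02 && big[2] == 0x03 && big[3] == 0x04);
    assert(little[0] == 0x04 && little[1] == 0x03 && little[2] == 0x02 && little[3] == 0x01);
}
```

The same value therefore yields `01020304` in the big-endian fields and `04030201` in the little-endian fields, matching the first and last groups of the dump.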
| |
| ## Limitations |
| |
| Requires `CHAR_BIT == 8` (from `<climits>`). If `CHAR_BIT` is some other value, |
| compilation will result in an `#error`. This restriction is in place because the |
| design, implementation, testing, and documentation have only considered issues |
| related to 8-bit bytes, and there have been no real-world use cases presented |
| for other sizes. |
| |
| In {cpp}03, `endian_buffer` does not meet the requirements for POD types because |
| it has constructors and a private data member. This means that common use cases |
| rely on unspecified behavior, in that the {cpp} Standard does not guarantee |
| memory layout for non-POD types. This has not been a problem in practice since |
| all known {cpp} compilers lay out memory as if `endian_buffer` were a POD type. |
| In {cpp}11, it is possible to specify the default constructor as trivial, and |
| private data members and base classes no longer disqualify a type from being a |
| POD type. Thus under {cpp}11, `endian_buffer` no longer relies on unspecified |
| behavior. |
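
The {cpp}11 rules can be checked at compile time with `<type_traits>`. The sketch below uses a hypothetical stand-in class, `byte_holder4`, rather than the library's own class. It has a defaulted default constructor, an additional user-provided constructor, and a private data member, and it still satisfies the trivial and standard-layout requirements.

```cpp
#include <type_traits>

// Hypothetical stand-in for a four-byte buffer; not the library's class.
class byte_holder4
{
public:
    byte_holder4() = default;                // trivial default constructor (C++11)

    explicit byte_holder4(unsigned char b)   // extra constructor, allowed for PODs in C++11
    {
        for (auto& x : bytes_)
            x = b;
    }

private:
    unsigned char bytes_[4];                 // private data member
};

// Trivial + standard-layout is the C++11 definition of a POD class.
static_assert(std::is_trivial<byte_holder4>::value,
              "byte_holder4 should be trivial");
static_assert(std::is_standard_layout<byte_holder4>::value,
              "byte_holder4 should be standard-layout");
static_assert(sizeof(byte_holder4) == 4,
              "byte_holder4 should occupy exactly four bytes");
```

The same assertions could be applied to a real buffer typedef in user code, assuming a {cpp}11 compiler.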
| |
| ## Feature set |
| |
| * Big endian | little endian | native endian byte ordering |
| * Signed | unsigned |
| * Unaligned | aligned |
| * 1-8 byte (unaligned) | 1, 2, 4, 8 byte (aligned) |
| * Choice of value type |
| |
| ## Enums and typedefs |
| |
| Two scoped enums are provided: |
| |
| ``` |
| enum class order { big, little, native }; |
| |
| enum class align { no, yes }; |
| ``` |
| |
| One class template is provided: |
| |
| ``` |
| template <order Order, typename T, std::size_t Nbits, |
| align Align = align::no> |
| class endian_buffer; |
| ``` |
| |
| Typedefs, such as `big_int32_buf_t`, provide convenient naming conventions for |
| common use cases: |
| |
| [%header,cols=5*] |
| |=== |
| |Name |Alignment |Endianness |Sign |Sizes in bits (N) |
| |`big_intN_buf_t` |no |big |signed |8,16,24,32,40,48,56,64 |
| |`big_uintN_buf_t` |no |big |unsigned |8,16,24,32,40,48,56,64 |
| |`little_intN_buf_t` |no |little |signed |8,16,24,32,40,48,56,64 |
| |`little_uintN_buf_t` |no |little |unsigned |8,16,24,32,40,48,56,64 |
| |`native_intN_buf_t` |no |native |signed |8,16,24,32,40,48,56,64 |
| |`native_uintN_buf_t` |no |native |unsigned |8,16,24,32,40,48,56,64 |
| |`big_intN_buf_at` |yes |big |signed |8,16,32,64 |
| |`big_uintN_buf_at` |yes |big |unsigned |8,16,32,64 |
| |`little_intN_buf_at` |yes |little |signed |8,16,32,64 |
| |`little_uintN_buf_at` |yes |little |unsigned |8,16,32,64 |
| |=== |
| |
| The unaligned types do not cause compilers to insert padding bytes in classes |
| and structs. This is an important characteristic that can be exploited to |
| minimize wasted space in memory, files, and network transmissions. |
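
The effect of padding can be illustrated with plain structs. This is a sketch under the assumption that byte arrays model the unaligned buffer types; `packed_record` and `aligned_record` are hypothetical names, and the exact size of `aligned_record` depends on the platform.

```cpp
#include <cstdint>

// Hypothetical record made only of byte arrays, which have alignment 1,
// so the compiler has no reason to insert padding between members.
struct packed_record
{
    unsigned char tag[1];
    unsigned char value[4];   // models a 32-bit unaligned buffer
};

// The same fields as built-in integers. On most platforms the 32-bit
// member must start on a 4-byte boundary, so padding follows `tag`.
struct aligned_record
{
    std::uint8_t  tag;
    std::uint32_t value;
};

static_assert(sizeof(packed_record) == 5, "unexpected padding in packed_record");
static_assert(alignof(packed_record) == 1, "packed_record should have alignment 1");
static_assert(sizeof(aligned_record) >= sizeof(packed_record),
              "aligned_record cannot be smaller than its members");
```

On typical 32-bit and 64-bit platforms `sizeof(aligned_record)` is 8, so three of every eight bytes written to a file or socket would be padding.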
| |
| CAUTION: Code that uses aligned types is possibly non-portable because alignment |
| requirements vary between hardware architectures and because alignment may be |
| affected by compiler switches or pragmas. For example, alignment of a 64-bit |
| integer may be to a 32-bit boundary on a 32-bit machine and to a 64-bit boundary |
| on a 64-bit machine. Furthermore, aligned types are only available on |
| architectures with 8, 16, 32, and 64-bit integer types. |
| |
| TIP: Prefer unaligned buffer types. |
| |
| TIP: Protect yourself against alignment ills. For example: |
| [none] |
| {blank}:: |
| + |
| ``` |
| static_assert(sizeof(containing_struct) == 12, "sizeof(containing_struct) is wrong"); |
| ``` |
| |
| NOTE: One-byte big and little buffer types have identical layout on all |
| platforms, so they never actually reverse endianness. They are provided to |
| enable generic code, and to improve code readability and searchability. |
| |
| ## Class template `endian_buffer` |
| |
| An `endian_buffer` is a byte-holder for arithmetic types with |
| user-specified endianness, value type, size, and alignment. |
| |
| ### Synopsis |
| |
| ``` |
| namespace boost |
| { |
| namespace endian |
| { |
| // C++11 features emulated if not available |
| |
| enum class align { no, yes }; |
| |
| template <order Order, class T, std::size_t Nbits, |
| align Align = align::no> |
| class endian_buffer |
| { |
| public: |
| |
| typedef T value_type; |
| |
| endian_buffer() noexcept = default; |
| explicit endian_buffer(T v) noexcept; |
| |
| endian_buffer& operator=(T v) noexcept; |
| value_type value() const noexcept; |
| unsigned char* data() noexcept; |
| unsigned char const* data() const noexcept; |
| |
| private: |
| |
| unsigned char value_[Nbits / CHAR_BIT]; // exposition only |
| }; |
| |
| // stream inserter |
| template <class charT, class traits, order Order, class T, |
| std::size_t n_bits, align Align> |
| std::basic_ostream<charT, traits>& |
| operator<<(std::basic_ostream<charT, traits>& os, |
| const endian_buffer<Order, T, n_bits, Align>& x); |
| |
| // stream extractor |
| template <class charT, class traits, order Order, class T, |
| std::size_t n_bits, align Align> |
| std::basic_istream<charT, traits>& |
| operator>>(std::basic_istream<charT, traits>& is, |
| endian_buffer<Order, T, n_bits, Align>& x); |
| |
| // typedefs |
| |
| // unaligned big endian signed integer buffers |
| typedef endian_buffer<order::big, int_least8_t, 8> big_int8_buf_t; |
| typedef endian_buffer<order::big, int_least16_t, 16> big_int16_buf_t; |
| typedef endian_buffer<order::big, int_least32_t, 24> big_int24_buf_t; |
| typedef endian_buffer<order::big, int_least32_t, 32> big_int32_buf_t; |
| typedef endian_buffer<order::big, int_least64_t, 40> big_int40_buf_t; |
| typedef endian_buffer<order::big, int_least64_t, 48> big_int48_buf_t; |
| typedef endian_buffer<order::big, int_least64_t, 56> big_int56_buf_t; |
| typedef endian_buffer<order::big, int_least64_t, 64> big_int64_buf_t; |
| |
| // unaligned big endian unsigned integer buffers |
| typedef endian_buffer<order::big, uint_least8_t, 8> big_uint8_buf_t; |
| typedef endian_buffer<order::big, uint_least16_t, 16> big_uint16_buf_t; |
| typedef endian_buffer<order::big, uint_least32_t, 24> big_uint24_buf_t; |
| typedef endian_buffer<order::big, uint_least32_t, 32> big_uint32_buf_t; |
| typedef endian_buffer<order::big, uint_least64_t, 40> big_uint40_buf_t; |
| typedef endian_buffer<order::big, uint_least64_t, 48> big_uint48_buf_t; |
| typedef endian_buffer<order::big, uint_least64_t, 56> big_uint56_buf_t; |
| typedef endian_buffer<order::big, uint_least64_t, 64> big_uint64_buf_t; |
| |
| // unaligned big endian floating point buffers |
| typedef endian_buffer<order::big, float, 32> big_float32_buf_t; |
| typedef endian_buffer<order::big, double, 64> big_float64_buf_t; |
| |
| // unaligned little endian signed integer buffers |
| typedef endian_buffer<order::little, int_least8_t, 8> little_int8_buf_t; |
| typedef endian_buffer<order::little, int_least16_t, 16> little_int16_buf_t; |
| typedef endian_buffer<order::little, int_least32_t, 24> little_int24_buf_t; |
| typedef endian_buffer<order::little, int_least32_t, 32> little_int32_buf_t; |
| typedef endian_buffer<order::little, int_least64_t, 40> little_int40_buf_t; |
| typedef endian_buffer<order::little, int_least64_t, 48> little_int48_buf_t; |
| typedef endian_buffer<order::little, int_least64_t, 56> little_int56_buf_t; |
| typedef endian_buffer<order::little, int_least64_t, 64> little_int64_buf_t; |
| |
| // unaligned little endian unsigned integer buffers |
| typedef endian_buffer<order::little, uint_least8_t, 8> little_uint8_buf_t; |
| typedef endian_buffer<order::little, uint_least16_t, 16> little_uint16_buf_t; |
| typedef endian_buffer<order::little, uint_least32_t, 24> little_uint24_buf_t; |
| typedef endian_buffer<order::little, uint_least32_t, 32> little_uint32_buf_t; |
| typedef endian_buffer<order::little, uint_least64_t, 40> little_uint40_buf_t; |
| typedef endian_buffer<order::little, uint_least64_t, 48> little_uint48_buf_t; |
| typedef endian_buffer<order::little, uint_least64_t, 56> little_uint56_buf_t; |
| typedef endian_buffer<order::little, uint_least64_t, 64> little_uint64_buf_t; |
| |
| // unaligned little endian floating point buffers |
| typedef endian_buffer<order::little, float, 32> little_float32_buf_t; |
| typedef endian_buffer<order::little, double, 64> little_float64_buf_t; |
| |
| // unaligned native endian signed integer buffers |
| typedef endian_buffer<order::native, int_least8_t, 8> native_int8_buf_t; |
| typedef endian_buffer<order::native, int_least16_t, 16> native_int16_buf_t; |
| typedef endian_buffer<order::native, int_least32_t, 24> native_int24_buf_t; |
| typedef endian_buffer<order::native, int_least32_t, 32> native_int32_buf_t; |
| typedef endian_buffer<order::native, int_least64_t, 40> native_int40_buf_t; |
| typedef endian_buffer<order::native, int_least64_t, 48> native_int48_buf_t; |
| typedef endian_buffer<order::native, int_least64_t, 56> native_int56_buf_t; |
| typedef endian_buffer<order::native, int_least64_t, 64> native_int64_buf_t; |
| |
| // unaligned native endian unsigned integer buffers |
| typedef endian_buffer<order::native, uint_least8_t, 8> native_uint8_buf_t; |
| typedef endian_buffer<order::native, uint_least16_t, 16> native_uint16_buf_t; |
| typedef endian_buffer<order::native, uint_least32_t, 24> native_uint24_buf_t; |
| typedef endian_buffer<order::native, uint_least32_t, 32> native_uint32_buf_t; |
| typedef endian_buffer<order::native, uint_least64_t, 40> native_uint40_buf_t; |
| typedef endian_buffer<order::native, uint_least64_t, 48> native_uint48_buf_t; |
| typedef endian_buffer<order::native, uint_least64_t, 56> native_uint56_buf_t; |
| typedef endian_buffer<order::native, uint_least64_t, 64> native_uint64_buf_t; |
| |
| // unaligned native endian floating point buffers |
| typedef endian_buffer<order::native, float, 32> native_float32_buf_t; |
| typedef endian_buffer<order::native, double, 64> native_float64_buf_t; |
| |
| // aligned big endian signed integer buffers |
| typedef endian_buffer<order::big, int8_t, 8, align::yes> big_int8_buf_at; |
| typedef endian_buffer<order::big, int16_t, 16, align::yes> big_int16_buf_at; |
| typedef endian_buffer<order::big, int32_t, 32, align::yes> big_int32_buf_at; |
| typedef endian_buffer<order::big, int64_t, 64, align::yes> big_int64_buf_at; |
| |
| // aligned big endian unsigned integer buffers |
| typedef endian_buffer<order::big, uint8_t, 8, align::yes> big_uint8_buf_at; |
| typedef endian_buffer<order::big, uint16_t, 16, align::yes> big_uint16_buf_at; |
| typedef endian_buffer<order::big, uint32_t, 32, align::yes> big_uint32_buf_at; |
| typedef endian_buffer<order::big, uint64_t, 64, align::yes> big_uint64_buf_at; |
| |
| // aligned big endian floating point buffers |
| typedef endian_buffer<order::big, float, 32, align::yes> big_float32_buf_at; |
| typedef endian_buffer<order::big, double, 64, align::yes> big_float64_buf_at; |
| |
| // aligned little endian signed integer buffers |
| typedef endian_buffer<order::little, int8_t, 8, align::yes> little_int8_buf_at; |
| typedef endian_buffer<order::little, int16_t, 16, align::yes> little_int16_buf_at; |
| typedef endian_buffer<order::little, int32_t, 32, align::yes> little_int32_buf_at; |
| typedef endian_buffer<order::little, int64_t, 64, align::yes> little_int64_buf_at; |
| |
| // aligned little endian unsigned integer buffers |
| typedef endian_buffer<order::little, uint8_t, 8, align::yes> little_uint8_buf_at; |
| typedef endian_buffer<order::little, uint16_t, 16, align::yes> little_uint16_buf_at; |
| typedef endian_buffer<order::little, uint32_t, 32, align::yes> little_uint32_buf_at; |
| typedef endian_buffer<order::little, uint64_t, 64, align::yes> little_uint64_buf_at; |
| |
| // aligned little endian floating point buffers |
| typedef endian_buffer<order::little, float, 32, align::yes> little_float32_buf_at; |
| typedef endian_buffer<order::little, double, 64, align::yes> little_float64_buf_at; |
| |
| // aligned native endian typedefs are not provided because |
| // <cstdint> types are superior for this use case |
| |
| } // namespace endian |
| } // namespace boost |
| ``` |
| |
| The expository data member `value_` stores the current value of the |
| `endian_buffer` object as a sequence of bytes ordered as specified by the |
| `Order` template parameter. The `CHAR_BIT` macro is defined in `<climits>`. |
| The only supported value of `CHAR_BIT` is 8. |
| |
| The valid values of `Nbits` are as follows: |
| |
| * When `sizeof(T)` is 1, `Nbits` shall be 8; |
| * When `sizeof(T)` is 2, `Nbits` shall be 16; |
| * When `sizeof(T)` is 4, `Nbits` shall be 24 or 32; |
| * When `sizeof(T)` is 8, `Nbits` shall be 40, 48, 56, or 64. |
| |
| Other values of `sizeof(T)` are not supported. |
| |
| When `Nbits` is equal to `sizeof(T)*8`, `T` must be a trivially copyable type |
| (such as `float`) that is assumed to have the same endianness as `uintNbits_t`. |
| |
| When `Nbits` is less than `sizeof(T)*8`, `T` must be either a standard integral |
| type ({cpp}std, [basic.fundamental]) or an `enum`. |
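
The size rules listed above can be restated as a compile-time predicate. The function `is_valid_nbits` below is a hypothetical helper written for illustration; it is not part of the library, which enforces these constraints in its own way.

```cpp
#include <cstddef>

// Hypothetical restatement of the Nbits rules as a C++11 constexpr function.
// size_of_t stands for sizeof(T); nbits stands for the Nbits parameter.
constexpr bool is_valid_nbits(std::size_t size_of_t, std::size_t nbits)
{
    return size_of_t == 1 ? nbits == 8
         : size_of_t == 2 ? nbits == 16
         : size_of_t == 4 ? (nbits == 24 || nbits == 32)
         : size_of_t == 8 ? (nbits == 40 || nbits == 48 || nbits == 56 || nbits == 64)
         : false;   // other values of sizeof(T) are not supported
}

static_assert(is_valid_nbits(4, 24), "24 bits fit a 4-byte value type");
static_assert(is_valid_nbits(8, 40), "40 bits fit an 8-byte value type");
static_assert(!is_valid_nbits(2, 24), "24 bits do not fit a 2-byte value type");
static_assert(!is_valid_nbits(8, 32), "32 bits call for a 4-byte value type");
```

This matches the typedefs in the synopsis, where the 24-bit buffers use a 32-bit value type and the 40-, 48-, and 56-bit buffers use a 64-bit value type.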
| |
| ### Members |
| |
| ``` |
| endian_buffer() noexcept = default; |
| ``` |
| [none] |
| * {blank} |
| + |
| Effects:: Constructs an uninitialized object. |
| |
| ``` |
| explicit endian_buffer(T v) noexcept; |
| ``` |
| [none] |
| * {blank} |
| + |
| Effects:: `endian_store<T, Nbits/8, Order>( value_, v )`. |
| |
| ``` |
| endian_buffer& operator=(T v) noexcept; |
| ``` |
| [none] |
| * {blank} |
| + |
| Effects:: `endian_store<T, Nbits/8, Order>( value_, v )`. |
| Returns:: `*this`. |
| |
| ``` |
| value_type value() const noexcept; |
| ``` |
| [none] |
| * {blank} |
| + |
| Returns:: `endian_load<T, Nbits/8, Order>( value_ )`. |
| |
| ``` |
| unsigned char* data() noexcept; |
| ``` |
| ``` |
| unsigned char const* data() const noexcept; |
| ``` |
| [none] |
| * {blank} |
| + |
| Returns:: |
| A pointer to the first byte of `value_`. |
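
To make the member semantics concrete, here is a simplified model for unsigned values only. The class `sketch_buffer` and the function `sketch_buffer_demo` are hypothetical; the real `endian_buffer` has different template parameters and a different implementation. The model only shows the relationship between assignment, `value()`, and `data()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of the members described above, for unsigned values.
template <bool Big, std::size_t Nbytes>
class sketch_buffer
{
    static_assert(Nbytes >= 1 && Nbytes <= 8, "1 to 8 bytes supported");

public:
    typedef std::uint64_t value_type;

    sketch_buffer() noexcept = default;    // leaves the bytes uninitialized
    explicit sketch_buffer(value_type v) noexcept { store(v); }

    sketch_buffer& operator=(value_type v) noexcept { store(v); return *this; }

    value_type value() const noexcept
    {
        value_type v = 0;
        for (std::size_t i = 0; i < Nbytes; ++i)
        {
            // Visit bytes from most significant to least significant.
            const std::size_t idx = Big ? i : Nbytes - 1 - i;
            v = (v << 8) | bytes_[idx];
        }
        return v;
    }

    unsigned char*       data() noexcept       { return bytes_; }
    unsigned char const* data() const noexcept { return bytes_; }

private:
    void store(value_type v) noexcept
    {
        for (std::size_t i = 0; i < Nbytes; ++i)
        {
            // Byte i of the value (counting from the least significant)
            // goes last-to-first for big endian, first-to-last for little.
            const std::size_t idx = Big ? Nbytes - 1 - i : i;
            bytes_[idx] = static_cast<unsigned char>(v >> (8 * i));
        }
    }

    unsigned char bytes_[Nbytes];
};

inline void sketch_buffer_demo()
{
    sketch_buffer<true, 3> big(0x010203u);      // 24-bit, big endian
    assert(big.data()[0] == 0x01 && big.data()[2] == 0x03);

    big = 0x0A0B0Cu;                            // assignment re-stores the bytes
    assert(big.value() == 0x0A0B0Cu);

    sketch_buffer<false, 3> little(0x010203u);  // 24-bit, little endian
    assert(little.data()[0] == 0x03 && little.data()[2] == 0x01);
}
```

In this model the conversion happens only in the constructor, `operator=`, and `value()`, which is the "explicit control" the class is designed around; `data()` exposes the stored bytes unchanged.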
| |
| ### Non-member functions |
| |
| ``` |
| template <class charT, class traits, order Order, class T, |
| std::size_t n_bits, align Align> |
| std::basic_ostream<charT, traits>& operator<<(std::basic_ostream<charT, traits>& os, |
| const endian_buffer<Order, T, n_bits, Align>& x); |
| ``` |
| [none] |
| * {blank} |
| + |
| Returns:: `os << x.value()`. |
| |
| ``` |
| template <class charT, class traits, order Order, class T, |
| std::size_t n_bits, align Align> |
| std::basic_istream<charT, traits>& operator>>(std::basic_istream<charT, traits>& is, |
| endian_buffer<Order, T, n_bits, Align>& x); |
| ``` |
| [none] |
| * {blank} |
| + |
| Effects:: As if: |
| + |
| ``` |
| T i; |
| if (is >> i) |
| x = i; |
| ``` |
| Returns:: `is`. |
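
One consequence of the "as if" wording is that a failed extraction leaves the buffer unchanged. The sketch below demonstrates this with a hypothetical stand-in type, `int_holder`, that has only the members the extractor relies on; it is not the library's class or operator.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical stand-in exposing assignment from T and value().
struct int_holder
{
    typedef int value_type;

    int_holder& operator=(int v) { stored = v; return *this; }
    int value() const { return stored; }

    int stored = 0;
};

// Mirrors the "as if" wording: read a T, assign only on success.
template <class charT, class traits>
std::basic_istream<charT, traits>&
operator>>(std::basic_istream<charT, traits>& is, int_holder& x)
{
    int i;
    if (is >> i)
        x = i;
    return is;
}

inline void extractor_demo()
{
    int_holder h;
    h = 7;

    std::istringstream good("42");
    good >> h;
    assert(h.value() == 42);     // successful read replaces the value

    std::istringstream bad("not-a-number");
    bad >> h;
    assert(bad.fail());          // the stream reports the failure
    assert(h.value() == 42);     // the holder keeps its previous value
}
```

Callers should therefore test the stream state after extraction rather than inspect the buffer to detect errors.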
| |
| ## FAQ |
| |
| See the <<overview_faq,Overview FAQ>> for a library-wide FAQ. |
| |
| Why not just use Boost.Serialization?:: |
| Serialization involves a conversion for every object involved in I/O. Endian |
| integers require no conversion or copying. They are already in the desired |
| format for binary I/O. Thus they can be read or written in bulk. |
| |
| Are endian types PODs?:: |
| Yes for {cpp}11. No for {cpp}03, although several |
| <<buffers_compilation,macros>> are available to force PODness in all cases. |
| |
| What are the implications of endian integer types not being PODs with {cpp}03 compilers?:: |
| They can't be used in unions. Also, compilers aren't required to align or lay |
| out storage in portable ways, although this potential problem hasn't prevented |
| use of Boost.Endian with real compilers. |
| |
| What good is native endianness?:: |
| It provides alignment and size guarantees not available from the built-in |
| types. It eases generic programming. |
| |
| Why bother with the aligned endian types?:: |
| Aligned integer operations may be faster (as much as 10 to 20 times faster) if |
| the endianness and alignment of the type matches the endianness and alignment |
| requirements of the machine. The code, however, is likely to be somewhat less |
| portable than with the unaligned types. |
| |
| ## Design considerations for Boost.Endian buffers |
| |
| * Must be suitable for I/O - in other words, must be memcpyable. |
| * Must provide exactly the size and internal byte ordering specified. |
| * Must work correctly when the internal integer representation has more bits |
| than the sum of the bits in the external byte representation. Sign extension |
| must work correctly when the internal integer representation type has more |
| bits than the sum of the bits in the external bytes. For example, using |
| a 64-bit integer internally to represent 40-bit (5 byte) numbers must work for |
| both positive and negative values. |
| * Must work correctly (including using the same defined external |
| representation) regardless of whether a compiler treats char as signed or |
| unsigned. |
| * Unaligned types must not cause compilers to insert padding bytes. |
| * The implementation should supply optimizations with great care. Experience |
| has shown that optimizations of endian integers often become pessimizations |
| when changing machines or compilers. Pessimizations can also happen when |
| changing compiler switches, compiler versions, or CPU models of the same |
| architecture. |
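
The sign-extension requirement in the list above can be sketched for the 40-bit case. The helpers `store_big40`, `load_big40`, and `sign_extension_demo` are hypothetical and only illustrate the technique; they are not the library's implementation. The sketch assumes two's complement and the usual modular conversion from `std::uint64_t` to `std::int64_t` (guaranteed from {cpp}20, implementation-defined but universal in practice before that).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: write the low 40 bits of v as 5 big-endian bytes.
inline void store_big40(unsigned char* p, std::int64_t v)
{
    const std::uint64_t u = static_cast<std::uint64_t>(v);
    for (int i = 0; i < 5; ++i)
        p[i] = static_cast<unsigned char>(u >> (8 * (4 - i)));
}

// Hypothetical helper: read 5 big-endian bytes and sign-extend to 64 bits.
inline std::int64_t load_big40(const unsigned char* p)
{
    std::uint64_t u = 0;
    for (int i = 0; i < 5; ++i)
        u = (u << 8) | p[i];

    // Bit 39 is the sign bit of the 40-bit field. If it is set,
    // fill bits 40..63 with ones so the 64-bit value is negative.
    if (u & (std::uint64_t(1) << 39))
        u |= ~((std::uint64_t(1) << 40) - 1);

    return static_cast<std::int64_t>(u);
}

inline void sign_extension_demo()
{
    unsigned char bytes[5];

    store_big40(bytes, -2);
    assert(bytes[0] == 0xFF && bytes[4] == 0xFE);   // FF FF FF FF FE
    assert(load_big40(bytes) == -2);

    store_big40(bytes, 0x0102030405LL);
    assert(bytes[0] == 0x01 && bytes[4] == 0x05);
    assert(load_big40(bytes) == 0x0102030405LL);
}
```

Without the sign-extension step, the bytes `FF FF FF FF FE` would load as the positive value `0xFFFFFFFFFE` instead of `-2`.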
| |
| ## {cpp}11 |
| |
| The availability of the {cpp}11 |
| http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted |
| Functions] feature is detected automatically, and will be used if present to |
| ensure that objects of `class endian_buffer` are trivial, and thus |
| PODs. |
| |
| ## Compilation |
| |
| Boost.Endian is implemented entirely within headers, with no need to link to |
| any Boost object libraries. |
| |
| Several macros allow user control over features: |
| |
| * `BOOST_ENDIAN_NO_CTORS` causes `class endian_buffer` to have no |
| constructors. The intended use is for compiling user code that must be |
| portable between compilers regardless of {cpp}11 |
| http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted |
| Functions] support. Use of constructors will always fail. |
| * `BOOST_ENDIAN_FORCE_PODNESS` causes `BOOST_ENDIAN_NO_CTORS` to be defined if |
| the compiler does not support {cpp}11 |
| http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted |
| Functions]. This ensures that objects of `class endian_buffer` are PODs, and |
| so can be used in {cpp}03 unions. In {cpp}11, `class endian_buffer` objects are |
| PODs, even though they have constructors, so can always be used in unions. |