blob: af726df92344d194cf712ef1534c524f5fa53cdd [file] [log] [blame]
////
Copyright 2011-2016 Beman Dawes
Distributed under the Boost Software License, Version 1.0.
(http://www.boost.org/LICENSE_1_0.txt)
////
[#arithmetic]
# Endian Arithmetic Types
:idprefix: arithmetic_
## Introduction
Header `boost/endian/arithmetic.hpp` provides integer binary types with
control over byte order, value type, size, and alignment. Typedefs provide
easy-to-use names for common configurations.
These types provide portable byte-holders for integer data, independent of
particular computer architectures. Use cases almost always involve I/O, either
via files or network connections. Although data portability is the primary
motivation, these integer byte-holders may also be used to reduce memory use,
file size, or network activity since they provide binary integer sizes not
otherwise available.
Such integer byte-holder types are traditionally called *endian* types. See the
http://en.wikipedia.org/wiki/Endian[Wikipedia] for a full exploration of
*endianness*, including definitions of *big endian* and *little endian*.
Boost endian integers provide the same full set of {cpp} assignment, arithmetic,
and relational operators as {cpp} standard integral types, with the standard
semantics.
Unary arithmetic operators are `+`, `-`, `~`, `!`, plus both prefix and postfix
`--` and `++`. Binary arithmetic operators are `+`, `+=`, `-`, `-=`, `\*`,
``*=``, `/`, `/=`, `&`, `&=`, `|`, `|=`, `^`, `^=`, `<<`, `<\<=`, `>>`, and
`>>=`. Binary relational operators are `==`, `!=`, `<`, `\<=`, `>`, and `>=`.
Implicit conversion to the underlying value type is provided. An implicit
constructor converting from the underlying value type is provided.
## Example
The `endian_example.cpp` program writes a binary file containing four-byte,
big-endian and little-endian integers:
```
#include <iostream>
#include <cstdio>
#include <boost/endian/arithmetic.hpp>
#include <boost/static_assert.hpp>
using namespace boost::endian;
namespace
{
// This is an extract from a very widely used GIS file format.
// Why the designer decided to mix big and little endians in
// the same file is not known. But this is a real-world format
// and users wishing to write low level code manipulating these
// files have to deal with the mixed endianness.
struct header
{
big_int32_t file_code;
big_int32_t file_length;
little_int32_t version;
little_int32_t shape_type;
};
const char* filename = "test.dat";
}
int main(int, char* [])
{
header h;
BOOST_STATIC_ASSERT(sizeof(h) == 16U); // reality check
h.file_code = 0x01020304;
h.file_length = sizeof(header);
h.version = 1;
h.shape_type = 0x01020304;
// Low-level I/O such as POSIX read/write or <cstdio>
// fread/fwrite is sometimes used for binary file operations
// when ultimate efficiency is important. Such I/O is often
// performed in some C++ wrapper class, but to drive home the
// point that endian integers are often used in fairly
// low-level code that does bulk I/O operations, <cstdio>
// fopen/fwrite is used for I/O in this example.
std::FILE* fi = std::fopen(filename, "wb"); // MUST BE BINARY
if (!fi)
{
std::cout << "could not open " << filename << '\n';
return 1;
}
if (std::fwrite(&h, sizeof(header), 1, fi) != 1)
{
std::cout << "write failure for " << filename << '\n';
return 1;
}
std::fclose(fi);
std::cout << "created file " << filename << '\n';
return 0;
}
```
After compiling and executing `endian_example.cpp`, a hex dump of `test.dat`
shows:
```
01020304 00000010 01000000 04030201
```
Notice that the first two 32-bit integers are big endian while the second two
are little endian, even though the machine this was compiled and run on was
little endian.
## Limitations
Requires `<climits>`, `CHAR_BIT == 8`. If `CHAR_BIT` is some other value,
compilation will result in an `#error`. This restriction is in place because the
design, implementation, testing, and documentation has only considered issues
related to 8-bit bytes, and there have been no real-world use cases presented
for other sizes.
In {cpp}03, `endian_arithmetic` does not meet the requirements for POD types
because it has constructors, private data members, and a base class. This means
that common use cases are relying on unspecified behavior in that the {cpp}
Standard does not guarantee memory layout for non-POD types. This has not been a
problem in practice since all known {cpp} compilers lay out memory as if
`endian` were a POD type. In {cpp}11, it is possible to specify the default
constructor as trivial, and private data members and base classes no longer
disqualify a type from being a POD type. Thus under {cpp}11, `endian_arithmetic`
will no longer be relying on unspecified behavior.
## Feature set
* Big endian| little endian | native endian byte ordering.
* Signed | unsigned
* Unaligned | aligned
* 1-8 byte (unaligned) | 1, 2, 4, 8 byte (aligned)
* Choice of value type
## Enums and typedefs
Two scoped enums are provided:
```
enum class order { big, little, native };
enum class align { no, yes };
```
One class template is provided:
```
template <order Order, typename T, std::size_t n_bits,
align Align = align::no>
class endian_arithmetic;
```
Typedefs, such as `big_int32_t`, provide convenient naming conventions for
common use cases:
[%header,cols=5*]
|===
|Name |Alignment |Endianness |Sign |Sizes in bits (n)
|`big_intN_t` |no |big |signed |8,16,24,32,40,48,56,64
|`big_uintN_t` |no |big |unsigned |8,16,24,32,40,48,56,64
|`little_intN_t` |no |little |signed |8,16,24,32,40,48,56,64
|`little_uintN_t` |no |little |unsigned |8,16,24,32,40,48,56,64
|`native_intN_t` |no |native |signed |8,16,24,32,40,48,56,64
|`native_uintN_t` |no |native |unsigned |8,16,24,32,40,48,56,64
|`big_intN_at` |yes |big |signed |8,16,32,64
|`big_uintN_at` |yes |big |unsigned |8,16,32,64
|`little_intN_at` |yes |little |signed |8,16,32,64
|`little_uintN_at` |yes |little |unsigned |8,16,32,64
|===
The unaligned types do not cause compilers to insert padding bytes in classes
and structs. This is an important characteristic that can be exploited to
minimize wasted space in memory, files, and network transmissions.
CAUTION: Code that uses aligned types is possibly non-portable because
alignment requirements vary between hardware architectures and because
alignment may be affected by compiler switches or pragmas. For example,
alignment of an 64-bit integer may be to a 32-bit boundary on a 32-bit machine.
Furthermore, aligned types are only available on architectures with 8, 16, 32,
and 64-bit integer types.
TIP: Prefer unaligned arithmetic types.
TIP: Protect yourself against alignment ills. For example:
[none]
{blank}::
+
```
static_assert(sizeof(containing_struct) == 12, "sizeof(containing_struct) is wrong");
```
NOTE: One-byte arithmetic types have identical layout on all platforms, so they
never actually reverse endianness. They are provided to enable generic code,
and to improve code readability and searchability.
## Class template `endian_arithmetic`
`endian_arithmetic` is an integer byte-holder with user-specified endianness,
value type, size, and alignment. The usual operations on arithmetic types are
supplied.
### Synopsis
```
#include <boost/endian/buffers.hpp>
namespace boost
{
namespace endian
{
// C++11 features emulated if not available
enum class align { no, yes };
template <order Order, class T, std::size_t n_bits,
align Align = align::no>
class endian_arithmetic
{
public:
typedef T value_type;
// if BOOST_ENDIAN_FORCE_PODNESS is defined && C++11 PODs are not
// available then these two constructors will not be present
endian_arithmetic() noexcept = default;
endian_arithmetic(T v) noexcept;
endian_arithmetic& operator=(T v) noexcept;
operator value_type() const noexcept;
value_type value() const noexcept;
unsigned char* data() noexcept;
unsigned char const* data() const noexcept;
// arithmetic operations
// note that additional operations are provided by the value_type
value_type operator+() const noexcept;
endian_arithmetic& operator+=(value_type y) noexcept;
endian_arithmetic& operator-=(value_type y) noexcept;
endian_arithmetic& operator*=(value_type y) noexcept;
endian_arithmetic& operator/=(value_type y) noexcept;
endian_arithmetic& operator%=(value_type y) noexcept;
endian_arithmetic& operator&=(value_type y) noexcept;
endian_arithmetic& operator|=(value_type y) noexcept;
endian_arithmetic& operator^=(value_type y) noexcept;
endian_arithmetic& operator<<=(value_type y) noexcept;
endian_arithmetic& operator>>=(value_type y) noexcept;
endian_arithmetic& operator++() noexcept;
endian_arithmetic& operator--() noexcept;
endian_arithmetic operator++(int) noexcept;
endian_arithmetic operator--(int) noexcept;
// Stream inserter
template <class charT, class traits>
friend std::basic_ostream<charT, traits>&
operator<<(std::basic_ostream<charT, traits>& os, const endian_arithmetic& x);
// Stream extractor
template <class charT, class traits>
friend std::basic_istream<charT, traits>&
operator>>(std::basic_istream<charT, traits>& is, endian_arithmetic& x);
};
// typedefs
// unaligned big endian signed integer types
typedef endian_arithmetic<order::big, int_least8_t, 8> big_int8_t;
typedef endian_arithmetic<order::big, int_least16_t, 16> big_int16_t;
typedef endian_arithmetic<order::big, int_least32_t, 24> big_int24_t;
typedef endian_arithmetic<order::big, int_least32_t, 32> big_int32_t;
typedef endian_arithmetic<order::big, int_least64_t, 40> big_int40_t;
typedef endian_arithmetic<order::big, int_least64_t, 48> big_int48_t;
typedef endian_arithmetic<order::big, int_least64_t, 56> big_int56_t;
typedef endian_arithmetic<order::big, int_least64_t, 64> big_int64_t;
// unaligned big endian unsigned integer types
typedef endian_arithmetic<order::big, uint_least8_t, 8> big_uint8_t;
typedef endian_arithmetic<order::big, uint_least16_t, 16> big_uint16_t;
typedef endian_arithmetic<order::big, uint_least32_t, 24> big_uint24_t;
typedef endian_arithmetic<order::big, uint_least32_t, 32> big_uint32_t;
typedef endian_arithmetic<order::big, uint_least64_t, 40> big_uint40_t;
typedef endian_arithmetic<order::big, uint_least64_t, 48> big_uint48_t;
typedef endian_arithmetic<order::big, uint_least64_t, 56> big_uint56_t;
typedef endian_arithmetic<order::big, uint_least64_t, 64> big_uint64_t;
// unaligned big endian floating point types
typedef endian_arithmetic<order::big, float, 32> big_float32_t;
typedef endian_arithmetic<order::big, double, 64> big_float64_t;
// unaligned little endian signed integer types
typedef endian_arithmetic<order::little, int_least8_t, 8> little_int8_t;
typedef endian_arithmetic<order::little, int_least16_t, 16> little_int16_t;
typedef endian_arithmetic<order::little, int_least32_t, 24> little_int24_t;
typedef endian_arithmetic<order::little, int_least32_t, 32> little_int32_t;
typedef endian_arithmetic<order::little, int_least64_t, 40> little_int40_t;
typedef endian_arithmetic<order::little, int_least64_t, 48> little_int48_t;
typedef endian_arithmetic<order::little, int_least64_t, 56> little_int56_t;
typedef endian_arithmetic<order::little, int_least64_t, 64> little_int64_t;
// unaligned little endian unsigned integer types
typedef endian_arithmetic<order::little, uint_least8_t, 8> little_uint8_t;
typedef endian_arithmetic<order::little, uint_least16_t, 16> little_uint16_t;
typedef endian_arithmetic<order::little, uint_least32_t, 24> little_uint24_t;
typedef endian_arithmetic<order::little, uint_least32_t, 32> little_uint32_t;
typedef endian_arithmetic<order::little, uint_least64_t, 40> little_uint40_t;
typedef endian_arithmetic<order::little, uint_least64_t, 48> little_uint48_t;
typedef endian_arithmetic<order::little, uint_least64_t, 56> little_uint56_t;
typedef endian_arithmetic<order::little, uint_least64_t, 64> little_uint64_t;
// unaligned little endian floating point types
typedef endian_arithmetic<order::little, float, 32> little_float32_t;
typedef endian_arithmetic<order::little, double, 64> little_float64_t;
// unaligned native endian signed integer types
typedef endian_arithmetic<order::native, int_least8_t, 8> native_int8_t;
typedef endian_arithmetic<order::native, int_least16_t, 16> native_int16_t;
typedef endian_arithmetic<order::native, int_least32_t, 24> native_int24_t;
typedef endian_arithmetic<order::native, int_least32_t, 32> native_int32_t;
typedef endian_arithmetic<order::native, int_least64_t, 40> native_int40_t;
typedef endian_arithmetic<order::native, int_least64_t, 48> native_int48_t;
typedef endian_arithmetic<order::native, int_least64_t, 56> native_int56_t;
typedef endian_arithmetic<order::native, int_least64_t, 64> native_int64_t;
// unaligned native endian unsigned integer types
typedef endian_arithmetic<order::native, uint_least8_t, 8> native_uint8_t;
typedef endian_arithmetic<order::native, uint_least16_t, 16> native_uint16_t;
typedef endian_arithmetic<order::native, uint_least32_t, 24> native_uint24_t;
typedef endian_arithmetic<order::native, uint_least32_t, 32> native_uint32_t;
typedef endian_arithmetic<order::native, uint_least64_t, 40> native_uint40_t;
typedef endian_arithmetic<order::native, uint_least64_t, 48> native_uint48_t;
typedef endian_arithmetic<order::native, uint_least64_t, 56> native_uint56_t;
typedef endian_arithmetic<order::native, uint_least64_t, 64> native_uint64_t;
// unaligned native endian floating point types
typedef endian_arithmetic<order::native, float, 32> native_float32_t;
typedef endian_arithmetic<order::native, double, 64> native_float64_t;
// aligned big endian signed integer types
typedef endian_arithmetic<order::big, int8_t, 8, align::yes> big_int8_at;
typedef endian_arithmetic<order::big, int16_t, 16, align::yes> big_int16_at;
typedef endian_arithmetic<order::big, int32_t, 32, align::yes> big_int32_at;
typedef endian_arithmetic<order::big, int64_t, 64, align::yes> big_int64_at;
// aligned big endian unsigned integer types
typedef endian_arithmetic<order::big, uint8_t, 8, align::yes> big_uint8_at;
typedef endian_arithmetic<order::big, uint16_t, 16, align::yes> big_uint16_at;
typedef endian_arithmetic<order::big, uint32_t, 32, align::yes> big_uint32_at;
typedef endian_arithmetic<order::big, uint64_t, 64, align::yes> big_uint64_at;
// aligned big endian floating point types
typedef endian_arithmetic<order::big, float, 32, align::yes> big_float32_at;
typedef endian_arithmetic<order::big, double, 64, align::yes> big_float64_at;
// aligned little endian signed integer types
typedef endian_arithmetic<order::little, int8_t, 8, align::yes> little_int8_at;
typedef endian_arithmetic<order::little, int16_t, 16, align::yes> little_int16_at;
typedef endian_arithmetic<order::little, int32_t, 32, align::yes> little_int32_at;
typedef endian_arithmetic<order::little, int64_t, 64, align::yes> little_int64_at;
// aligned little endian unsigned integer types
typedef endian_arithmetic<order::little, uint8_t, 8, align::yes> little_uint8_at;
typedef endian_arithmetic<order::little, uint16_t, 16, align::yes> little_uint16_at;
typedef endian_arithmetic<order::little, uint32_t, 32, align::yes> little_uint32_at;
typedef endian_arithmetic<order::little, uint64_t, 64, align::yes> little_uint64_at;
// aligned little endian floating point types
typedef endian_arithmetic<order::little, float, 32, align::yes> little_float32_at;
typedef endian_arithmetic<order::little, double, 64, align::yes> little_float64_at;
// aligned native endian typedefs are not provided because
// <cstdint> types are superior for that use case
} // namespace endian
} // namespace boost
```
The only supported value of `CHAR_BIT` is 8.
The valid values of `Nbits` are as follows:
* When `sizeof(T)` is 1, `Nbits` shall be 8;
* When `sizeof(T)` is 2, `Nbits` shall be 16;
* When `sizeof(T)` is 4, `Nbits` shall be 24 or 32;
* When `sizeof(T)` is 8, `Nbits` shall be 40, 48, 56, or 64.
Other values of `sizeof(T)` are not supported.
When `Nbits` is equal to `sizeof(T)*8`, `T` must be a standard arithmetic type.
When `Nbits` is less than `sizeof(T)*8`, `T` must be a standard integral type
({cpp}std, [basic.fundamental]) that is not `bool`.
### Members
```
endian_arithmetic() noexcept = default; // C++03: endian(){}
```
[none]
* {blank}
+
Effects:: Constructs an uninitialized object.
```
endian_arithmetic(T v) noexcept;
```
[none]
* {blank}
+
Effects:: See `endian_buffer::endian_buffer(T)`.
```
endian_arithmetic& operator=(T v) noexcept;
```
[none]
* {blank}
+
Effects:: See `endian_buffer::operator=(T)`.
Returns:: `*this`.
```
value_type value() const noexcept;
```
[none]
* {blank}
+
Returns:: See `endian_buffer::value()`.
```
unsigned char* data() noexcept;
```
```
unsigned char const* data() const noexcept;
```
[none]
* {blank}
+
Returns:: See `endian_buffer::data()`.
```
operator T() const noexcept;
```
[none]
* {blank}
+
Returns::
`value()`.
### Other operators
Other operators on endian objects are forwarded to the equivalent operator on
`value_type`.
### Stream inserter
```
template <class charT, class traits>
friend std::basic_ostream<charT, traits>&
operator<<(std::basic_ostream<charT, traits>& os, const endian_arithmetic& x);
```
[none]
* {blank}
+
Returns:: `os << +x`.
[none]
### Stream extractor
```
template <class charT, class traits>
friend std::basic_istream<charT, traits>&
operator>>(std::basic_istream<charT, traits>& is, endian_arithmetic& x);
```
[none]
* {blank}
+
Effects:: As if:
+
```
T i;
if (is >> i)
x = i;
```
Returns:: `is`.
## FAQ
See the <<overview_faq,Overview FAQ>> for a library-wide FAQ.
Why not just use Boost.Serialization?::
Serialization involves a conversion for every object involved in I/O. Endian
integers require no conversion or copying. They are already in the desired
format for binary I/O. Thus they can be read or written in bulk.
Are endian types PODs?::
Yes for {cpp}11. No for {cpp}03, although several
<<arithmetic_compilation,macros>> are available to force PODness in all cases.
What are the implications of endian integer types not being PODs with {cpp}03 compilers?::
They can't be used in unions. Also, compilers aren't required to align or lay
out storage in portable ways, although this potential problem hasn't prevented
use of Boost.Endian with real compilers.
What good is native endianness?::
It provides alignment and size guarantees not available from the built-in
types. It eases generic programming.
Why bother with the aligned endian types?::
Aligned integer operations may be faster (as much as 10 to 20 times faster)
if the endianness and alignment of the type matches the endianness and
alignment requirements of the machine. The code, however, will be somewhat less
portable than with the unaligned types.
Why provide the arithmetic operations?::
Providing a full set of operations reduces program clutter and makes code
both easier to write and to read. Consider incrementing a variable in a record.
It is very convenient to write:
+
```
++record.foo;
```
+
Rather than:
+
```
int temp(record.foo);
++temp;
record.foo = temp;
```
## Design considerations for Boost.Endian types
* Must be suitable for I/O - in other words, must be memcpyable.
* Must provide exactly the size and internal byte ordering specified.
* Must work correctly when the internal integer representation has more bits
that the sum of the bits in the external byte representation. Sign extension
must work correctly when the internal integer representation type has more
bits than the sum of the bits in the external bytes. For example, using
a 64-bit integer internally to represent 40-bit (5 byte) numbers must work for
both positive and negative values.
* Must work correctly (including using the same defined external
representation) regardless of whether a compiler treats char as signed or
unsigned.
* Unaligned types must not cause compilers to insert padding bytes.
* The implementation should supply optimizations with great care. Experience
has shown that optimizations of endian integers often become pessimizations
when changing machines or compilers. Pessimizations can also happen when
changing compiler switches, compiler versions, or CPU models of the same
architecture.
## Experience
Classes with similar functionality have been independently developed by
several Boost programmers and used very successful in high-value, high-use
applications for many years. These independently developed endian libraries
often evolved from C libraries that were also widely used. Endian types have
proven widely useful across a wide range of computer architectures and
applications.
## Motivating use cases
Neil Mayhew writes: "I can also provide a meaningful use-case for this
library: reading TrueType font files from disk and processing the contents. The
data format has fixed endianness (big) and has unaligned values in various
places. Using Boost.Endian simplifies and cleans the code wonderfully."
## {cpp}11
The availability of the {cpp}11
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted
Functions] feature is detected automatically, and will be used if present to
ensure that objects of `class endian_arithmetic` are trivial, and thus PODs.
## Compilation
Boost.Endian is implemented entirely within headers, with no need to link to any
Boost object libraries.
Several macros allow user control over features:
* BOOST_ENDIAN_NO_CTORS causes `class endian_arithmetic` to have no
constructors. The intended use is for compiling user code that must be portable
between compilers regardless of {cpp}11
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted
Functions] support. Use of constructors will always fail,
* BOOST_ENDIAN_FORCE_PODNESS causes BOOST_ENDIAN_NO_CTORS to be defined if
the compiler does not support {cpp}11
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted
Functions]. This is ensures that objects of `class endian_arithmetic` are PODs,
and so can be used in {cpp}03 unions. In {cpp}11, `class endian_arithmetic`
objects are PODs, even though they have constructors, so can always be used in
unions.
## Acknowledgements
Original design developed by Darin Adler based on classes developed by Mark
Borgerding. Four original class templates combined into a single
`endian_arithmetic` class template by Beman Dawes, who put the library together,
provided documentation, added the typedefs, and also added the
`unrolled_byte_loops` sign partial specialization to correctly extend the sign
when cover integer size differs from endian representation size.