Exploring Endianness (Octet Ordering)

As I get deeper into my computer science studies, I'm learning about Endianness.

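Endianness is the order in which the octets (bytes) of a multi-byte value are laid out in memory or sent over a wire. It is a property of the instruction set architecture (ISA), not of CPU registers, which hold a value as a whole rather than as individually addressed bytes. Big-endian stores the most significant byte first (at the lowest address), while little-endian stores the least significant byte first.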

(Figure: big-endian vs. little-endian byte layout)
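
To see the two layouts on your own machine, here is a minimal sketch that copies a 32-bit value into a byte array and looks at which byte lands first. The helper names memory_bytes and host_is_little_endian are my own, made up for illustration, and the sketch assumes the host is either plain big-endian or plain little-endian.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the four bytes of `value` in the order the
// host keeps them in memory (index 0 is the lowest address).
std::array<unsigned char, 4> memory_bytes(std::uint32_t value) {
    std::array<unsigned char, 4> bytes{};
    std::memcpy(bytes.data(), &value, bytes.size());  // copy the raw representation
    return bytes;
}

// Hypothetical helper: true if the least significant byte comes first.
bool host_is_little_endian() {
    const auto bytes = memory_bytes(0x0A0B0C0D);
    // Assumption: only the two common layouts are handled, not mixed-endian hosts.
    assert(bytes[0] == 0x0D || bytes[0] == 0x0A);
    return bytes[0] == 0x0D;
}
```

On a little-endian host, memory_bytes(0x0A0B0C0D) comes back as 0D 0C 0B 0A; on a big-endian host it comes back as 0A 0B 0C 0D.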

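Bugs can occur when you assume a specific data size, say long vs. long long, and then store and access the data inconsistently (using different sizes). Sometimes the underlying octet order gives you the impression that your code stored and accessed the data correctly. For example, if you store 4 bytes but read back only 2, a little-endian machine hands you the low-order half, so small values look right and the bug stays hidden until a larger value, or a big-endian machine, comes along.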
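
The size-mismatch bug can be reproduced in a few lines. This is only a sketch: read_first_two_bytes and size_mismatch_demo are hypothetical names, and the function deliberately reads fewer bytes than were stored to mimic the mistake.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Deliberately buggy on purpose: a 4-byte value was stored, but only the
// first 2 bytes (lowest addresses) are read back.
std::uint16_t read_first_two_bytes(std::uint32_t stored) {
    std::uint16_t partial;
    std::memcpy(&partial, &stored, sizeof partial);
    return partial;
}

void size_mismatch_demo() {
    const std::uint16_t small = read_first_two_bytes(42);
    // Little-endian hosts return 42 here, so the bug goes unnoticed.
    // Big-endian hosts return 0, the high-order half.
    assert(small == 42 || small == 0);

    const std::uint16_t large = read_first_two_bytes(0x12345678);
    // Little-endian hosts return 0x5678, big-endian hosts return 0x1234.
    // Neither is the value that was stored.
    assert(large == 0x5678 || large == 0x1234);
}
```

The fix is not a cleverer read, it is using the same type on both sides.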

Stay away from casting types when reading from memory addresses with pointers. Usually this indicates you aren't using the correct type in the first place. reinterpret_cast should definitely be avoided, but if you are really sure you know what you're doing, wrap access and storage in getters and setters. Then, if you find out you made a mistake, you only have to fix it in one place.
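
Here is one way the getter/setter idea might look, as a sketch rather than a recipe. The names load_u32 and store_u32 are hypothetical. I've used std::memcpy instead of a pointer cast, which is my own choice here: it sidesteps alignment and aliasing trouble, and all raw access stays in two small functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical getter: the only place that reads a 32-bit value out of raw bytes.
std::uint32_t load_u32(const unsigned char* buffer, std::size_t offset) {
    assert(buffer != nullptr);
    std::uint32_t value;
    std::memcpy(&value, buffer + offset, sizeof value);  // host byte order, any alignment
    return value;
}

// Hypothetical setter: the only place that writes a 32-bit value into raw bytes.
void store_u32(unsigned char* buffer, std::size_t offset, std::uint32_t value) {
    assert(buffer != nullptr);
    std::memcpy(buffer + offset, &value, sizeof value);
}
```

If the stored size or byte order ever turns out to be wrong, these two functions are the only code that has to change.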

Text files usually don't pose a risk, at least in byte-oriented encodings such as ASCII or UTF-8. Binary files, on the other hand, can cause bugs when the underlying endianness is misunderstood, or worse, assumed instead of known. Choose an octet ordering for your file format and stick to it.
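
As a sketch of "choose an ordering and stick to it", the functions below always write a 32-bit value as little-endian bytes, whatever the host uses. The choice of little-endian and the names write_u32_le, read_u32_le and round_trip are assumptions for illustration. Shifts work on the value, not on its memory layout, so the same code produces the same file on any machine.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Writes `value` as 4 bytes, least significant byte first, on any host.
void write_u32_le(std::ostream& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.put(static_cast<char>((value >> shift) & 0xFF));
    }
}

// Reads 4 bytes written by write_u32_le and rebuilds the value.
std::uint32_t read_u32_le(std::istream& in) {
    char raw[4] = {};
    in.read(raw, sizeof raw);
    std::uint32_t value = 0;
    for (int i = 3; i >= 0; --i) {
        value = (value << 8) | static_cast<unsigned char>(raw[i]);
    }
    return value;
}

// Usage example: an in-memory stream stands in for a binary file.
std::uint32_t round_trip(std::uint32_t value) {
    std::stringstream file;
    write_u32_le(file, value);
    assert(file.str().size() == 4);
    return read_u32_le(file);
}
```

With a real file you would open the stream with std::ios::binary so the bytes are not translated on the way through.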

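Use endian.h where it is available. It is a non-standard header provided by glibc on Linux, with close equivalents on the BSDs, and is not part of ISO C or C++. Its functions, such as htobe32() and le32toh(), convert integer values between the byte order that the current CPU (the "host") uses and little-endian or big-endian byte order.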
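
A small sketch of endian.h in use, assuming a Linux system with glibc (the header is not standard and may be missing or named differently elsewhere). htobe32() and be32toh() are real functions from that header; put_be32 and get_be32 are hypothetical wrappers of my own.

```cpp
#include <endian.h>  // non-standard: assumes glibc on Linux

#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical wrapper: stores a host-order value as 4 big-endian bytes.
void put_be32(unsigned char* out, std::uint32_t host_value) {
    assert(out != nullptr);
    const std::uint32_t wire = htobe32(host_value);  // host order -> big-endian
    std::memcpy(out, &wire, sizeof wire);
}

// Hypothetical wrapper: reads 4 big-endian bytes back into a host-order value.
std::uint32_t get_be32(const unsigned char* in) {
    assert(in != nullptr);
    std::uint32_t wire;
    std::memcpy(&wire, in, sizeof wire);
    return be32toh(wire);  // big-endian -> host order
}
```

On a big-endian host the conversions do nothing, and on a little-endian host they swap the bytes, so the calling code looks the same either way.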

Consistency and strong types will win the day!

Which architectures are Big-endian?

  • Motorola 68000 series (including Freescale ColdFire)
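  • Xilinx MicroBlaze (big-endian in older designs; newer configurations can be little-endian)
  • SuperH (many parts let you select the byte order at reset)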
  • IBM z/Architecture
  • Atmel AVR32

Which architectures are Little-endian?

  • Intel x86 and x86-64 series of processors (which is why little-endian is sometimes called the "Intel convention")
  • MOS Technology 6502 (including Western Design Center 65802 and 65C816)
  • Zilog Z80 (including Z180 and eZ80)
  • Altera Nios II

Fields in the protocols of the Internet protocol suite, such as IPv4, IPv6, TCP, and UDP, are transmitted in big-endian order. For this reason, big-endian byte order is also referred to as network byte order.
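
To make network byte order concrete, here is a sketch that pulls the source and destination ports out of the first four bytes of a TCP header. Both are 16-bit big-endian fields. read_be16, TcpPorts and parse_tcp_ports are hypothetical names, and the caller is assumed to pass at least four valid bytes.

```cpp
#include <cassert>
#include <cstdint>

// Combines two bytes in network (big-endian) order into a host value.
std::uint16_t read_be16(const unsigned char* p) {
    assert(p != nullptr);
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Hypothetical result type for this example.
struct TcpPorts {
    std::uint16_t source;
    std::uint16_t destination;
};

// Assumption: `header` points at the start of a TCP header, where bytes 0-1
// hold the source port and bytes 2-3 hold the destination port.
TcpPorts parse_tcp_ports(const unsigned char* header) {
    return TcpPorts{read_be16(header), read_be16(header + 2)};
}
```

For example, the header bytes 01 BB 1F 90 decode to source port 443 and destination port 8080 on any host, because the shifts never depend on how the host lays out its own integers.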
