Wireshark-dev: Re: [Wireshark-dev] C Bitfield types and alignment
From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Mon, 5 Dec 2011 18:50:10 -0800
On Dec 5, 2011, at 6:22 PM, Jeff Morriss wrote:

> In the 'aaa' structure the compiler will (normally) insert 3 bytes of padding between 'field' and the bitfields so that the guint is 4-byte aligned; improper alignment isn't allowed on some CPUs (like SPARC) and is slow (often/always? requiring 2 memory access cycles to read) on others.

However, accesses to "align" and "size" shouldn't require an unaligned access - each is only 4 bits, so they should fit nicely into a byte following "field".  To quote the C90 specification:

	A member of a structure or union may have any object type. In addition, a member may be declared to consist of a specified number of bits (including a sign bit, if any). Such a member is called a bit-field; its width is preceded by a colon.

	A bit-field shall have a type that is a qualified or unqualified version of one of int, unsigned int, or signed int. Whether the high-order bit position of a (possibly qualified) “plain” int bit-field is treated as a sign bit is implementation-defined. A bit-field is interpreted as an integral type consisting of the specified number of bits.

	An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. *The alignment of the addressable storage unit is unspecified.*

	A bit-field declaration with no declarator, but only a colon and a width, indicates an unnamed bit-field. As a special case of this, a bit-field structure member with a width of 0 indicates that no further bit-field is to be packed into the unit in which the previous bit-field, if any, was placed.

so a C compiler is perfectly within its rights to stuff them both into a single byte immediately following "field".  Given the "unspecified", however, it's also within its rights to do something else and, in either case, not to document what it does.
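One way to see what a particular compiler actually does is to ask it. The following is only a sketch: the declaration of 'aaa' isn't shown in this message, so the structure below is a guess at its shape (with plain "unsigned int" standing in for guint), and the helper names are made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reconstruction of 'aaa': a one-byte member followed by
 * two 4-bit bit-fields.  The real declaration may differ. */
struct aaa {
    unsigned char field;
    unsigned int  align : 4;
    unsigned int  size  : 4;
};

/* Zero the structure, set only 'align', and return the index of the
 * first nonzero byte in the object, or -1 if none is found.  That is
 * where this compiler put the bit-field (assuming, as is usual in
 * practice, that the store leaves padding bytes alone). */
int aaa_align_byte_index(void)
{
    struct aaa s;
    unsigned char bytes[sizeof s];
    size_t i;

    memset(&s, 0, sizeof s);
    s.align = 0xF;
    memcpy(bytes, &s, sizeof s);

    for (i = 0; i < sizeof s; i++) {
        if (bytes[i] != 0)
            return (int)i;
    }
    return -1;
}

/* Print what this particular compiler chose; the standard doesn't
 * promise any specific answer. */
void print_aaa_layout(void)
{
    printf("sizeof(struct aaa) = %lu\n",
           (unsigned long)sizeof(struct aaa));
    printf("'align' bits found in byte %d\n", aaa_align_byte_index());
}
```

Calling print_aaa_layout() from main() on two different compilers or ABIs may well give two different answers, and both would conform to the text quoted above.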

Given the number of appearances of "implementation-defined" and "unspecified" in that text, I'd say:

	Shorter C90 Specification: C bit-fields are a tool of Satan or, at least, not to be used if you care how stuff is stored in memory.

I suspect C99 says much the same thing, not that we can rely on C99 in any case, given that MSVC++ doesn't implement it.

Given that we're using this to pick apart data structures in packets, we shouldn't use bitfields.
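The usual alternative is to fetch the octet from the packet data and pull the fields out with shifts and masks, which gives the same result on every compiler. A minimal sketch follows; the wire layout (which nibble holds which field) and the names parse_aaa and aaa_values are assumptions for illustration, not anything taken from the dissector in question.

```c
#include <stddef.h>

/* Assumed wire layout, for illustration only:
 *   octet 0: field
 *   octet 1: 'align' in the high-order 4 bits,
 *            'size'  in the low-order 4 bits */
struct aaa_values {
    unsigned char field;
    unsigned char align;
    unsigned char size;
};

/* Extract the values from a raw buffer without relying on how the
 * compiler lays out bit-fields.  Returns 0 on success, -1 if the
 * buffer is too short. */
int parse_aaa(const unsigned char *buf, size_t len, struct aaa_values *out)
{
    if (len < 2)
        return -1;

    out->field = buf[0];
    out->align = (unsigned char)((buf[1] >> 4) & 0x0F);
    out->size  = (unsigned char)(buf[1] & 0x0F);
    return 0;
}
```

The bit positions are now spelled out in the code rather than left to the implementation, so the same source picks the same bits out of the packet everywhere.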