Date: Tue, 14 Nov 2006 06:15:37 +1100 (EST) From: Bruce Evans <bde@zeta.org.au> To: "M. Warner Losh" <imp@bsdimp.com> Cc: jkoshy@freebsd.org, src-committers@freebsd.org, cvs-all@freebsd.org, cvs-src@freebsd.org Subject: Re: cvs commit: src/include ar.h Message-ID: <20061114050457.M77914@delplex.bde.org> In-Reply-To: <20061113.101808.-1540392146.imp@bsdimp.com> References: <200611130428.kAD4ST0U093715@repoman.freebsd.org> <20061113173927.Q75708@delplex.bde.org> <20061113.101808.-1540392146.imp@bsdimp.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 13 Nov 2006, M. Warner Losh wrote: > In message: <20061113173927.Q75708@delplex.bde.org> > Bruce Evans <bde@zeta.org.au> writes: > : In <ar.h>, all struct members are char arrays so there will be no > : padding in practice. > > Actually, that isn't the case at all. There are many systems that > pad/align structs to 4 byte boundaries regardless. Only 1 broken one has turned up so far. Even it only pads at the end. > NetBSD has many of > these places tagged with __packed since they run on more architectures > than they can hand tweek for. This moves further from actually fixing the problem. > : Most system headers depend on the padding being > : the same for all compilers, even when the struct member types are very > : varied. Many system headers use explicit manual padding to try to > : give a fixed layout. This works in practice, and the not-unused file > : system ffs and the not-unused networking component netinet depend on > : it working. Most file systems probably depend on it working across > : ... > > "almost" "in theory". These are nice words, but don't match reality. > On the arm, the __packed is absolutely required to not only compile, > but produce correct code. I haven't seen any examples where it is more than a hack to make broken code compile. An unpacked struct ar is perfectly portable, but might not match the layout in the file. The broken code that started the thread in -current uses an invalid cast to convert &byte_array[offset] into a "struct ar *". This bug was detected accidentally because the ARM's struct ar has stricter alignment requirements than char. Non- broken code knows that byte-aligned data needs to be copied into structs using memcpy(). E.g., in ping/ping.c there are several such copies for things like network addresses and timevals. It would be very wrong to declare struct timeval as __packed, although it needs to be packed to work. Declaring it as both __packed and __aligned(max(sizeof(time_t), sizeof(long)) might work, but is too ugly for me. Strictly, all structs used in ABIs would have to be declared like this (after adding explicit padding where needed, and lots of ifdefs for the padding as in <sys/user.h>) to prevent them depending on the compiler of the day. > : netinet uses __packed a bit, but this just gives unportability and > : pessimizations in practice. E.g., struct ip is manually packed and > : has a natural 4-byte alignment requirement, but declaring it as __packed > : reduces its alignment requirement to 1 byte; thus in internal structs > : like "struct x { int16_t x_foo; struct ip x_ip; };", the x_ip is > : normally misaligned on a 16-bit boundary and thus: > > struct ip should be packed. It is the wire format of a data > structure, and the compiler is free to add extra padding that you > don't think is necessary. It needs to be packed when sent to the wire, but not necessarily in memory. The copy in memory certainly shouldn't be just __packed, since __packed breaks its alignment. As above, declaring it as __packed and __aligned(4) might work, but is too ugly for me. > : - the layout of the internal struct is unportable in practice instead of > : unportable in theory > : - compilers for arches with strict alignment requirements have to > : generate extra instructions to access the 32-bit members in struct > : ip (just 2 in_addr_t's), so the accesses are pessimized and not > : atomic. gcc on i64's seems to do this correctly. The pessimization > : is by a factor of 14 according to the instruction count (a single > : 32-bit load instruction for an in_addr_t gets replaced by 4 8-bit > : load instructions and 10 more instructions to combine the bytes). > : - code like "struct x *xp; ... foo(&xp->x_ip.ip_src);" is broken on > : on arches with strict alignment requirements. After the address of > : the in_addr_t is passed to another function, the compiler can no > : longer easiliy see that it has to generate pessimal loads for accesses > : to the in_addr_t. A variant of this bug occurred in practice in > : rev.1.18 of <netipx/ipx.h>. There, almost eveything is naturally > : aligned to a 16-bit boundary, and __packed is misused to give 8-bitr > : alignment as a bad side effect. Rev.1.18 added a 32-bit member in > : one of the structs contained in the __packed struct. This should > : have just broken binary compatibility and might not have mattered > : depending on how internal the __packed struct is. Instead, it gave > : misaligned 32-bit values instide the __packed struct, thus defeating > : the reason for existence of rev.1.18, since the reason was to add a > : union hack to allow accessing "char c_net[4];" as "u_int u_net;", but > : union hacks like this depend on not using __packed in any struct > : containing c/u_net (recursively). > > There's no guarantee that struct ip is 4 byte aligned from the > drivers. In fact, there are several that require the packet to start > on a 4 byte boundary and lack scatter gather, thus guaranteeing that > struct ip is not 4 byte aligned. This works well for almost all the > code in the tree (the one that it doesn't work for is in the nfs root > code somewhere). Yes there is. It contains some uint32_t's (in_addr_t's), so it is guaranteed to be 4-byte aligned if it is actually a struct. Declaring it as __packed without declaring it as __aligned(4) makes it a non-struct (sort of). Drivers need to copy it to and from the wire so that code all over the network stack isn't pessimized by misaligned accesses or extra instructions to avoid misaligned accesses. I think some drivers do this. It isn't declared as _packed in RELENG_4, so most drivers probably get this right. RELENG_4 has alpha so the alignment is not completely untested in it. > : <netinet/ip.h> was broken by adding __packed in the same commit that > : removed an unportable bitfield from it. The bitfield was the actual > : problem. Bitfields should never be used in structs that have to match > : an externally imposed layouts since they are unportable in practice. > : gcc's bitfields are especially unportable. E.g. "int:8" gives the > : same alignment requirement as int, and this is undocumented. In gcc, > : you have to use "char:8" or __packed to get 8-bit alignment of structs > : containing bitfields, but these are constraint and syntax errors in > : C. > : > : Using __packed in <ar.h> isn't as bad as in <netinet/ip.h> since struct > : ar has a natural 1-byte alignment requirement so using __packed is a > : no-op in practice. > > Except on the arm, where it is critically required for the generated > code to be correct. I doubt that it is needed for more than breaking the warning. You would have to be unlucky or shooting your feet for "(struct ar *)&byte_array[offset]" to give a pointer that is actually misaligned. Large arrays of chars are normally aligned to the word size even if they are on the stack, and the archive header is at offset 0 in files. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20061114050457.M77914>