Date: Tue, 14 Nov 2006 22:27:49 +1100 (EST) From: Bruce Evans <bde@zeta.org.au> To: Joseph Koshy <jkoshy@freebsd.org> Cc: cvs-src@freebsd.org, src-committers@freebsd.org, cvs-all@freebsd.org, "M. Warner Losh" <imp@bsdimp.com> Subject: Re: cvs commit: src/include ar.h Message-ID: <20061114210148.F1966@delplex.bde.org> In-Reply-To: <c5c758400611131931v5e8990aeyd5f786a1859227b0@mail.gmail.com> References: <200611130428.kAD4ST0U093715@repoman.freebsd.org> <20061113173927.Q75708@delplex.bde.org> <20061113.101808.-1540392146.imp@bsdimp.com> <20061114050457.M77914@delplex.bde.org> <c5c758400611131931v5e8990aeyd5f786a1859227b0@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, 14 Nov 2006, Joseph Koshy wrote: > bde> Non-broken code knows that byte-aligned data needs to be copied > bde> into structs using memcpy(). > > Do keep in mind that this is `struct ar_hdr' we are talking about, not > something else. > > Having to use memcpy() to copy from a collection of ASCII strings (in file) > to a collection of ASCII strings (in struct ar_hdr) to cope with alignment > issues is odd. I almost didn't reply because this seemed to make the change have no effect except to break ar.h for unsupported compilers. It turned into an argument about __packed in general and struct ip in particular. > To be truly portable, code needs to memcpy() in each ASCII string member > of `struct ar_hdr' separately since we cannot make assumptions about > the file and memory layout being identical. More than that :-). The code needs to do something magic to fill the holes, it any. Kernel code is supposed to prezero using memcpy or malloc(..., M_ZERO) to avoid copying out insecurities in the holes. > Needless to say no code in our tree does this today. > > So we need to look at the original intent of why a `struct ar_hdr' was > declared as a collection of fixed size ASCII strings. My guess is that it > was written that way to be able to directly describe its file layout: ASCII > for portability and fixed size char[] arrays to avoid issues with structure > padding. Declaring it as __packed informs today's compilers of this > intent. It also brings down the alignment requirements for `struct ar_hdr' > on the ARM to match that of other platforms as a side-effect. The alignment requirement is actually the crtical one. This should be explicit (if the packing requirement is). > bde> I doubt that it is needed for more than breaking the warning. You > bde> would have to be unlucky or shooting your feet for > bde> "(struct ar *)&byte_array[offset]" to give a pointer that is actually > bde> misaligned. Large arrays of chars are normally aligned to the word > bde> size even if they are on the stack, and the archive header is at offset > bde> 0 in files. > > In an ar(1) archive, there is a `struct ar_hdr' before each archive member. > The only 'alignment' expected for this structure is that of falling on > even offsets in the file. This is only enforced programmatically though. Oops, I grepped for ARMAGIC instead of ARFMAGIC, so I didn't find any file headers. Further testing showed: (1) When __packed breaks the alignment as in struct ip, the compiler doesn't report lost of alignment on taking addresses, but on an ia64 machine in the FreeBSD cluster, the broken alignment doesn't cause problems and in fact is faster: struct foo { char c; int i; } __packed; struct foo foo; int access_int(int *ip) { return (*ip); } int main(void) { #ifdef BROKEN return (access_int(&foo.i)); #else return (foo.i); #endif } In the !BROKEN case, gcc on ia64 generates large code to load foo.i a byte at a time. In the BROKEN case the code just passes a misaligned pointer to access_int(). This should be an error, since &foo.i has the alignment of a char and the cast that changes its alignment to that of an int is only implicit, but I can't find even a warning flag for this. However, on the FreeBSD cluster machine, the misaligned access doesn't trap, and doing the access in a loop shows that the misaligned access is much faster than the correct code that does the access a byte at a time, much the same as on an i386. (2) gcc still has alignment bugs even on i386's. Even i386's things should be aligned for efficiency, and the compiler shouldn't generate any misaligned access for portable code, since if it does then it is generating inefficient code which breaks detection of user misalignment via the i486 alignment check feature. I tried using this feature when i486's first came out, but it didn't work because gcc generates misaligned code for struct copies: struct { uint8_t x; struct { uint8_t y[15]; } ssc; uint32_t z; } x, y; int main(void) { x.ssc = y.ssc; return (0); } Here the uint32_t and no __packed forces the struct to be 4-byte aligned, so ssc is known to be perfectly misaligned to an address equal to 1 mod 4. The struct copy of ssc is done inline and should consist of copying the first byte, then a 2-byte word, then 3 4-byte words. Instead, it is done by copying 3 perfectly misaligned 4-byte words, then 1 misaligned 2-byte word, then the last byte. Thus the struct copy is inefficient and always causes alignment check traps if the alignment check flag is set. The code generated by gcc for this hasn't changed much since i486's first came out. There is one helpful change: gcc now calls memcpy() if the struct size is larger than about 64, but when i486's came out it generated the misaligned inline copy for much larger structs (IIRC, for any size). gcc on i386's also generates inline copies 4 bytes at a time starting at the first byte, for structs that might only be 1-byte aligned: struct { struct { uint8_t y[60]; } nnssc; } x, y; Now the alignment cannot be known for sure, but 4-byte alignment can usually be arranged at little cost for non-small structs, so without alignment checking the best copying method for medium-sized structs is to assume that they are 4-byte aligned and use 4-byte accesses. With alignment checking, this is only best if there is a trap handler to fix up the alignement and the alignment traps rarely occur, Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20061114210148.F1966>