Date: Tue, 28 Mar 2017 16:53:43 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Warner Losh <imp@freebsd.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r316064 - head/sys/boot/i386/boot2 Message-ID: <20170328141213.T927@besplex.bde.org> In-Reply-To: <201703272253.v2RMra2L032487@repo.freebsd.org> References: <201703272253.v2RMra2L032487@repo.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 27 Mar 2017, Warner Losh wrote: > Log: > Fix build with path names with 'align' or 'nop' in them. > > clang is now inserting .file directives with the entire path in > them. This is fine, except that our sed peephole optimizer removes > them if ${SRCTOP} or ${OBJTOP} contains 'align' or 'nop', leading to > build failures. The sed peephole optimizer removes useful things for > boot2 when used with clang, so restrict its use to gcc. Also, gcc no > longer generates nops to pad things, so there's no point in removing > it. Specialize the optimization to just removing the .align 4 lines to > preclude inadvertant path matching. > > Sponsored by: Netflix > Commit brought to you the path: /home/xxx/NCD-3592-logsynopts/FreeBSD > > Modified: > head/sys/boot/i386/boot2/Makefile > > Modified: head/sys/boot/i386/boot2/Makefile > ============================================================================== > --- head/sys/boot/i386/boot2/Makefile Mon Mar 27 22:34:43 2017 (r316063) > +++ head/sys/boot/i386/boot2/Makefile Mon Mar 27 22:53:36 2017 (r316064) > @@ -91,10 +91,18 @@ boot2.o: boot2.s > > SRCS= boot2.c boot2.h > > +# Gcc (4.2.1 at least) benefits from removing the forced alignment > +# clang doesn't. Make the removal as specific as possible to avoid > +# false positives (like path names with odd names for debugging info). > +# In the past, gcc benefited from nop removal, but not in 4.2.1. > +# Think of this as a poor-man's peephole optimizer for gcc 4.2.1 > boot2.s: boot2.c boot2.h ${.CURDIR}/../../common/ufsread.c > ${CC} ${CFLAGS} -S -o boot2.s.tmp ${.CURDIR}/boot2.c > - sed -e '/align/d' -e '/nop/d' < boot2.s.tmp > boot2.s > - rm -f boot2.s.tmp > +.if ${COMPILER_TYPE} == "gcc" > + sed -e '/\.align 4/d' < boot2.s.tmp > boot2.s > +.else > + cp boot2.s.tmp boot2.s > +.endif > > boot2.h: boot1.out > ${NM} -t d ${.ALLSRC} | awk '/([0-9])+ T xread/ \ That's silly; if you just fix the regexps to actually be as specifice as possible, then they would just work with no need for large comments or ifdefs. '\.align' instead of 'align' is a good start. ' 4' instead of nothing is not so good. There should also be a check for nothing except whitespace before '\.align'. The whitespace before '4' might be a tab, but perhaps gcc only generates a space, and only generates unwanted alignments of precisely 4, with no fill bytes. The general syntax would be too hard to parse, but it is easy to handle any digit, and also '\.p2align', and allow nothing except whitespace after the digit. Last time I looked at this about 5 years ago, but after clang, the whole optimization step was silly. It made little difference for gcc and no difference for clang as you say. The nop removal that you removed was silliest. I think compilers stopped generating nops for padding many years before that code was written. In text sections, They should generate something like '.p2align x,y,z', where x is the alignment, y is empty (it is the fill byte, which defaults to nop, but we don't want that but want a larger null instruction for multiple bytes) and z is a limit on the amount of padding (x is only a preference and z limits it). This syntax would be hard to parse. But CFLAGS has -Os, and that works for preventing padding in the text section, so nop is never generated, not even with the spelling 'align'. -Os doesn't work so well for other things. For kernels, it is completely broken with gcc-4.2.1 unless -fno-inline-functions is also used (it breaks the inlining parameters whose misconfiguration is probably the actual bug. This doesn't seem to be a problem for boot code. IIRC, the only simply broken thing is that alignment is still done for variables. This generates ".align 4". -Os is differently broken for clang, so the simple editing doesn't help. -Wa,-al to see what the assembler is doing is broken for clang. -mpreferred-stack-boundary doesn't work for clang, so is not used, but clang does better stack alignment by default. This might still waste some instructions, but not for every function. -mno-align-long-strings doesn't work for clang or for gcc > 4.2.1, so is not used. clang does quite different string padding by default, and IIRC it is worse. But boot2 hardly has any long strings. boot code is hard to test since it is not modular. This is much more broken than when I last tested the size of boot2.o. The most obvious new bugs are: - something like SRCTOP breaks finding the relative path to ufs headers in a FreeBSD-11 sys tree - ${CLANG_OPT_SMALL} is added for all compilers. This is defined out-of-tree in /usr/share/mk/bsd.sys.mk. (I tested on a host running nearly -current with a nearly FreeBSD-10 sys tree). In boot2, CLANG_P{T_SMALL is wrong even for clang since it contains -mstack-alignment=8 unconditionally, but the stack alignment should be 4 or even 1 with -m32 -Os. Fixing this makes no difference for boot2.o compiled by clang. The corresponding -mpreferred-stack-boundary=2 for gcc is configured correctly in boot2/Makefile. After fixing this, simply "CC=gcc-4.2.1 make boot2.o". Adding -Wa,-al to CC also works (-Wa,-al is broken for clang, and clang produces a boot2.s which cannot be assembled by gas). The file sizes for FreeBSD-10 are: text data bss dec hex filename 5126 12 4940 10078 0x275e clang/boot2-unhacked.o 5126 9 4940 10075 0x275b clang/boot2.o 5036 32 4928 9996 0x270c gcc/boot-unhacked.o 5032 29 4928 9989 0x2705 gcc/boot2.o In other words, the editing saves a whole 4 bytes of text and 3 bytes of data for gcc. It also saves a whole 3 bytes of data for clang, and you just broke this :-). (I would be surprised if the data isn't padded back to a multiple of 4 anyway.) After the reductions, the result with gcc is 86 bytes smaller than with clang, instead of only 82 bytes smaller. It is silly doing such small optimizations specially. The editing actually removes 22 alignment statements, all spelled "^<tab>\.align<space>4$'. All of these are for data. -Wa,-al shows that the padding is 2 instances of 2 bytes and 1 of 3 bytes. At least half of the padding is for pessimal layout which is also a style bug in the source code. Arrays of characters are mixed with 32-bit variables. Style(9) says to sort on size (it means on alignment/ element size), to minimize padding (mainly for structs) and get a conistent order as a side effect. The padding usually doesn't matter except for structs. After fixing the source code, there would be just 1 odd byte in a character array that normally causes 3 bytes of padding after it. It takes considerably more magic to avoid getting this padding anyway in the link step, In the old a.out boot blocks, the linker forces 16-byte alignment at the end of every object file. The old boot blocks put most of the data in a single object file to avoid an aerage of 8 bytes of padding between files. Elf is more flexible, and usually gives 4-byte alignment. I think it is possible to match an odd byte at the end of 1 object file with an odd 3 bytes at the beginning of the next object file, but this usually doesn't happen. The new boot2 uses hacks like including ufsread.c to avoid having lots of object files. I use the old a.out boot blocks updated to elf and EDD and have squeezed them more extensively. The squeezing is also silly since I could very easily switch to 64K ones (much larger than that is not easy, since they have to fit below 640K and a few multiples of 64K are already used for buffers). The limit on 8K is mainly a historical mistake. A limit of 7.5K simplified booting from 15-sector floppies. 18-sector floppies allowed easy expansion to 9K, but were unportable for a small gain. The default partition layout left only 8K below the ffs superblock, but was only a default. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20170328141213.T927>