Date: Sun, 11 Nov 2012 20:26:18 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Bruce Evans <brde@optusnet.com.au>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Dimitry Andric <dim@freebsd.org>, Nathan Whitehorn <nwhitehorn@freebsd.org>
Subject: Re: svn commit: r242835 - head/contrib/llvm/lib/Target/X86
Message-ID: <20121111182618.GV73505@kib.kiev.ua>
In-Reply-To: <20121112014417.O1675@besplex.bde.org>
References: <201211091856.qA9IuRxX035169@svn.freebsd.org> <509F2AA6.9050509@freebsd.org> <20121111214908.P938@besplex.bde.org> <509FB35F.1010801@FreeBSD.org> <20121112014417.O1675@besplex.bde.org>

On Mon, Nov 12, 2012 at 02:01:47AM +1100, Bruce Evans wrote:
> On Sun, 11 Nov 2012, Dimitry Andric wrote:
>
> > On 2012-11-11 12:53, Bruce Evans wrote:
> >>
> >> I'm not sure if either of us knows exactly what this does, but I like
> >> this change.  clang does stack alignment correctly, so it doesn't need
> >> the gcc pessimization of aligning in the caller.  clang aligns in the
>
> Apparently we did know.
>
> >> ...
> >> alignment.  gcc's alignment in callers doesn't even work, because gcc
> >> assumes that it is enough and never does it in callees even when it
> >> is necessary:
> >>
> >>     auto int32_t foo[8] __aligned[32];
> >>
> >> gcc fails to do the necessary alignment.  clang does it.
> >>
> >>     auto int32_t foo[8] __aligned[16];
> >>
> >> Both gcc and (hopefully only without the above fix) clang fail to do the
> >> necessary alignment.
> >
> > It works just fine now with clang.  For the first example, I get:
> >
> >         pushl   %ebp
> >         movl    %esp, %ebp
> >         andl    $-32, %esp
> >
> > as prolog, and for the second:
> >
> >         pushl   %ebp
> >         movl    %esp, %ebp
> >         andl    $-16, %esp
>
> Good.
>
> The andl executes very fast.  Perhaps not as fast as subl on %esp,
> because subl is normal so more likely to be optimized (they nominally
> have the same speeds, but %esp is magic).  Unfortunately, it seems to
> be impossible to both align the stack and reserve some space on it in
> 1 instruction -- the andl might not reserve any.

I think the biggest hit from the andl instruction is due to the spoiling
of the stack engine present on all Intel processors starting from the
Pentium Pro.  Most likely, the predictor cannot handle such a change of
%esp without throwing its hands up.

>
> >> special alignment in main(), but assumes that crtso did it.  clang also
> >> doesn't support -mpreferred-stack-boundary, so it is incompatible with
> >> nonstandard ABI's like the one given by always using
> >> -mpreferred-stack-boundary=32.
> >
> > Apparently upstream never saw the need for this option.  I strongly
> > doubt it is used very often outside FreeBSD...
>
> It is most useful for working around gcc's behaviour.  Hmm, the kernel
> and boot blocks uses -mpreferred-stack-boundary=2 on i386 to save space.
> This must have been turned off for clang, so the versions that did
> 16-bit alignment must have come close to blowing the kernel stack.  In
> boot blocks, I think there is plenty of stack but alignment wastes code
> space.
>
> I'd like to try -mpreferred-stack-boundary=3 in amd64 kernels, but this
> is an i386-only option.
>
> >> Yes, we need clang/our libraries to handle different alignments and
> >> not assume that callers do more than the ABI requires and pessimize
> >> on behalf of the libraries.  Outside of libraries, the problem is small
> >> provided -mpreferred-stack-boundary works, since you can compile
> >> everything with the same -mpreferred-stack-boundary.
> >
> > As far as I can see, with the 4 byte stack alignment that has been the
> > default, and is now also clang's default, our libraries should handle
> > any "incoming" stack alignment of >= 4 bytes.  I don't think anybody
> > uses lower stack alignment, except maybe for the special case of boot
> > loaders, or extremely size-optimized code.

The i386 ABI requires 4-byte alignment, but does not require anything
larger.  I do not see how we can even pretend to support a stable ABI
while allowing such gratuitous changes.

>
> -Os could reasonably generate lower stack alignment, but that rarely
> saves space.  -Os generally doesn't give enough control over alignment
> for space-time tradeoffs.  I made some minor changes in FreeBSD's gcc
> to make it _not_ give misaligned 32-bit variables (?) since -Os should
> only optimize for space if it doesn't cost much time for small savings
> in space.  But if you only care about space, it would be useful to
> minimize the alignment of everything.
>
> Bruce