Date: Tue, 4 Nov 2014 12:31:36 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Bruce Evans <brde@optusnet.com.au>
Cc: Hans Petter Selasky <hps@selasky.org>, Mateusz Guzik <mjguzik@gmail.com>, Mateusz Guzik <mjg@freebsd.org>, jmallett@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, svn-src-head@freebsd.org, Julian Elischer <julian@freebsd.org>
Subject: Re: svn commit: r274017 - head/sys/kern
Message-ID: <20141104103136.GS53947@kib.kiev.ua>
In-Reply-To: <20141104045159.E1605@besplex.bde.org>
References: <201411030746.sA37kpPu037113@svn.freebsd.org> <54573AEE.9010602@freebsd.org> <54573B87.7000801@freebsd.org> <54573CD2.1000702@selasky.org> <20141103092132.GH29497@dft-labs.eu> <20141103100847.GK53947@kib.kiev.ua> <20141104045159.E1605@besplex.bde.org>

On Tue, Nov 04, 2014 at 06:03:29AM +1100, Bruce Evans wrote:
> Why not just use a C99 VLA? It doesn't require any compiler magic except
> being a C99 compiler (I haven't seen any of those yet, but some approximate
> C99 for VLAs). I use this in delayed signal delivery code in libthr.
>
> You (kib) didn't like using alloca() when I tried to talk you into using
> it to get aligned structs for SSE long ago. __builtin_alloca() has more
> chance of forcing the alignment than VLAs, but this is fragile. FreeBSD
> still uses -mpreferred-stack-boundary on i386 with gcc. This option is
> broken (unsupported) with clang, but is less needed since clang does
> alignment better.

You did convince me to use alloca for the transient copy of the extended
FPU state in the signal code.

> How do the __builtin_alloca()'s in your FP code (recently synced to i386
> where alignment is more difficult by jhb) work? I think they give
> 16-byte alignment in all cases, but some cases seem to need 64-byte
> alignment and do this by hand. Old code seems to be little changed,
> so I think it still does 16-byte alignment by hand.

It does not require any alignment. All FPU accesses (storing and loading
the state) go to the properly aligned save area below the pcb for the
userspace state. Apart from handling alignment, using that save area is
also required (or it is hard to do any other way) for the XSAVEOPT
optimization to work. The allocated buffer is used only to put together
the scattered pieces before copyout or after copyin.
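
To illustrate the two ideas discussed above (aligning a stack buffer by
hand, and using a transient buffer only to assemble scattered pieces before
copying them out), here is a rough userland sketch. It is not the kernel
code: SAVE_ALIGN, align_up() and gather_pieces() are hypothetical names made
up for this example, a C99 VLA stands in for alloca(), and memcpy() stands
in for copyout().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical alignment, standing in for the 64-byte case mentioned above. */
#define	SAVE_ALIGN	64

/* Round a pointer up to the next multiple of align (a power of two). */
static void *
align_up(void *p, size_t align)
{
	uintptr_t u = (uintptr_t)p;

	assert((align & (align - 1)) == 0);
	return ((void *)((u + align - 1) & ~(uintptr_t)(align - 1)));
}

/*
 * Hypothetical helper: gather two scattered pieces into a transient,
 * hand-aligned stack buffer, then copy the assembled result to dst.
 * Returns 1 if the buffer start is SAVE_ALIGN-aligned, 0 otherwise.
 */
static int
gather_pieces(const void *a, size_t alen, const void *b, size_t blen,
    void *dst)
{
	size_t len = alen + blen;
	/* C99 VLA, over-allocated so that an aligned region always fits. */
	unsigned char raw[len + SAVE_ALIGN - 1];
	unsigned char *buf = align_up(raw, SAVE_ALIGN);

	memcpy(buf, a, alen);
	memcpy(buf + alen, b, blen);
	memcpy(dst, buf, len);		/* stands in for copyout() */
	return (((uintptr_t)buf % SAVE_ALIGN) == 0);
}
```

The over-allocation by SAVE_ALIGN - 1 bytes is what makes the by-hand
alignment independent of whatever stack alignment the compiler provides.
As noted above, the transient copy in the signal code does not actually
need this, since the FPU instructions only touch the save area.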