Date: Sat, 21 Mar 2015 16:23:18 +0200 From: Konstantin Belousov <kostikbel@gmail.com> To: Bruce Evans <brde@optusnet.com.au> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, John Baldwin <jhb@freebsd.org> Subject: Re: svn commit: r280279 - head/sys/sys Message-ID: <20150321142318.GI2379@kib.kiev.ua> In-Reply-To: <20150321205449.H1846@besplex.bde.org> References: <201503201027.t2KAR6Ze053047@svn.freebsd.org> <20150320130216.GS2379@kib.kiev.ua> <20150321085923.U1046@besplex.bde.org> <20150321005456.GC2379@kib.kiev.ua> <20150321205449.H1846@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, Mar 21, 2015 at 09:33:22PM +1100, Bruce Evans wrote: > On Sat, 21 Mar 2015, Konstantin Belousov wrote: > How does the unconditional use of popcntq (in inline asm that is called > from amd64 pmap) even work? It is not unconditional. The pmap code does the following: if ((cpu_feature2 & CPUID2_POPCNT) == 0) { free = popcnt_pc_map_elem(pc->pc_map[0]); ... } else { free = popcntq(pc->pc_map[0]); ... } > > jhb's cleanups apparently missed this inline asm, while previous work > should not have added it. This inline asm should never have existed, > since compilers can generate the same code with less-incorrect > configuration. Correct configuration would be messy. jhb's cleanups > depend on a compiler predefine (__POPCNT__) to determine if the > instruction can be used. But this definition is static and is usually > wrong, since compilers default to plain amd64 which doesn't have the > instruction. Dynamic configuration would have to use cpuid just to > determine if the instruction exists. Exactly, this is what supposed to be done, and is done in the pmap. It is fine for somebody targeting specific machine to change -march, but it cannot be used to optimize the generic kernel which is shipped. > > >>> Jilles noted that gcc head and 4.9.2 already provides a workaround by > >>> xoring the dst register. I have some patch for amd64 pmap, see the end > >>> of the message. > > I thought that that couldn't work since it uses the instruction > unconditionally. But the old code already does that. > > >> IIRC, then patch never never uses asm, but intentionally uses the popcount > >> builtin to avoid complications. > > popcntq() in amd64/include/cpufunc.h does use asm, but is missing the > complications needed for when it the instruction is not available. > All callers (just 3 in pmap) seem to be broken, since they are also > missing the complications. See above. > > Still, it is weird to provide functions from the sys/types.h namespace, > > and even more weird to provide such special-purpose function. > > Not in the sys/types.h, but in the implementation namespace :-). Since > no one should depend on the implementation details, we can move the > details to a better place when one is known. Better places would be > something like libkern for userland to hold a large collection of > functions, and a Standard place for single functions. I expect the > Standard place will be broken as designed for compatibility. Probably > strings.h alongside ffs(). strings.h sounds less surprising than sys/types.h > > The special (inline/asm) functions are in sys/types.h on amd64 are > currently limited to just 3 bswap functions and some new popcnt functions.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150321142318.GI2379>