Date: Wed, 23 Nov 2011 10:37:14 +0000
From: David Chisnall <theraven@FreeBSD.org>
To: David Schultz <das@FreeBSD.org>
Cc: src-committers@FreeBSD.org, Eitan Adler <eadler@FreeBSD.org>, svn-src-all@FreeBSD.org, dim@FreeBSD.org, Brooks Davis <brooks@FreeBSD.org>, bde@FreeBSD.org, svn-src-head@FreeBSD.org
Subject: Re: svn commit: r227812 - head/lib/libc/string
Message-ID: <0DC88C34-91B4-49D1-AA8A-73B14C99D35B@FreeBSD.org>
In-Reply-To: <20111122202735.GA21442@zim.MIT.EDU>
References: <201111220250.pAM2oPWC070856@svn.freebsd.org> <20111122153332.GA20145@zim.MIT.EDU> <CAF6rxgmPeZCZ3c0xbd-4riqvLHob8U9eWG25R8P6FG2BjTfyyA@mail.gmail.com> <20111122202735.GA21442@zim.MIT.EDU>

On 22 Nov 2011, at 20:27, David Schultz wrote:

> Benchmark or not, I think you'll have a very hard time finding a
> single real program that routinely calls strcasecmp() with
> identical pointers!

I've seen this pattern very often. Often the linker is able to combine constant strings defined in different compilation units. With link-time optimisation, there are also more opportunities for the compiler to do this.

A fairly common pattern is to define constant strings as macros in a header and then use them as keys in a dictionary, first hashed and then compared with strcmp(). In this case, the == check is a significant win. I've had to work around the fact that FreeBSD's libc is significantly slower than GNU libc in this instance by adding an extra == outside of strcmp() - this increases the size of the code everywhere this pattern is used, increasing cache usage, and lowering overall performance (and good luck coming up with a microbenchmark that demonstrates that - although I'd be happy to provide you with a Google-authored paper from a couple of years ago explaining why it's so hard to benchmark accurately on modern machines...).

It's also worth noting that the cost of the extra branch is more or less trivial, as every single character in the input strings will also need to be compared. This change turns a linear complexity case into a constant complexity case, so it's a clear algorithmic improvement for a case that, while rare, is not as improbable as you seem to suppose.

As to the | vs || issue - by all means change it to || if it fits better with the FreeBSD style. In the general case I prefer to use | to hint to the compiler and readers of the code that short-circuit evaluation is not required and to remove a sequence point and make life easier for the optimiser. In this case, the two are equivalent so it's just a hint to the reader, and apparently (judging by the responses so far) one that is not well understood.

David
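
For readers following the thread without the diff to hand, the two points above (the pointer-identity fast path and | versus ||) can be sketched roughly as below. This is an illustration only, not the committed code from r227812: the function name sketch_strncasecmp is hypothetical, and the loop is a plain byte-wise tolower() comparison written for this example.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical sketch, not FreeBSD's implementation: a bounded,
 * case-insensitive compare with a constant-time fast path.
 */
static int
sketch_strncasecmp(const char *s1, const char *s2, size_t n)
{
	const unsigned char *u1 = (const unsigned char *)s1;
	const unsigned char *u2 = (const unsigned char *)s2;

	/*
	 * Identical pointers or a zero length mean "equal" without
	 * reading a single character.  Both operands are side-effect
	 * free, so | and || yield the same result here; | simply does
	 * not ask for short-circuit evaluation.
	 */
	if ((s1 == s2) | (n == 0))
		return (0);

	do {
		int c1 = tolower(*u1);
		int c2 = tolower(*u2);

		if (c1 != c2)
			return (c1 - c2);
		if (*u1 == '\0')	/* both strings ended together */
			break;
		u1++;
		u2++;
	} while (--n != 0);

	return (0);
}
```

With a key such as `static const char *k = "content-type";`, a call like sketch_strncasecmp(k, k, 12) returns through the first branch, whereas two distinct buffers holding the same text still walk every character.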