Date: Mon, 30 Jan 2012 16:26:31 -0500
From: David Schultz <das@freebsd.org>
To: David Chisnall <theraven@freebsd.org>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject: Re: svn commit: r227753 - in head: contrib/gdtoa include lib/libc/gdtoa lib/libc/gen lib/libc/locale lib/libc/regex lib/libc/stdio lib/libc/stdlib lib/libc/stdtime lib/libc/string
Message-ID: <20120130212631.GA45106@zim.MIT.EDU>
In-Reply-To: <318F6CAA-D37E-49E9-A147-E21DE803EFB7@FreeBSD.org>
References: <201111201445.pAKEjgNR096676@svn.freebsd.org> <20120118190714.GA13375@zim.MIT.EDU> <318F6CAA-D37E-49E9-A147-E21DE803EFB7@FreeBSD.org>

On Mon, Jan 30, 2012, David Chisnall wrote:
> On 18 Jan 2012, at 19:07, David Schultz wrote:
>
> > This patch appears to cause a large performance regression. For
> > example, I measured a 78% slowdown for strtol(" 42", ...).
>
> That's definitely worth taking a closer look at. I think we can
> cache some things in TLS and avoid some pthread_getspecific calls.
> The current code is the 'make it work' version. The 'make it fast'
> version is planned...

Sounds good; I look forward to it.

> > Furthermore, the resulting static binary for a trivial program
> > goes from 7k to 303k, due to pulling in malloc, stdio, and all the
> > pthread stubs.
>
> That's not ideal, but I'm not sure if it's avoidable. Is statically
> linking libc something people regularly do?

Aside from bde, probably not many. This is definitely a second-order
concern. FreeBSD has a set of statically linked binaries in /rescue
for situations where /lib gets screwed up. Space is an issue there
because the root partition is historically sized quite small.
Embedded folks might also care, but I'll let them speak for
themselves. I did get a request several years ago from an embedded
developer to unbreak the NO_FLOATING_POINT option in libc, and you
could imagine perhaps a NO_LOCALE option as well.

> Yup. A quick-and-dirty hack would be to add a flag that was set on
> the first call to uselocale() and to always use the global locale if
> this is not set. That should remove a lot of the overhead in cases
> where no one uses the per-thread locales.
>
> We can also probably store the locale in TLS, which (on platforms
> with fast TLS) should speed up the lookup a bit.

I thought that's what thread_(get|set)_locale already did. Actually,
it's counterintuitive that it would be significantly slower to access
per-thread state than global state. Any idea why? Maybe it says
something about our pthread_getspecific() implementation. I will run
the code through a profiler some day, but I don't have the cycles
right now.
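
For concreteness, here is roughly how I read the flag-plus-TLS idea
quoted above. This is only a sketch under my own assumptions, not the
code in the tree: every sketch_* name is hypothetical, the locale
struct is a stand-in, it relies on C11 _Thread_local and <stdatomic.h>
instead of pthread_getspecific(), and passing the global locale object
stands in for LC_GLOBAL_LOCALE.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for libc's internal locale structure. */
struct sketch_locale {
	const char *name;
};

/* The process-wide locale that setlocale() would normally modify. */
static struct sketch_locale sketch_global_locale = { "C" };

/* Set on the first call that installs a locale; never cleared. */
static atomic_bool sketch_per_thread_used;

/* Per-thread locale kept in TLS; NULL means "use the global locale". */
static _Thread_local struct sketch_locale *sketch_thread_locale;

/*
 * Simplified analogue of uselocale(): installs loc for the calling
 * thread (NULL only queries) and returns the previously active locale.
 */
struct sketch_locale *
sketch_uselocale(struct sketch_locale *loc)
{
	struct sketch_locale *old = sketch_thread_locale != NULL ?
	    sketch_thread_locale : &sketch_global_locale;

	if (loc != NULL) {
		atomic_store(&sketch_per_thread_used, true);
		/* Installing the global locale reverts to the shared state. */
		sketch_thread_locale =
		    (loc == &sketch_global_locale) ? NULL : loc;
	}
	return (old);
}

/*
 * Lookup used by locale-aware functions. While no thread has ever
 * installed a per-thread locale, the cost is one relaxed load.
 */
struct sketch_locale *
sketch_current_locale(void)
{
	if (!atomic_load_explicit(&sketch_per_thread_used,
	    memory_order_relaxed))
		return (&sketch_global_locale);
	return (sketch_thread_locale != NULL ?
	    sketch_thread_locale : &sketch_global_locale);
}
```

A caller such as strtol() would call sketch_current_locale() once on
entry. Whether this actually recovers the lost performance is exactly
what a profiler run would have to show.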