Date: Mon, 14 Jan 2013 18:24:04 +0000
From: David Chisnall <theraven@FreeBSD.org>
To: Jilles Tjoelker <jilles@stack.nl>
Cc: toolchain@FreeBSD.org, John Baldwin <jhb@FreeBSD.org>, freebsd-arch@FreeBSD.org
Subject: Re: Fast sigblock (AKA rtld speedup)
Message-ID: <D6772A0E-FBA4-4168-B152-7E7694720A16@FreeBSD.org>
In-Reply-To: <20130114174703.GB88220@stack.nl>
References: <20130107182235.GA65279@kib.kiev.ua> <20130112053147.GH2561@kib.kiev.ua> <20130112162547.GA54954@stack.nl> <201301141106.07976.jhb@freebsd.org> <20130114174703.GB88220@stack.nl>

On 14 Jan 2013, at 17:47, Jilles Tjoelker wrote:

> The code which does that check is actually under contrib/gcc. Problem
> is, they designed __gthread_active_p() to distinguish threaded and
> unthreaded programming environments -- it must be known in advance and
> cannot be changed later. The code for the unthreaded environment then
> takes advantage of this by not even allocating memory for mutexes in
> some cases.

It's worth taking a step back and asking why this code exists at all, and the main reason is that acquiring a mutex used to be really expensive. It still is on some fruit-flavoured operating systems, but elsewhere it's a single atomic operation in the uncontended case, and in that case the cache line will already be exclusively owned by the calling core in single-threaded code.

I would much rather that we followed the example of Solaris and made the multithreaded case fast and the default than keep piling on hacks that allow code to shave off a few clock cycles in the single-threaded case. In particular, the popularity of multicore systems means that it is increasingly rare for code to be both single threaded and performance critical, so this seems like misplaced optimisation.

I strongly suspect that making it possible to inline the uncontended lock case for a pthread mutex and eliminating all of the branches on __isthreaded would give us a net speedup in both single and multithreaded cases.

> This __gthread_active_p() thing is another barrier to bringing in a
> threaded plugin in an unthreaded application. Ports people spend a fair
> amount of time adding -pthread flags to things (such as perl) to work
> around this.

This and the similar checks in libc cause a lot of pain, and it seems that the correct fix is ensuring that the performance penalty for linking libthr is so small that there is no point in avoiding it.
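
To make the "single atomic operation in the uncontended case" point concrete, here is a minimal C11 sketch of an inlinable lock fast path. It is an illustration only, not libthr's implementation: sketch_mutex, sketch_lock, sketch_lock_slow and sketch_unlock are hypothetical names, and the slow path simply spins where a real one would block in the kernel.

```c
#include <stdatomic.h>

/* Hypothetical lock word for illustration; not the libthr mutex layout. */
struct sketch_mutex {
    atomic_uint owner;          /* 0 = unlocked, otherwise the owner's id */
};

/* Contended path, kept out of line. ASSUMPTION: a real implementation
 * would sleep in the kernel here instead of spinning. */
static void sketch_lock_slow(struct sketch_mutex *m, unsigned self)
{
    unsigned expected = 0;
    while (!atomic_compare_exchange_weak_explicit(&m->owner, &expected, self,
               memory_order_acquire, memory_order_relaxed))
        expected = 0;           /* CAS overwrote it with the current owner */
}

/* Fast path: one compare-and-swap, with no branch on a "threaded?" flag.
 * Small enough to inline at every call site. */
static inline void sketch_lock(struct sketch_mutex *m, unsigned self)
{
    unsigned expected = 0;
    if (atomic_compare_exchange_strong_explicit(&m->owner, &expected, self,
            memory_order_acquire, memory_order_relaxed))
        return;                 /* uncontended: lock acquired */
    sketch_lock_slow(m, self);
}

/* Release the lock. A real unlock would also wake any sleeping waiters. */
static inline void sketch_unlock(struct sketch_mutex *m)
{
    atomic_store_explicit(&m->owner, 0, memory_order_release);
}
```

In a single-threaded program every call takes the fast path, so the cost is the one compare-and-swap on a cache line the calling core already owns.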

David
