Date: Sat, 12 Jan 2013 23:16:40 +0200 From: Konstantin Belousov <kostikbel@gmail.com> To: Jilles Tjoelker <jilles@stack.nl> Cc: arch@freebsd.org, toolchain@freebsd.org Subject: Re: Fast sigblock (AKA rtld speedup) Message-ID: <20130112211640.GN2561@kib.kiev.ua> In-Reply-To: <20130112162547.GA54954@stack.nl> References: <20130107182235.GA65279@kib.kiev.ua> <20130111095459.GZ2561@kib.kiev.ua> <50EFE830.3030500@freebsd.org> <20130111204938.GD2561@kib.kiev.ua> <20130111232906.GA29017@stack.nl> <20130112053147.GH2561@kib.kiev.ua> <20130112162547.GA54954@stack.nl>
next in thread | previous in thread | raw e-mail | index | archive | help
--4cokgWgqjr3t8EL1 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Jan 12, 2013 at 05:25:47PM +0100, Jilles Tjoelker wrote: > On Sat, Jan 12, 2013 at 07:31:48AM +0200, Konstantin Belousov wrote: > > On Sat, Jan 12, 2013 at 12:29:06AM +0100, Jilles Tjoelker wrote: > > > On Fri, Jan 11, 2013 at 10:49:38PM +0200, Konstantin Belousov wrote: > > > > http://people.freebsd.org/~kib/misc/rtld-sigblock.3.patch >=20 > > > The new fields td_sigblock_ptr and td_sigblock_val are in the part th= at > > > is zeroed for new threads, while the code in rtld appears to expect t= hem > > > to be copied (on fork, vfork and pthread_create). The fields are > > > correctly zeroed on exec. > > Thank you for noting this. Should be fixed in > > http://people.freebsd.org/~kib/misc/rtld-sigblock.4.patch >=20 > > > Sharing the magic variable between threads means that one thread hold= ing > > > an rtld lock will block signals for all other threads as well. This is > > > different from how the normal signal mask works but I don't know whet= her > > > it may break things. However, the patch depends on it in some way > > > because sigtd() does not know about fast sigblock and will therefore > > > happily select a thread that is fast-sigblocking to handle a signal > > > directed at the process. >=20 > > Hm, I do not quite follow, at least not the first part of the paragraph. >=20 > > The fast sigblock pointer is per-thread, so it is not shared in the ker= nel. > > Regardless of the kernel side, rtld is only supposed to use the fast > > block in the default implementation of the rtld locks, which are overri= den > > by the libthr implementation on libthr initialization. There is also an > > explicit hand-off from the default locks to the external (libthr), and > > rtld cares to turn off fast sigblock mechanism when the handoff is > > performed. >=20 > > The selection of the thread for the delivery of signal in the mt process > > should not matter then, due to the mechanism not supposed to be used > > in the mt process. >=20 > OK, you are right. If multiple threads exist, the patched code disables > fast sigblock and the current unpatched code already postpones signal > handlers without sigprocmask() syscalls. >=20 > This suggests a different rather simpler change. Libthr can always use > its rtld lock implementation instead of only when multiple threads > exist. This avoids the sigprocmask() syscalls and should not be much > slower than the default implementation in rtld because that also > contains atomic operations (both the unpatched and the patched version). > People that care about performance of exceptions can then link in libthr > (in many cases, it is already linked in to allow for (the possibility > of) threading). >=20 > I have tested this and exceptions were indeed more than twice as fast in > my test program if I created an extra thread doing nothing than in the > fully single-threaded version (with or without libthr). >=20 > With that, I think fast sigblock is too much code and complication for a > niche case. Ok, thank you for the review. I am reverting the branch. I think a solution would need to wait while the work to merge libc and libthr is done. >=20 > Most of the extra atomics in multi-threaded applications are conditional > on __isthreaded (or can be made so); therefore, performance loss from > linking in libthr should be negligible in most cases. >=20 > > > Although libthr's postpone approach is somewhat ugly, it does not dep= end > > > on any non-standard kernel features and does not delay the default > > > action. Would it be possible to move that code to libc to make things > > > easier for rtld? It looks like this requires teaching libc about vari= ous > > > threading concepts, though. >=20 > > The concern there is that rtld would need to have a tight coupling > > with libc, and possibly with libthr too. >=20 > Libc could call _rtld_thread_init() during initialization, somewhat like > libthr does now. The problems here are in libc and libthr, not in rtld. > Before libc is ready, the current code can be used; if libc is always > initialized first, the sigprocmask() calls can be removed because there > is no way a signal handler can be installed that early. >=20 > --=20 > Jilles Tjoelker --4cokgWgqjr3t8EL1 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJQ8dK2AAoJEJDCuSvBvK1BYkMP/i+lV09PclmW2T/9PjvEm8rZ E4xLnrt4S5tBt1OZLqlSnSRsZ4KUUg0PfOIq7YODHkE6YjaonUb2wVcxa5hPHST3 ea+7waOb0ifKIBgVZ6pkQtHG3IbqRimiJrxVIg7jnz+BSGgA1vpz6shWLT9+aV/A 524lTC6zUiRj9JyAYDlevQ2VTDMAW/BcYS8vI9+9Qmg4nIxsSYTWLcpOlr/iJGCd fxp431WOv+XYlSaknzx+fEBrelXEEEfgY1DbW3BjYH/zqZvceZvQN4fCWmSlJ/uf 5dS86FiykyJb1KqKvQy00esVHIwsMlmhgb4BSzsNuhBPZZTvxlA69FkUONP/s9jb ht8j2FeCaGE3He80IaORWJCAjpPbdeYlBM1Q3D0Rwz3R8zXrjHGbBMhgEKXvEcq3 ZPouvvsuMX6NYlLlCVpgxTOfxmdN4tdCXZolGyb7VAMi9bjCSVrpEJlflxJAv65t JUvJ1+kvcCb7/Bb+oAvC35SpUB7tMX4zIS+0S9XkJCAk/9JF5PgLvKrbN+aRK5Bz bqEIE2gSU4QfVOCigJv3QAt7ZIFpGiRZe5mkcKximPuVaQRBNFKzrt1so2hxfuHE 5Sjsit8zujDlNGejtLM5/Yl+uJYoNick+icMk7/8ubCzQyPyS152ZkfOfax4S8// PI1LW0Kv6qMyukPKjJcA =FUvI -----END PGP SIGNATURE----- --4cokgWgqjr3t8EL1--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130112211640.GN2561>