Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 12 Jan 2013 23:16:40 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Jilles Tjoelker <jilles@stack.nl>
Cc:        arch@freebsd.org, toolchain@freebsd.org
Subject:   Re: Fast sigblock (AKA rtld speedup)
Message-ID:  <20130112211640.GN2561@kib.kiev.ua>
In-Reply-To: <20130112162547.GA54954@stack.nl>
References:  <20130107182235.GA65279@kib.kiev.ua> <20130111095459.GZ2561@kib.kiev.ua> <50EFE830.3030500@freebsd.org> <20130111204938.GD2561@kib.kiev.ua> <20130111232906.GA29017@stack.nl> <20130112053147.GH2561@kib.kiev.ua> <20130112162547.GA54954@stack.nl>

next in thread | previous in thread | raw e-mail | index | archive | help

--4cokgWgqjr3t8EL1
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Sat, Jan 12, 2013 at 05:25:47PM +0100, Jilles Tjoelker wrote:
> On Sat, Jan 12, 2013 at 07:31:48AM +0200, Konstantin Belousov wrote:
> > On Sat, Jan 12, 2013 at 12:29:06AM +0100, Jilles Tjoelker wrote:
> > > On Fri, Jan 11, 2013 at 10:49:38PM +0200, Konstantin Belousov wrote:
> > > > http://people.freebsd.org/~kib/misc/rtld-sigblock.3.patch
>=20
> > > The new fields td_sigblock_ptr and td_sigblock_val are in the part th=
at
> > > is zeroed for new threads, while the code in rtld appears to expect t=
hem
> > > to be copied (on fork, vfork and pthread_create). The fields are
> > > correctly zeroed on exec.
> > Thank you for noting this. Should be fixed in
> > http://people.freebsd.org/~kib/misc/rtld-sigblock.4.patch
>=20
> > > Sharing the magic variable between threads means that one thread hold=
ing
> > > an rtld lock will block signals for all other threads as well. This is
> > > different from how the normal signal mask works but I don't know whet=
her
> > > it may break things. However, the patch depends on it in some way
> > > because sigtd() does not know about fast sigblock and will therefore
> > > happily select a thread that is fast-sigblocking to handle a signal
> > > directed at the process.
>=20
> > Hm, I do not quite follow, at least not the first part of the paragraph.
>=20
> > The fast sigblock pointer is per-thread, so it is not shared in the ker=
nel.
> > Regardless of the kernel side, rtld is only supposed to use the fast
> > block in the default implementation of the rtld locks, which are overri=
den
> > by the libthr implementation on libthr initialization. There is also an
> > explicit hand-off from the default locks to the external (libthr), and
> > rtld cares to turn off fast sigblock mechanism when the handoff is
> > performed.
>=20
> > The selection of the thread for the delivery of signal in the mt process
> > should not matter then, due to the mechanism not supposed to be used
> > in the mt process.
>=20
> OK, you are right. If multiple threads exist, the patched code disables
> fast sigblock and the current unpatched code already postpones signal
> handlers without sigprocmask() syscalls.
>=20
> This suggests a different rather simpler change. Libthr can always use
> its rtld lock implementation instead of only when multiple threads
> exist. This avoids the sigprocmask() syscalls and should not be much
> slower than the default implementation in rtld because that also
> contains atomic operations (both the unpatched and the patched version).
> People that care about performance of exceptions can then link in libthr
> (in many cases, it is already linked in to allow for (the possibility
> of) threading).
>=20
> I have tested this and exceptions were indeed more than twice as fast in
> my test program if I created an extra thread doing nothing than in the
> fully single-threaded version (with or without libthr).
>=20
> With that, I think fast sigblock is too much code and complication for a
> niche case.
Ok, thank you for the review. I am reverting the branch.

I think a solution would need to wait while the work to merge libc and
libthr is done.
>=20
> Most of the extra atomics in multi-threaded applications are conditional
> on __isthreaded (or can be made so); therefore, performance loss from
> linking in libthr should be negligible in most cases.
>=20
> > > Although libthr's postpone approach is somewhat ugly, it does not dep=
end
> > > on any non-standard kernel features and does not delay the default
> > > action. Would it be possible to move that code to libc to make things
> > > easier for rtld? It looks like this requires teaching libc about vari=
ous
> > > threading concepts, though.
>=20
> > The concern there is that rtld would need to have a tight coupling
> > with libc, and possibly with libthr too.
>=20
> Libc could call _rtld_thread_init() during initialization, somewhat like
> libthr does now. The problems here are in libc and libthr, not in rtld.
> Before libc is ready, the current code can be used; if libc is always
> initialized first, the sigprocmask() calls can be removed because there
> is no way a signal handler can be installed that early.
>=20
> --=20
> Jilles Tjoelker

--4cokgWgqjr3t8EL1
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iQIcBAEBAgAGBQJQ8dK2AAoJEJDCuSvBvK1BYkMP/i+lV09PclmW2T/9PjvEm8rZ
E4xLnrt4S5tBt1OZLqlSnSRsZ4KUUg0PfOIq7YODHkE6YjaonUb2wVcxa5hPHST3
ea+7waOb0ifKIBgVZ6pkQtHG3IbqRimiJrxVIg7jnz+BSGgA1vpz6shWLT9+aV/A
524lTC6zUiRj9JyAYDlevQ2VTDMAW/BcYS8vI9+9Qmg4nIxsSYTWLcpOlr/iJGCd
fxp431WOv+XYlSaknzx+fEBrelXEEEfgY1DbW3BjYH/zqZvceZvQN4fCWmSlJ/uf
5dS86FiykyJb1KqKvQy00esVHIwsMlmhgb4BSzsNuhBPZZTvxlA69FkUONP/s9jb
ht8j2FeCaGE3He80IaORWJCAjpPbdeYlBM1Q3D0Rwz3R8zXrjHGbBMhgEKXvEcq3
ZPouvvsuMX6NYlLlCVpgxTOfxmdN4tdCXZolGyb7VAMi9bjCSVrpEJlflxJAv65t
JUvJ1+kvcCb7/Bb+oAvC35SpUB7tMX4zIS+0S9XkJCAk/9JF5PgLvKrbN+aRK5Bz
bqEIE2gSU4QfVOCigJv3QAt7ZIFpGiRZe5mkcKximPuVaQRBNFKzrt1so2hxfuHE
5Sjsit8zujDlNGejtLM5/Yl+uJYoNick+icMk7/8ubCzQyPyS152ZkfOfax4S8//
PI1LW0Kv6qMyukPKjJcA
=FUvI
-----END PGP SIGNATURE-----

--4cokgWgqjr3t8EL1--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130112211640.GN2561>