Date: Tue, 14 Oct 2014 22:36:38 +0200
From: Jilles Tjoelker <jilles@stack.nl>
To: John Baldwin <jhb@freebsd.org>
Cc: adrian@freebsd.org, freebsd-threads@freebsd.org
Subject: Re: sem_post() performance
Message-ID: <20141014203638.GA23965@stack.nl>
In-Reply-To: <17114533.JBQOYsdsdz@ralph.baldwin.cx>
References: <20140921213742.GA46868@stack.nl> <17114533.JBQOYsdsdz@ralph.baldwin.cx>
On Mon, Oct 13, 2014 at 05:35:09PM -0400, John Baldwin wrote:
> On Sunday, September 21, 2014 11:37:42 PM Jilles Tjoelker wrote:
> > It has been reported that POSIX semaphores are slow, in contexts such
> > as Python. Note that POSIX semaphores are the only synchronization
> > objects that support use by different processes in shared memory; this
> > does not work for mutexes and condition variables because they are
> > pointers to the actual data structure.

> > In fact, sem_post() unconditionally performs an umtx system call.

> > To avoid both lost wakeups and possible writes to a destroyed
> > semaphore, an uncontested sem_post() must check the _has_waiters flag
> > atomically with incrementing _count.

> > The proper way to do this would be to take one bit from _count and use
> > it for the _has_waiters flag; the definition of SEM_VALUE_MAX permits
> > this. However, this would require a new set of umtx semaphore
> > operations and will break ABI of process-shared semaphores (things may
> > break if an old and a new libc access the same semaphore over shared
> > memory).

> Have you thought more about pursuing this option?  I think there was a
> general consensus from earlier in the thread to just break the ABI (at
> least adjust SEM_MAGIC to give some protection) and fix it.

I think this is a good direction but I haven't gotten around to it yet.

> > This diff only affects 32-bit aligned but 64-bit misaligned semaphores
> > on 64-bit systems, and changes _count and _has_waiters atomically
> > using a 64-bit atomic operation. It probably needs a may_alias
> > attribute for correctness, but <sys/cdefs.h> does not have a wrapper
> > for that.

> It does have one bug:

> > +	if (atomic_cmpset_rel_64((uint64_t *)&sem->_kern._count,
> > +	    oldval, newval))

> This needs to be '&_has_waiters'.  Right now it changes _count and
> _flags, but not _has_waiters.

This is probably because I was mistaken about the order of _count and
_has_waiters, and only partially corrected that.

Anyway, the strange alignment requirements make the patch of little
practical use.

-- 
Jilles Tjoelker
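
P.S. A minimal sketch of the single-word layout discussed above, assuming
the top bit of _count is reserved as the has-waiters flag so that an
uncontested post needs only one 32-bit compare-and-set and no system
call. The names USEM_HAS_WAITERS, usem_post() and _usem_wake() are
illustrative assumptions, not the existing umtx API; the real change
would still need new umtx operations and a SEM_MAGIC bump, and libc
would use its own atomics rather than <stdatomic.h>:

#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define	USEM_HAS_WAITERS 0x80000000u	/* assumed: top bit marks sleeping waiters */
#define	USEM_MAX_VALUE	 0x7fffffffu	/* SEM_VALUE_MAX fits in the low 31 bits */

/* Assumed wrapper around a new umtx "wake semaphore" operation. */
int _usem_wake(_Atomic uint32_t *word);

int
usem_post(_Atomic uint32_t *word)
{
	uint32_t old, new;

	old = atomic_load_explicit(word, memory_order_relaxed);
	do {
		if ((old & USEM_MAX_VALUE) == USEM_MAX_VALUE) {
			errno = EOVERFLOW;
			return (-1);
		}
		new = old + 1;
		/*
		 * Release ordering publishes the data protected by the
		 * semaphore; because the flag bit lives in the same word
		 * as the count, the has-waiters check below is atomic
		 * with the increment, avoiding the lost-wakeup and
		 * use-after-destroy races described above.
		 */
	} while (!atomic_compare_exchange_weak_explicit(word, &old, new,
	    memory_order_release, memory_order_relaxed));

	/* Enter the kernel only when a waiter may be sleeping. */
	if ((old & USEM_HAS_WAITERS) != 0)
		return (_usem_wake(word));
	return (0);
}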