Date: Tue, 23 Sep 2014 17:49:28 -0400 (EDT)
From: Daniel Eischen
To: Jilles Tjoelker
Cc: adrian@freebsd.org, freebsd-threads@freebsd.org
Subject: Re: sem_post() performance
In-Reply-To: <20140923212000.GA78110@stack.nl>
References: <20140921213742.GA46868@stack.nl> <1531724.MPBlj40xOW@ralph.baldwin.cx> <20140923212000.GA78110@stack.nl>
List-Id: Threading on FreeBSD

On Tue, 23 Sep 2014, Jilles Tjoelker wrote:

> On Mon, Sep 22, 2014 at 03:53:13PM -0400, John Baldwin wrote:
>> On Sunday, September 21, 2014 11:37:42 PM Jilles Tjoelker wrote:
>>> It has been reported that POSIX semaphores
>>> are slow, in contexts such as
>>> Python. Note that POSIX semaphores are the only synchronization objects
>>> that support use by different processes in shared memory; this does not
>>> work for mutexes and condition variables because they are pointers to
>>> the actual data structure.
>
>>> In fact, sem_post() unconditionally performs an umtx system call.
>
>> *sigh* I was worried that that might be the case.
>
>>> To avoid both lost wakeups and possible writes to a destroyed semaphore,
>>> an uncontested sem_post() must check the _has_waiters flag atomically
>>> with incrementing _count.
>
>>> The proper way to do this would be to take one bit from _count and
>>> use it for the _has_waiters flag; the definition of SEM_VALUE_MAX
>>> permits this. However, this would require a new set of umtx
>>> semaphore operations and will break ABI of process-shared semaphores
>>> (things may break if an old and a new libc access the same semaphore
>>> over shared memory).
>
>>> This diff only affects 32-bit aligned but 64-bit misaligned
>>> semaphores on 64-bit systems, and changes _count and _has_waiters
>>> atomically using a 64-bit atomic operation. It probably needs a
>>> may_alias attribute for correctness, but does not have
>>> a wrapper for that.
>
>> It wasn't clear on first reading, but you are using aliasing to get
>> around the need for new umtx calls by using a 64-bit atomic op to
>> adjust two ints at the same time, yes? Note that since a failing
>> semaphore op calls into the kernel for the "hard" case, you might in
>> fact be able to change the ABI without breaking process-shared
>> semaphores. That is, suppose you left 'has_waiters' as always true
>> and reused the high bit of count for has_waiters.
>
>> Would old binaries always trap into the kernel? (Not sure they will,
>> especially the case where an old binary creates the semaphore, a new
>> binary would have to force has_waiters to true in every sem op, but
>> even that might not be enough.)
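
For reference, the "take one bit from _count" idea quoted above could look
roughly like the sketch below. This is only an illustration under stated
assumptions, not the libc code or the posted diff: struct xsem, the XSEM_*
macros and xsem_wake() are hypothetical names, the flag is assumed to live
in bit 31, and the kernel wakeup is replaced by a counter. The point is
that a single compare-and-swap both increments the count and observes the
waiters flag, so the uncontested post needs no system call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout (not FreeBSD's struct _usem): bit 31 of the count
 * word is the "has waiters" flag, bits 0..30 hold the semaphore value. */
#define XSEM_WAITERS   0x80000000u
#define XSEM_VALUE_MAX 0x7fffffffu   /* assumed limit that leaves bit 31 free */
#define XSEM_COUNT(v)  ((v) & ~XSEM_WAITERS)

static_assert((XSEM_VALUE_MAX & XSEM_WAITERS) == 0,
    "count bits must not overlap the waiters flag");

struct xsem {
    _Atomic uint32_t word;
};

/* Stand-in for the kernel wakeup; it only counts how often it is called. */
static int xsem_wake_calls;

static void
xsem_wake(struct xsem *s)
{
    (void)s;
    xsem_wake_calls++;
}

/* A would-be sleeper sets the flag before going to sleep. */
static void
xsem_mark_waiter(struct xsem *s)
{
    atomic_fetch_or_explicit(&s->word, XSEM_WAITERS, memory_order_relaxed);
}

/* Returns 0 on success, -1 if the count is already at its maximum. */
static int
xsem_post(struct xsem *s)
{
    uint32_t old = atomic_load_explicit(&s->word, memory_order_relaxed);

    do {
        if (XSEM_COUNT(old) == XSEM_VALUE_MAX)
            return (-1);
        /* old + 1 cannot carry into bit 31 because count < max. */
    } while (!atomic_compare_exchange_weak_explicit(&s->word, &old, old + 1,
        memory_order_release, memory_order_relaxed));

    /* 'old' is the value the successful CAS replaced, so the flag was
     * sampled atomically with the increment. */
    if (old & XSEM_WAITERS)
        xsem_wake(s);
    return (0);
}

/* Returns true if the count was decremented, false if it was zero. */
static bool
xsem_trywait(struct xsem *s)
{
    uint32_t old = atomic_load_explicit(&s->word, memory_order_relaxed);

    do {
        if (XSEM_COUNT(old) == 0)
            return (false);
    } while (!atomic_compare_exchange_weak_explicit(&s->word, &old, old - 1,
        memory_order_acquire, memory_order_relaxed));
    return (true);
}
```

The sketch leaves out who clears the flag and how the sleeper re-checks the
count before blocking; those are the parts that would need the new umtx
operations mentioned in the quoted text.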
>
> I think that everything will break when binaries linked to old and new
> libcs use the same semaphore. If the new contested bit is set, the old
> sem_getvalue() will return garbage, the old sem_trywait() will fail even
> if the real count is greater than 0, the old sem_wait() and
> sem_timedwait() may spin if the real count is greater than 0 and the old
> sem_post() will fail with [EOVERFLOW].
>
> That the "hard" path always issues a system call does not help much,
> since the system calls do not write to _count (this is a throughput
> optimization, allowing a fast-path thread through while a slow-path
> thread is entering or leaving the kernel).

[ ... ]

> Consideration: just declare mixing process-shared semaphores with
> sufficiently different libc unsupported, and change SEM_MAGIC to enforce
> that? (This does not prevent running old binaries, as long as they're
> dynamically linked to libc and you use a new libc.so.)

Yes and yes :-)  And we need to add such a magic or version
number to our mutex and CVs when we convert their types
from pointers to actual structs.

--
DE
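
To make the magic-number point concrete, here is a small sketch of how a
changed magic could keep an old layout and a new layout from being mixed,
and of why an old reader would see garbage once the flag bit is in use.
Everything here is hypothetical: struct vsem, the VSEM_* values and the
function names are placeholders for illustration, not the actual SEM_MAGIC
value or libc internals.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder magic values for the two layouts (assumed, not libc's). */
#define VSEM_MAGIC_OLD 0x73656d31u
#define VSEM_MAGIC_NEW 0x73656d32u
#define VSEM_WAITERS   0x80000000u   /* assumed flag bit in the new layout */

static_assert(VSEM_MAGIC_OLD != VSEM_MAGIC_NEW,
    "the two layouts must be distinguishable");

struct vsem {
    uint32_t magic;
    uint32_t word;   /* new layout: bit 31 is the waiters flag */
};

/* New-libc entry check: refuse semaphores that carry another magic. */
static int
vsem_check(const struct vsem *s)
{
    if (s->magic != VSEM_MAGIC_NEW) {
        errno = EINVAL;
        return (-1);
    }
    return (0);
}

/* What a reader that predates the flag bit would report: the whole word. */
static int
vsem_getvalue_old(const struct vsem *s)
{
    return ((int)s->word);
}

/* What a reader that knows about the flag bit reports. */
static int
vsem_getvalue_new(const struct vsem *s)
{
    return ((int)(s->word & ~VSEM_WAITERS));
}
```

With the flag set and a real count of 3, the new reader reports 3 while
the old reader reports something else entirely, which is why rejecting the
mismatch up front through the magic looks preferable to trying to stay
compatible.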