From: John Baldwin <jhb@freebsd.org>
To: dima <_pppp@mail.ru>
Cc: freebsd-arch@freebsd.org
Date: Tue, 7 Mar 2006 09:30:14 -0500
Subject: Re: wakeup idea...
Message-Id: <200603070930.16715.jhb@freebsd.org>

On Monday 06 March 2006 17:29, dima wrote:
> >>>> Here is a possibly stupid idea.
> >>>>
> >>>> Historically sleep/wakeup have happened on a pointer which was just
> >>>> a magic number.
> >>>>
> >>>> In many cases, this pointer is actually a relevant data structure.
> >>>>
> >>>> Would it possibly be an optimization to make a variant of the
> >>>> sleep/wakeup calls where the pointer points to an integer type which
> >>>> contains non-zero if anybody is actually sleeping on that address?
> >>>>
> >>>> Anybody up for a quick prototype?
> >>>
> >>> In principle this is part of the point of a condition variable, which
> >>> associates a struct with waitable conditions, and includes an int
> >>> that contains the number of waiters:
> >>>
> >>> struct cv {
> >>>         const char      *cv_description;
> >>>         int             cv_waiters;
> >>> };
> >>>
> >>> Presumably the tricky bit is optimizing this such that you avoid
> >>> undesirable races.  (But maybe if you call cv_signal without the
> >>> condition mutex held, you accept that race by definition?)
> >>
> >> I'm working on a new implementation of SX locks:
> >> http://wikitest.freebsd.org/moin.cgi/Usable_20lock_20implementation_20with_20SX_2dsemantics
> >> Direct handoff (as described in "Solaris Internals") seems to do the
> >> trick.  You just don't need any additional mutex for the wakeup
> >> process, since the current owner chooses whom to wake up and when.
> >> Since CVs and SX locks have very similar semantics, I describe my
> >> ideas here.  We would need two list-like data structures:
> >> 1. Current holders (for proper priority propagation)
> >> 2. Current waiters (for direct handoff)
> >> The first one can be implemented as an array; the reason for that
> >> design decision is to avoid locking on SX acquire.  The second list
> >> needs to be implemented as a queue sorted by priority (a TAILQ for
> >> now), which needs a mutex for consistency.
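
(As a concrete strawman of the two lists you describe -- every name and the
fixed array size below are made up for the example, this is not from your
patch or from anything in the tree, and it assumes the usual kernel headers
such as <sys/param.h>, <sys/lock.h>, <sys/mutex.h>, <sys/proc.h>, and
<sys/queue.h> -- I picture roughly:

	#define	SX_MAX_CONCURRENT_HOLDERS	32	/* arbitrary cap for the sketch */

	struct sx_holder {
		struct thread	*thr;	/* owning thread, NULL when the slot is free */
		int		count;	/* recursion count for that thread */
	};

	struct sx_waiter {
		struct thread		*thr;	/* thread blocked on the lock */
		TAILQ_ENTRY(sx_waiter)	link;	/* linkage in the waiter queue */
	};

	struct sx_sketch {
		volatile uint32_t	flags;		/* state word, touched with atomic(9) ops */
		struct sx_holder	holders[SX_MAX_CONCURRENT_HOLDERS];
							/* 1. current holders, scanned without a lock */
		struct mtx		waiters_mtx;	/* protects the waiter queue */
		TAILQ_HEAD(, sx_waiter)	waiters;	/* 2. waiters, kept sorted by priority */
	};

The point of the split being that the holder array is only ever touched with
atomics on the acquire path, while the waiter queue hides behind its own
mutex, which matches what you outline.)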
> > You might want to look at src/sys/kern/kern_rwlock.c in CVS HEAD.
>
> Sorry for my ignorance; Gleb Smirnoff pointed that out privately already.
> I looked at the code and it seems very much like the OpenSolaris
> implementation to me.  You can't propagate priority properly if you don't
> record all of the current lock holders somewhere.

Yes, that is on purpose.  You have to justify the overhead of keeping track
of all the owners against the priority propagation for the read-lock case.
Actually, Solaris does keep track of one read owner (something I still plan
to do or have someone else do), and they found that that handled many of
the common cases w/o adding a significant amount of extra overhead.  You
should also create a different lock primitive (you can use the rwlock API
if you want) rather than changing the behavior of sx locks, because sx
locks allow voluntary sleep and that is at odds with priority propagation
(at least as those two things are currently implemented in FreeBSD).

> I tried to implement it as an array in my effort.  As Attilio Rao
> mentioned, this allows only a fixed number of concurrent readers at a
> time (which is not quite right, but the limit could be made a per-lock
> sysctl tunable).  I chose an array to avoid introducing one more mutex
> to the SX lock on the "short path".  The code might look like the
> following:
>
> u_int32_t old_flags = atomic_load_acq_32( &sx->flags );

You can just use a simple assignment; I don't think you need an acquire
barrier on your load here.

> if( atomic_cmpset_acq_32( &sx->flags, old_flags,
>         old_flags + SX_FLAG_READERS_IN ) ) {
>     for( i = 0; i < SX_MAX_CONCURRENT_HOLDERS; ++i ) {
>         if( sx->holders[i]->thr == curthread ) {
>             atomic_add_acq_int( &sx->holders[i]->count, 1 );
>             ++count;
>             break;
>         }
>         else {
>             if( !sx->holders[i]->thr ) {
>                 atomic_cmpset_acq_ptr( ( unsigned * )&sx->holders[i]->thr,
>                     0, ( unsigned )holder );
>                 atomic_set_acq_int( &sx->holders[i]->count, 1 );
>                 ++count;
>                 break;

You should either just use a simple store, or you should be checking the
return value of atomic_cmpset() in your if (a quick sketch of what I mean
is at the end of this mail).

>             }
>         }
>     }
>     if( i >= SX_MAX_CONCURRENT_HOLDERS ) {
>         atomic_subtract_rel_32( &sx->flags, SX_FLAG_READERS_IN );
>     }
> }
>
> This code is buggy, though.  The array consisted of u_int32_t values at
> first, but it is hard to handle the recursive case that way.  Introducing
> a structure makes it impossible to handle the array atomically.  I'm
> still stuck at this point and on some other issues (removing the locks on
> thread_exit is the largest one among them).

No thread should have a lock when it exits.  You should be removing the
thread entries when they unlock, and then the thread_exit case will be
taken care of (we already have witness checks to make sure exiting threads
don't hold locks).

> BTW atomic_*_ptr doesn't work as expected.  I need to convert the values
> to "unsigned *" to make gcc happy with "-Wall -pedantic" flags.

You need to read the man page. :)  atomic_*_ptr operate on uintptr_t
variables rather than void * because volatile uintptr_t * !=
volatile void **.

-- 
John Baldwin <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
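
P.S.  Just to make those last two points concrete, here is a rough sketch
(using the hypothetical field names from your snippet, not anything from
the tree) of a plain load, a checked atomic_cmpset(), and the uintptr_t
form of the pointer ops, as they would sit inside your holder-scan loop:

	/* A plain load is fine just to pick the "expected" value. */
	uint32_t old_flags = sx->flags;

	/*
	 * Check the return value of the cmpset instead of assuming it
	 * succeeded, and pass uintptr_t values, since the atomic_*_ptr
	 * ops operate on volatile uintptr_t rather than void *.
	 */
	if (atomic_cmpset_acq_ptr((volatile uintptr_t *)&sx->holders[i].thr,
	    (uintptr_t)0, (uintptr_t)curthread)) {
		/* We won this slot; no other thread will touch it now. */
		sx->holders[i].count = 1;
		break;
	}
	/* Otherwise another reader raced us to the slot; keep scanning. */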