From owner-freebsd-arch@FreeBSD.ORG Mon Mar 6 22:29:34 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2314216A420; Mon, 6 Mar 2006 22:29:34 +0000 (GMT) (envelope-from _pppp@mail.ru) Received: from f52.mail.ru (f52.mail.ru [194.67.57.87]) by mx1.FreeBSD.org (Postfix) with ESMTP id 508EB43D62; Mon, 6 Mar 2006 22:29:33 +0000 (GMT) (envelope-from _pppp@mail.ru) Received: from mail by f52.mail.ru with local id 1FGOCp-000BEU-00; Tue, 07 Mar 2006 01:29:31 +0300 Received: from [83.237.110.222] by koi.mail.ru with HTTP; Tue, 07 Mar 2006 01:29:31 +0300 From: dima <_pppp@mail.ru> To: John Baldwin Mime-Version: 1.0 X-Mailer: mPOP Web-Mail 2.19 X-Originating-IP: [83.237.110.222] Date: Tue, 07 Mar 2006 01:29:31 +0300 In-Reply-To: <200603061353.13756.jhb@freebsd.org> Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 8bit Message-Id: Cc: freebsd-arch@freebsd.org Subject: Re[2]: wakeup idea... X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: dima <_pppp@mail.ru> List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 06 Mar 2006 22:29:34 -0000 >>>> Here is a possibly stupid idea. >>>> >>>> Historically sleep/wakeup have happened on a pointer which was just a magic >>>> number. >>>> >>>> In many cases, this pointer is actually a relevant datastructure. >>>> >>>> Would it possibly be an optimization to make a variant of the sleep/wakeup >>>> calls where the pointer points to an integer type which contains non-zero if >>>> anybody is actually sleeping on that address ? >>>> >>>> Anybody up for a quick prototype ? >>> >>> In principle this is part of the point of a condition variable, which >>> associates a struct with waitable conditions, and includes an int that >>> contains the number of waiters: >>> >>> struct cv { >>> const char *cv_description; >>> int cv_waiters; >>> }; >>> >>> Presumably the tricky bit is optimizing this such that you avoid undesirable >>> races. (But maybe if you call cv_signal without the condition mutex held, you >>> accept that race by definition?) >> >> I'm working on a new implementation of SX locks: http://wikitest.freebsd.org/moin.cgi/Usable_20lock_20implementation_20with_20SX_2dsemantics >> Direct handoff (as described in "Solaris Internals") seems to do the trick. You just don't need any additional mutex for wakeup process since the current owner chooses who and when to wakeup. >> Since CV-s and SX-es have very similar semantics I describe my ideas here. We would need 2 list-like datastructures: >> 1. Current holders (for proper priority propagation) >> 2. Current waiters (for direct handoff) >> The first one can be implemented as an array; the reason of such a design decision is to avoid locking at SX acquire. The second list needs to be implemented as a (sorted by priority) queue (TAILQ for now) which needs a mutex for consistency. > > You might want to look at src/sys/kern/kern_rwlock.c in CVS HEAD. Sorry my ignorance. Gleb Smirnoff pointed that out privately already. I looked at the code and it seems very much like OpenSolaris implementation to me. You can't propagate priority properly if you don't hold all the current lock holders somewhere. I tried to implement it as an array in my effort. As Attilio Rao mentioned, this allows only fixed amount of concurrent readers at the time (which is not quite right, but can be implemented as a per-lock sysctl tuneable). I chose an array to prevent introducing one more mutex to the SX lock on the "short path". The code might look as the following: u_int32_t old_flags = atomic_load_acq_32( &sx->flags ); if( atomic_cmpset_acq_32( &sx->flags, old_flags, old_flags + SX_FLAG_READERS_IN ) ) { for( i = 0; i < SX_MAX_CONCURRENT_HOLDERS; ++i ) { if( sx->holders[i]->thr == curthread ) { atomic_add_acq_int( &sx->holders[i]->count, 1 ); ++count; break; } else { if( !sx->holders[i]->thr ) { atomic_cmpset_acq_ptr( ( unsigned * )&sx->holders[i]->thr, 0, ( unsigned )holder ); atomic_set_acq_int( &sx->holders[i]->count, 1 ); ++count; break; } } } if( i >= SX_MAX_CONCURRENT_HOLDERS ) { atomic_subtract_rel_32( &sx->flags, SX_FLAG_READERS_IN ); } } This code is buggy, though. The array consisted of u_int32_t values first, but it is hard to handle the recursive case this way. Introducing a structure makes it impossible to handle the array atomically. I'm still stuck at this point and some other issues (removing the locks on thread_exit is the largest one among them). BTW atomic_*_ptr doesn't work as expected. I need to convert the values to "unsigned *" to make gcc happy with "-Wall -pedantic" flags.