Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 07 Mar 2006 01:29:31 +0300
From:      dima <_pppp@mail.ru>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-arch@freebsd.org
Subject:   Re[2]: wakeup idea...
Message-ID:  <E1FGOCp-000BEU-00._pppp-mail-ru@f52.mail.ru>
In-Reply-To: <200603061353.13756.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
>>>> Here is a possibly stupid idea.
>>>>
>>>> Historically sleep/wakeup have happened on a pointer which was just a magic 
>>>> number.
>>>>
>>>> In many cases, this pointer is actually a relevant datastructure.
>>>>
>>>> Would it possibly be an optimization to make a variant of the sleep/wakeup 
>>>> calls where the pointer points to an integer type which contains non-zero if 
>>>> anybody is actually sleeping on that address ?
>>>>
>>>> Anybody up for a quick prototype ?
>>> 
>>> In principle this is part of the point of a condition variable, which 
>>> associates a struct with waitable conditions, and includes an int that 
>>> contains the number of waiters:
>>> 
>>> struct cv {
>>>          const char      *cv_description;
>>>          int             cv_waiters;
>>> };
>>> 
>>> Presumably the tricky bit is optimizing this such that you avoid undesirable 
>>> races.  (But maybe if you call cv_signal without the condition mutex held, you 
>>> accept that race by definition?)
>> 
>> I'm working on a new implementation of SX locks: http://wikitest.freebsd.org/moin.cgi/Usable_20lock_20implementation_20with_20SX_2dsemantics
>> Direct handoff (as described in "Solaris Internals") seems to do the trick. You just don't need any additional mutex for wakeup process since the current owner chooses who and when to wakeup.
>> Since CV-s and SX-es have very similar semantics I describe my ideas here. We would need 2 list-like datastructures:
>> 1. Current holders (for proper priority propagation)
>> 2. Current waiters (for direct handoff)
>> The first one can be implemented as an array; the reason of such a design decision is to avoid locking at SX acquire. The second list needs to be implemented as a (sorted by priority) queue (TAILQ for now) which needs a mutex for consistency.
> 
> You might want to look at src/sys/kern/kern_rwlock.c in CVS HEAD.

Sorry my ignorance. Gleb Smirnoff pointed that out privately already. I looked at the code and it seems very much like OpenSolaris implementation to me. You can't propagate priority properly if you don't hold all the current lock holders somewhere. I tried to implement it as an array in my effort. As Attilio Rao mentioned, this allows only fixed amount of concurrent readers at the time (which is not quite right, but can be implemented as a per-lock sysctl tuneable). I chose an array to prevent introducing one more mutex to the SX lock on the "short path". The code might look as the following:

  u_int32_t old_flags = atomic_load_acq_32( &sx->flags );
  if( atomic_cmpset_acq_32( &sx->flags, old_flags,
			    old_flags + SX_FLAG_READERS_IN ) ) {
    for( i = 0; i < SX_MAX_CONCURRENT_HOLDERS; ++i ) {
      if( sx->holders[i]->thr == curthread ) {
	atomic_add_acq_int( &sx->holders[i]->count, 1 );
	++count;
	break;
      }
      else {
	if( !sx->holders[i]->thr ) {
	  atomic_cmpset_acq_ptr( ( unsigned * )&sx->holders[i]->thr,
				 0, ( unsigned )holder );
	  atomic_set_acq_int( &sx->holders[i]->count, 1 );
	  ++count;
	  break;
	}
      }
    }
    if( i >= SX_MAX_CONCURRENT_HOLDERS ) {
      atomic_subtract_rel_32( &sx->flags, SX_FLAG_READERS_IN );
    }
  }

This code is buggy, though. The array consisted of u_int32_t values first, but it is hard to handle the recursive case this way. Introducing a structure makes it impossible to handle the array atomically. I'm still stuck at this point and some other issues (removing the locks on thread_exit is the largest one among them).

BTW atomic_*_ptr doesn't work as expected. I need to convert the values to "unsigned *" to make gcc happy with "-Wall -pedantic" flags.




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?E1FGOCp-000BEU-00._pppp-mail-ru>