Date:      Sun, 22 Aug 2010 21:05:47 +0200
From:      Max Laier <max@love2party.net>
To:        Stephan Uphoff <ups@freebsd.org>
Cc:        freebsd-arch@freebsd.org
Subject:   Re: rmlock(9) two additions
Message-ID:  <201008222105.47276.max@love2party.net>
In-Reply-To: <4C7042BA.8000402@freebsd.org>
References:  <201008160515.21412.max@love2party.net> <4C7042BA.8000402@freebsd.org>

On Saturday 21 August 2010 23:18:50 Stephan Uphoff wrote:
> Max Laier wrote:
> > Hi,
> > 
> > I'd like to run two additions to rmlock(9) by you:
> > 
> > 1) See the attached patch.  It adds rm_try_rlock() which avoids taking
> > the mutex if the lock is currently in a write episode.  The only
> > overhead to the hot path is an additional argument and a return value
> > for _rm_rlock*.  If you are worried about that, it can obviously be done
> > in a separate code path, but I reckon it's not worth the code crunch. 
> > Finally, there is one additional branch to check the "trylock" argument,
> > but that's well past the hot path.
> 
> The rm_try_rlock() will never succeed when the lock was last locked as a
> writer.
> Maybe add:
> 
> void
> _rm_wunlock(struct rmlock *rm)
> {
> +   rm->rm_noreadtoken = 0;
>     mtx_unlock(&rm->rm_lock);
> }
> 
> But then
> 
> _rm_wlock(struct rmlock *rm)
> 
> always needs to use IPIs - even when the lock was used last as a write
> lock.

I don't think this is a big problem - I can't see many use cases for rmlocks 
where you'd routinely see repeated wlocks without rlocks between them.  
However, I think there should be a release memory barrier before/while 
clearing rm_noreadtoken, otherwise readers may not see the data writes that 
are supposed to be protected by the lock?!?

> Alternatively something like:
> 
> 	if (trylock) {
> 		if (mtx_trylock(&rm->rm_lock) == 0)
> 			return (0);
> 	} else
> 		mtx_lock(&rm->rm_lock);
> 
>   would work - but has a race: two readers colliding just after a writer
> (with the second failing to trylock the mutex) leads to the read lock not
> being granted (although it would be possible to do so).

Also not too much of a problem, in my opinion.  There is no time order between 
read locks and thus it is okay to grant one and fail the other - even though 
they arrive "at the same time".  A caller of trylock must always accept failure 
(unless it's a recursive use - and that case is handled).

> Let me think about it a bit.

I believe either solution will work.  #1 is a bit more in the spirit of the 
rmlock - i.e. make the read case cheap and the write case expensive.  I'm just 
not sure about the lock semantics.

I guess an

  atomic_store_rel_int(&rm->rm_noreadtoken, 0);

should work.

> > 2) No code for this yet - testing the waters first.  I'd like to add the
> > ability to replace the mutex for writer synchronization with a general
> > lock - especially to be able to use a sx(9) lock here.
> > 
> > The reason for #2 is the following use case in a debugging facility:
> > 	"reader":
> > 		if (rm_try_rlock()) {
> > 			grab_per_cpu_buffer();
> > 			fill_per_cpu_buffer();
> > 			rm_runlock();
> > 		}
> > 
> > 	"writer" - better: an exclusive-access thread:
> > 		rm_wlock();
> > 		collect_buffers_and_copy_out_to_userspace();
> > 		rm_wunlock();
> > 
> > This is much cleaner and possibly cheaper than the various hand rolled
> > versions I've come across, that try to get the same synchronization with
> > atomic operations.  If we could sleep with the wlock held, we can also
> > avoid copying the buffer contents, or swapping buffers.
> > 
> > Is there any concern about either of these?  Any objection?  Input?
> 
> Will take a look at your second patch soonish.
> 
> Just ask per IPI for a copy of the per-CPU buffers (but not a copy to user
> space) - and delay the copy when an update is in progress?

Think huge circular per-CPU buffers that are filled at high rates.  Of course 
we could allocate new buffers and swap them out while locked, but since this 
is a debugging facility it is better to miss a few events while copying out 
than to spend twice the memory.

Thanks,
  Max


