Date: Mon, 24 Sep 2007 11:52:41 -0400
From: John Baldwin <jhb@freebsd.org>
To: "Attilio Rao" <attilio@freebsd.org>
Cc: freebsd-smp@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: rwlocks: poor performance with adaptive spinning
Message-ID: <200709241152.41660.jhb@freebsd.org>
In-Reply-To: <3bbf2fe10709221932i386f65b9h6f47ab4bee08c528@mail.gmail.com>
References: <3bbf2fe10709221932i386f65b9h6f47ab4bee08c528@mail.gmail.com>

On Saturday 22 September 2007 10:32:06 pm Attilio Rao wrote:
> Recently several people have reported problems of starvation with rwlocks.
> In particular, users who tried to use rwlocks on big SMP environments
> (16+ CPUs) found them rather subject to poor performance and to
> starvation of waiters.
>
> Inspecting the code, something strange about adaptive spinning popped
> up: basically, for rwlocks, adaptive spinning stubs seem to be
> customed too down in the decisioning-loop.
> The disposition of the stub will let the thread that would adaptively
> spin set the respective (either read or write) waiters flag on,
> which means that the owner of the lock will go down in the hard path
> of locking functions and will perform a full wakeup even if the
> waiters queues can result empty. This is a big penalty for adaptive
> spinning which can make it completely useless.
> In addition to this, adaptive spinning only runs in the turnstile
> spinlock path which is not ideal.
> This patch ports the approach already used for adaptive spinning in sx
> locks to rwlocks:
> http://users.gufi.org/~rookie/works/patches/kern_rwlock.diff
>
> In sx it is unlikely to see big benefits because they are held for too
> long times, but for rwlocks the situation is rather different.
> I would like to see if people can do benchmarks with this patch (maybe
> in private environments?) as I'm not able to do them in short times.
>
> Adaptive spinning in rwlocks can be improved further with other tricks
> (like adding a backoff counter, for example, or trying to spin with
> the lock held in read mode too), but we first should be sure to start
> with a solid base.

I did this for mutexes and rwlocks over a year ago and Kris found it was
slower in benchmarks. www.freebsd.org/~jhb/patches/lock_adapt.patch is the
last thing I sent kris@ to test (it only has the mutex changes).
This might work better post-thread_lock, since thread_lock seems to have
heavily pessimized adaptive spinning: it now enqueues the thread and then
dequeues it again before doing the adaptive spin. I liked the approach
originally because it simplifies the code a lot.

A separate issue is that writers don't spin at all if a reader holds the
lock, and I think one thing to test for that would be an adaptive spin with
a static timeout.

-- 
John Baldwin
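
To make the last idea concrete, here is a minimal user-space sketch of a writer that spins for a fixed budget while readers hold the lock and then gives up. It is not FreeBSD kernel code and not taken from either patch: the lock word layout, the names sketch_rwlock and sketch_wlock_spin, and the SPIN_BUDGET value are all assumptions made up for illustration, and the "static timeout" is modelled simply as an iteration count.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word: bit 0 = write-locked, upper bits = reader count. */
#define RW_UNLOCKED    ((uintptr_t)0)
#define RW_WRITER      ((uintptr_t)1)
#define RW_ONE_READER  ((uintptr_t)2)

/* Assumed static timeout, expressed as a spin iteration budget. */
#define SPIN_BUDGET    10000

struct sketch_rwlock {
	_Atomic uintptr_t word;
};

/*
 * Try to take the write lock, spinning for at most SPIN_BUDGET
 * iterations while the lock is held (by readers or a writer).
 * Returns true if the lock was acquired.  Returns false when the
 * budget runs out; a real lock would then set the waiters flag
 * and block instead of spinning further.
 */
static bool
sketch_wlock_spin(struct sketch_rwlock *rw)
{
	for (int i = 0; i < SPIN_BUDGET; i++) {
		uintptr_t expected = RW_UNLOCKED;

		if (atomic_compare_exchange_strong_explicit(&rw->word,
		    &expected, RW_WRITER,
		    memory_order_acquire, memory_order_relaxed))
			return (true);
		/* A real implementation would issue a CPU pause hint here. */
	}
	return (false);
}
```

The point of the fixed budget is that a spinning writer cannot tell whether the readers are running, since a read-locked rwlock has no single owner to inspect, so the spin has to be bounded by something other than the owner's run state.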