From owner-freebsd-current@FreeBSD.ORG  Tue Feb  2 20:07:41 2010
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 21724106566B
	for <freebsd-current@freebsd.org>; Tue,  2 Feb 2010 20:07:41 +0000 (UTC)
	(envelope-from justin.teller@gmail.com)
Received: from mail-qy0-f176.google.com (mail-qy0-f176.google.com
	[209.85.221.176])
	by mx1.freebsd.org (Postfix) with ESMTP id BE3B58FC1D
	for <freebsd-current@freebsd.org>; Tue,  2 Feb 2010 20:07:40 +0000 (UTC)
Received: by qyk6 with SMTP id 6so372163qyk.3
	for <multiple recipients>; Tue, 02 Feb 2010 12:07:39 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:date:message-id:subject
	:from:to:cc:content-type;
	bh=rBUqU/TDwWYTOaN05pSAZrqpoccVvrCEWUdiR8EPXXE=;
	b=sDkO55FhhGJLGRBqWJaipc3Y9ln2VQR3oRbdr9VAqacggjoAEOIxhWY+8C2S00pVfu
	OEqHUL7Kel1OVpdghTPBJTsUKhP0wB+KJkssHF9m3FF1vco4zGDx0nmhhRfvAMDqEtxA
	PUfEq7PQHqSxUanNHRSGKswrpHXoLvWTEmXz0=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:date:message-id:subject:from:to:cc:content-type;
	b=JGPozDOpPGkyVEo2GYm8JvXrXqeCBBSvAHRX+JR3puikSOnmLR7gQK1OumdT5e+M1b
	iamla3118h42GP23c0f35LxMuhhACgr0A2frDpkoqpToUa+SCoq1BGzXf6yuOY3tzoLt
	CstMWsiqXrtyOLSB2FvfqE8piXV5ZpneqE/QU=
MIME-Version: 1.0
Received: by 10.224.72.228 with SMTP id n36mr3072731qaj.138.1265141258980; 
	Tue, 02 Feb 2010 12:07:38 -0800 (PST)
Date: Tue, 2 Feb 2010 12:07:38 -0800
Message-ID: <c47019cc1002021207rc9ca100ga0dd865febdb0014@mail.gmail.com>
From: Justin Teller <justin.teller@gmail.com>
To: freebsd-current@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
Cc: David Xu <davidxu@freebsd.org>
Subject: Bug in kern_umtx.c -- read-write locks
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 02 Feb 2010 20:07:41 -0000

I was working on a highly threaded app (125+ threads) that was using
the pthread rw locks, and we were stalling at strange times.  After a
lot of debugging in our app, we found that a call to
pthread_rwlock_wrlock() would sometimes never return -- it seemed like
a wakeup was lost.  After we convinced ourselves the bug wasn't in the
app's locking code, I started digging into the kernel.  I found that
there is an issue where a wakeup can be "lost" when a thread goes to
sleep calling pthread_rwlock_wrlock.  The issue is in the file
kern_umtx.c in the function do_rw_wrlock(): the code busies the lock
before sleeping, but when it tries to set the waiters bit, it's
looking at at old value (from the "try-lock" just before the busy).
This allows a race where a thread can go to sleep w/o setting the
waiters bit.  Then the last thread to unlock won't wakeup the sleeping
thread.  The patch below (based off of 8.0 release) fixes my problem
for the write lock and should fix the complimentary issue in
do_rw_rdlock.
--- kern_umtx.c	2010-02-01 13:03:24.000000000 -0800
+++ /kernel_src_8.0/usr/src/sys/kern/kern_umtx.c	2010-02-01
09:52:18.000000000 -0800
@@ -2461,10 +2461,15 @@
 		/* grab monitor lock */
 		umtxq_lock(&uq->uq_key);
 		umtxq_busy(&uq->uq_key);
 		umtxq_unlock(&uq->uq_key);

+		/* re-read the state, in case it changed between the try-lock above
+		 * and the check below
+		 */
+		state = fuword32(__DEVOLATILE(int32_t *, &rwlock->rw_state));
+
 		/* set read contention bit */
 		while ((state & wrflags) && !(state & URWLOCK_READ_WAITERS)) {
 			oldstate = casuword32(&rwlock->rw_state, state, state |
URWLOCK_READ_WAITERS);
 			if (oldstate == state)
 				goto sleep;
@@ -2592,10 +2597,15 @@

 		/* grab monitor lock */
 		umtxq_lock(&uq->uq_key);
 		umtxq_busy(&uq->uq_key);
 		umtxq_unlock(&uq->uq_key);
+
+		/* re-read the state, in case it changed between the try-lock above
+		 * and the check below
+		 */
+		state = fuword32(__DEVOLATILE(int32_t *, &rwlock->rw_state));

 		while (((state & URWLOCK_WRITE_OWNER) || URWLOCK_READER_COUNT(state) != 0) &&
 		       (state & URWLOCK_WRITE_WAITERS) == 0) {
 			oldstate = casuword32(&rwlock->rw_state, state, state |
URWLOCK_WRITE_WAITERS);
 			if (oldstate == state)


Looking thru the open PRs, it looks like it might be the cause of
#135673, but I'm not completely sure.  Also, I'm pretty new to
contributing to the FreeBSD community -- what's the common practice of
responding to a PR?

Thanks
-Justin