From: Garrett Cooper
To: David Xu
Cc: freebsd-current@freebsd.org, Justin Teller
Date: Tue, 2 Feb 2010 21:16:29 -0800
Subject: Re: Bug in kern_umtx.c -- read-write locks
Message-ID: <7d6fde3d1002022116v28d50f75me27e7208619e2a3c@mail.gmail.com>
In-Reply-To: <4B68F5EE.9060606@freebsd.org>
References: <4B68F5EE.9060606@freebsd.org>

On Tue, Feb 2, 2010 at 8:05 PM, David Xu wrote:
> Justin Teller wrote:
>>
>> I was working on a highly threaded app (125+ threads) that was using
>> the pthread rw locks, and we were stalling at strange times. After a
>> lot of debugging in our app, we found that a call to
>> pthread_rwlock_wrlock() would sometimes never return -- it seemed like
>> a wakeup was lost. After we convinced ourselves the bug wasn't in the
>> app's locking code, I started digging into the kernel. I found that
>> there is an issue where a wakeup can be "lost" when a thread goes to
>> sleep calling pthread_rwlock_wrlock(). The issue is in the file
>> kern_umtx.c in the function do_rw_wrlock(): the code busies the lock
>> before sleeping, but when it tries to set the waiters bit, it's
>> looking at an old value (from the "try-lock" just before the busy).
>> This allows a race where a thread can go to sleep w/o setting the
>> waiters bit. Then the last thread to unlock won't wake up the sleeping
>> thread. The patch below (based off of 8.0 release) fixes my problem
>> for the write lock and should fix the complementary issue in
>> do_rw_rdlock().
>>
>>
>
> Committed, thanks!

This might be the reason why the pthreaded application I was working
on was crashing when I had it spawn more than 100 threads (I tried 2k
and 20k simple, short-lived threads that used a basic mutex, and it
got into some deadlock state and bombed)... I'll see whether or not
this fixes my issue as well (but FWIW Linux did badly when I ran the
pthreaded app too, and was busting up all over the place)...

Thanks!
-Garrett
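
P.S. For anyone skimming the thread, here is a rough userland sketch of
the kind of race Justin describes: deciding whether to set the waiters
bit from a snapshot taken before the queue was busied. To be clear, this
is not the kern_umtx.c code and not the committed patch. The DEMO_*
flag values and the demo_* functions are made up for illustration, and
the interleaving replayed in demo_replay() is only one plausible
reading of the description above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout for illustration; not the real URWLOCK_* values. */
#define DEMO_WRITE_OWNER   0x80000000u
#define DEMO_WRITE_WAITERS 0x40000000u
#define DEMO_READER_MASK   0x3fffffffu

/*
 * Racy variant: trusts a snapshot taken during an earlier failed try-lock.
 * If the snapshot already shows the waiters bit, nothing is written, even
 * though the bit may have been cleared in the meantime.
 */
static void demo_mark_waiter_stale(_Atomic uint32_t *state, uint32_t snapshot)
{
    if ((snapshot & DEMO_WRITE_WAITERS) == 0)
        atomic_fetch_or(state, DEMO_WRITE_WAITERS);
}

/*
 * Safer variant: re-read the current state before deciding.
 * Returns true if the caller may sleep (waiters bit is set),
 * false if the lock became free and the caller should retry instead.
 */
static bool demo_mark_waiter_fresh(_Atomic uint32_t *state)
{
    uint32_t cur = atomic_load(state);

    for (;;) {
        if ((cur & DEMO_WRITE_OWNER) == 0 && (cur & DEMO_READER_MASK) == 0)
            return false;               /* lock is free: do not sleep */
        if (cur & DEMO_WRITE_WAITERS)
            return true;                /* bit is really set right now */
        /* On failure the CAS reloads cur with the current value. */
        if (atomic_compare_exchange_weak(state, &cur,
                                         cur | DEMO_WRITE_WAITERS))
            return true;
    }
}

/*
 * Replays one interleaving on a single thread. Returns true if the
 * would-be sleeper ends up asleep with the waiters bit clear, i.e. no
 * unlocker would know to wake it.
 */
static bool demo_replay(bool use_fresh)
{
    /* Lock is write-held and an earlier waiter has set the waiters bit. */
    _Atomic uint32_t state = DEMO_WRITE_OWNER | DEMO_WRITE_WAITERS;

    /* Our failed try-lock observes this state. */
    uint32_t snapshot = atomic_load(&state);
    assert(snapshot & DEMO_WRITE_WAITERS);

    /* Meanwhile the earlier waiter is woken and takes the lock; the
     * waiters bit is cleared because the wait queue looked empty. */
    atomic_store(&state, DEMO_WRITE_OWNER);

    bool will_sleep;
    if (use_fresh) {
        will_sleep = demo_mark_waiter_fresh(&state);
    } else {
        demo_mark_waiter_stale(&state, snapshot);
        will_sleep = true;              /* racy path sleeps unconditionally */
    }

    return will_sleep && (atomic_load(&state) & DEMO_WRITE_WAITERS) == 0;
}
```

Calling demo_replay(false) follows the stale-snapshot path and
demo_replay(true) follows the re-read path, so the two can be compared
side by side. A real reproduction would need actual threads and some
luck with timing; the replay just pins down one ordering so the
difference is visible.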