From owner-freebsd-current@FreeBSD.ORG Wed Aug 26 23:38:14 2009 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4C0E61065690 for ; Wed, 26 Aug 2009 23:38:14 +0000 (UTC) (envelope-from brampton@gmail.com) Received: from mail-ew0-f209.google.com (mail-ew0-f209.google.com [209.85.219.209]) by mx1.freebsd.org (Postfix) with ESMTP id A82178FC19 for ; Wed, 26 Aug 2009 23:38:13 +0000 (UTC) Received: by ewy5 with SMTP id 5so722065ewy.36 for ; Wed, 26 Aug 2009 16:38:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=tYsyhy6NlxTWOGA21FDewlG+s8sgsWhHCbTwY6dKHeE=; b=sLkTbgmhiqBe6Q2E7SL6mgJMXkP5HNnwbjB//zOd5TeOjEhVueoTYO4R55pvZctjbN 7xeiEzN86c98H2bgWQZ1tbZhuE9TbCTjsjcNoWuXWwbmulGOLilPbn+EKIhY/sePrEnL 4AxZ2GAe1xYuka/Qh25K3NM8qWPi3sbXHSPd8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=Rde7R0MAxGfkxfy7xcXtVKqQ7CFDjc7ELhfSVekCh5L3iWeDiBZRPkm5k7HZ68rkph jR3nJpglbcckLUkPacLCAvkmLiXRr/aCyFBbGJNs3dNvnE4W/24MDmOK96Kb9a+j6Ap5 8eHM1IMvA2ra16wO3tEQ7B7nvUXBHajlk7QuI= MIME-Version: 1.0 Sender: brampton@gmail.com Received: by 10.216.88.195 with SMTP id a45mr1773409wef.63.1251329892177; Wed, 26 Aug 2009 16:38:12 -0700 (PDT) In-Reply-To: <3bbf2fe10908261557i286f5ec5q58eb4fec1abbfc7b@mail.gmail.com> References: <3bbf2fe10908261557i286f5ec5q58eb4fec1abbfc7b@mail.gmail.com> Date: Thu, 27 Aug 2009 00:38:12 +0100 X-Google-Sender-Auth: df6b118631b6d975 Message-ID: From: Andrew Brampton To: Attilio Rao Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-current@freebsd.org Subject: Re: [patch] rw_try_wlock does not set recursive bit when recursing X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Aug 2009 23:38:14 -0000 2009/8/26 Attilio Rao : > 2009/8/27 Andrew Brampton : >> Hi, >> The following sequence of commands fails on line 4 with an assertion >> that the lock is not currently held: >> >> 1: rw_wlock(&rw); >> 2: if ( rw_try_wlock(&rw) ) >> 3: =C2=A0 =C2=A0rw_wunlock(&rw); >> 4: rw_wunlock(&rw); >> >> This is because after line 3 is executed the rw lock is no longer >> held. I tracked this bug down to _rw_try_wlock which correctly >> increments rw_recurse, but does not set the RW_LOCK_RECURSED bit. >> Without this bit the third line unlocks the lock and leaves it in a >> unlocked state (when it should still be locked). Adding a line to set >> this bit makes _rw_try_wlock match the code in _rw_wlock_hard. > > Sorry, but I really don't understand how that can be a bug. > On STABLE_7, RW_LOCK_RECURSED is not used for checking if the lock is > recursed or not. > it just got set for improving debugging and eventually we decided to > drop it for 8.0. > > However, for deciding if a lock is recursed or not in both STABLE_7 > and HEAD we used just checking against the recursion count which is > correctly handled by the function. > > What you describe can't be the problem. > > Attilio > Ok, so I have had a better look at the code in CURRENT, and compared it to the code in STABLE_7. I apologise but I think I mixed up my sources somewhere and the problem does not appear in CURRENT. The problem does however occur in STABLE_7, and I can explain that below. 1: rw_wlock(&rw); 2: if ( rw_try_wlock(&rw) ) 3: rw_wunlock(&rw); 4: rw_wunlock(&rw); Line 1, _rw_wlock gets called which calls __rw_wlock. This in turn calls _rw_write_lock which changes rw->rw_lock to tid Line 2, _rw_try_wlock gets called, which check if the lock is already held (which it is), if so it then rw->rw_recurse++ Line 3, _rw_wunlock gets called which calls __rw_wunlock, which then calls _rw_write_unlock. Now _rw_write_unlock trys to unlock the lock by checking if rw->rw_lock is tid, if so it sets the lock to RW_UNLOCKED. This is where the problem occurs, there is no check on rx->rw_recurse. Now, if we used this code instead: 1: rw_wlock(&rw); 2: if ( rw_try_wlock(&rw) ) 3: rw_wunlock(&rw); 4: rw_wunlock(&rw); The order goes: Line 1, _rw_wlock gets called which calls __rw_wlock. This in turn calls _rw_write_lock which changes rw->rw_lock to tid Line 2, _rw_wlock gets called which calls __rw_wlock. __rw_wlock checks if rw->rw_lock is RW_UNLOCKED, otherwise it ends up calling _rw_wlock_hard. Inside _rw_wlock_hard it checks if the lock is already held, if so it increments rw_recurse, and changes rw->rw_lock to (rw->rw_lock | RW_LOCK_RECURSED) by calling atomic_set_ptr. Line 3, _rw_wunlock gets called which calls __rw_wunlock, which then calls _rw_write_unlock. Now _rw_write_unlock trys to unlock the lock by checking if rw->rw_lock is tid. This is where things differ from the previous code, rw->rw_lock is NOT tid, it is now (tid | RW_LOCK_RECURSED), and thus the code drops into _rw_wunlock_hard which decrements rx->rw_recurse and removes the RW_LOCK_RECURSED flag if rw_recurse equals zero. Hopefully you see how not setting the RW_LOCK_RECURSED flag causes a problem in rw_wunlock. So this can be fixed by either back porting the CURRENT code to STABLE_7, or using the one line fix in my patch. As I already stated, I am using my patch and the sample code I have given no longer panics. Sorry for the confusion Andrew