From: John Baldwin
To: Attilio Rao
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Date: Mon, 14 Dec 2009 14:18:33 -0500
Subject: Re: svn commit: r200447 - in head: share/man/man9 sys/kern sys/sys
Message-Id: <200912141418.33255.jhb@freebsd.org>
In-Reply-To: <3bbf2fe10912140902m407fa766q3a5e5bb6993723f9@mail.gmail.com>

On Monday 14 December 2009 12:02:54 pm Attilio Rao wrote:
> 2009/12/14 John Baldwin:
> > On Saturday 12 December 2009 4:31:07 pm Attilio Rao wrote:
> >> Author: attilio
> >> Date: Sat Dec 12 21:31:07 2009
> >> New Revision: 200447
> >> URL: http://svn.freebsd.org/changeset/base/200447
> >>
> >> Log:
> >> In current code, threads performing an interruptible sleep (on both
> >> sxlock, via the sx_{s, x}lock_sig() interface, or plain lockmgr), will
> >> leave the waiters flag on forcing the owner to do a wakeup even when if
> >> the waiter queue is empty.
> >> That operation may lead to a deadlock in the case of doing a fake wakeup
> >> on the "preferred" (based on the wakeup algorithm) queue while the other
> >> queue has real waiters on it, because nobody is going to wakeup the 2nd
> >> queue waiters and they will sleep indefinitively.
> >>
> >> A similar bug, is present, for lockmgr in the case the waiters are
> >> sleeping with LK_SLEEPFAIL on. In this case, even if the waiters queue
> >> is not empty, the waiters won't progress after being awake but they will
> >> just fail, still not taking care of the 2nd queue waiters (as instead the
> >> lock owned doing the wakeup would expect).
> >>
> >> In order to fix this bug in a cheap way (without adding too much locking
> >> and complicating too much the semantic) add a sleepqueue interface which
> >> does report the actual number of waiters on a specified queue of a
> >> waitchannel (sleepq_sleepcnt()) and use it in order to determine if the
> >> exclusive waiters (or shared waiters) are actually present on the lockmgr
> >> (or sx) before to give them precedence in the wakeup algorithm.
> >> This fix alone, however doesn't solve the LK_SLEEPFAIL bug. In order to
> >> cope with it, add the tracking of how many exclusive LK_SLEEPFAIL waiters
> >> a lockmgr has and if all the waiters on the exclusive waiters queue are
> >> LK_SLEEPFAIL just wake both queues.
> >>
> >> The sleepq_sleepcnt() introduction and ABI breakage require
> >> __FreeBSD_version bumping.
> >
> > Hmm, do you need an actual count of waiters or would a 'sleepq_empty()'
> > (similar to turnstile_empty()) method be sufficient?
>
> I need the count in order to fix properly LK_SLEEPFAIL case (the idea
> is: track exclusive waiters with LK_SLEEPFAIL on; if the number is
> equal to the actual sleepers on the queue then wake up both queues,
> otherwise nobody is going to take care of the shared waiters queue).

Ok.

-- 
John Baldwin
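
For readers following the thread, the wakeup decision described in the commit log and in Attilio's reply can be sketched roughly as below. This is a toy userland model and not the code from r200447: struct toy_lock, enum wake_queues and pick_wakeup_queues() are hypothetical names invented for illustration, the sleeper counts stand in for what sleepq_sleepcnt() would report, and it assumes the exclusive queue is the "preferred" one in the wakeup algorithm.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags naming the queues a lock owner decides to wake. */
enum wake_queues {
	WAKE_NONE      = 0,
	WAKE_EXCLUSIVE = 1 << 0,
	WAKE_SHARED    = 1 << 1
};

/* Toy stand-in for the lock state; not the kernel's struct lock. */
struct toy_lock {
	bool excl_waiters_flag;   /* may be stale after an interrupted sleep */
	bool shared_waiters_flag; /* may be stale after an interrupted sleep */
	int  excl_sleepers;       /* stands in for sleepq_sleepcnt() on the exclusive queue */
	int  shared_sleepers;     /* stands in for sleepq_sleepcnt() on the shared queue */
	int  excl_sleepfail;      /* tracked number of exclusive LK_SLEEPFAIL waiters */
};

/*
 * Decide which queues to wake on release.  A waiters flag only counts
 * if the queue really has sleepers, and if every exclusive sleeper is
 * an LK_SLEEPFAIL one, the shared queue is woken as well.
 */
static int
pick_wakeup_queues(const struct toy_lock *lk)
{
	bool real_excl, real_shared;

	assert(lk->excl_sleepfail >= 0 &&
	    lk->excl_sleepfail <= lk->excl_sleepers);

	real_excl = lk->excl_waiters_flag && lk->excl_sleepers > 0;
	real_shared = lk->shared_waiters_flag && lk->shared_sleepers > 0;

	/* Stale exclusive flag: do not do a fake wakeup, serve the other queue. */
	if (!real_excl)
		return (real_shared ? WAKE_SHARED : WAKE_NONE);

	/* All exclusive sleepers will just fail, so nobody would wake the shared queue. */
	if (lk->excl_sleepfail == lk->excl_sleepers)
		return (WAKE_EXCLUSIVE | (real_shared ? WAKE_SHARED : WAKE_NONE));

	/* At least one exclusive sleeper will take the lock and wake others later. */
	return (WAKE_EXCLUSIVE);
}
```

The second branch is why a boolean sleepq_empty() would not be enough in this model: the comparison needs the actual number of sleepers on the exclusive queue, not just whether the queue is non-empty.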