From owner-cvs-src-old@FreeBSD.ORG Sat Dec 12 21:34:45 2009
Message-Id: <200912122134.nBCLYjLn059188@repoman.freebsd.org>
From: Attilio Rao
Date: Sat, 12 Dec 2009 21:31:07 +0000 (UTC)
To: cvs-src-old@freebsd.org
X-FreeBSD-CVS-Branch: HEAD
Subject: cvs commit: src/share/man/man9 sleepqueue.9 src/sys/kern kern_lock.c kern_sx.c subr_sleepqueue.c src/sys/sys _lockmgr.h param.h sleepqueue.h
List-Id: **OBSOLETE** CVS commit messages for the src tree

attilio     2009-12-12 21:31:07 UTC

  FreeBSD src repository

  Modified files:
    share/man/man9       sleepqueue.9
    sys/kern             kern_lock.c kern_sx.c subr_sleepqueue.c
    sys/sys              _lockmgr.h param.h sleepqueue.h
  Log:
  SVN rev 200447 on 2009-12-12 21:31:07Z by attilio

  In the current code, threads performing an interruptible sleep (on either
  an sx lock, via the sx_{s, x}lock_sig() interface, or a plain lockmgr) will leave
  the waiters flag on, forcing the owner to do a wakeup even if the
  waiter queue is empty. That operation may lead to a deadlock when a
  fake wakeup is done on the "preferred" (based on the wakeup algorithm)
  queue while the other queue has real waiters on it, because nobody is
  going to wake up the 2nd queue's waiters and they will sleep indefinitely.

  A similar bug is present for lockmgr when the waiters are sleeping with
  LK_SLEEPFAIL on. In this case, even if the waiters queue is not empty,
  the waiters won't make progress after being awakened; they will just
  fail, still not taking care of the 2nd queue's waiters (as the lock
  owner doing the wakeup would instead expect).

  In order to fix this bug cheaply (without adding too much locking or
  complicating the semantics too much), add a sleepqueue interface which
  reports the actual number of waiters on a specified queue of a
  wait channel (sleepq_sleepcnt()) and use it to determine whether the
  exclusive waiters (or shared waiters) are actually present on the
  lockmgr (or sx) before giving them precedence in the wakeup algorithm.

  This fix alone, however, doesn't solve the LK_SLEEPFAIL bug. To cope
  with it, track how many exclusive LK_SLEEPFAIL waiters a lockmgr has
  and, if all the waiters on the exclusive waiters queue are LK_SLEEPFAIL,
  just wake both queues.

  The sleepq_sleepcnt() introduction and the ABI breakage require a
  __FreeBSD_version bump.

  Reported by:    avg, kib, pho
  Reviewed by:    kib
  Tested by:      pho

  Revision  Changes    Path
  1.18      +12 -1     src/share/man/man9/sleepqueue.9
  1.151     +92 -13    src/sys/kern/kern_lock.c
  1.72      +5 -1      src/sys/kern/kern_sx.c
  1.65      +31 -4     src/sys/kern/subr_sleepqueue.c
  1.3       +1 -0      src/sys/sys/_lockmgr.h
  1.443     +1 -1      src/sys/sys/param.h
  1.18      +1 -0      src/sys/sys/sleepqueue.h
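
  The wakeup decision described in the log can be illustrated with a small
  user-space model. This is only a sketch under stated assumptions, not the
  code from kern_lock.c or kern_sx.c: struct toy_lock, the TOY_* constants
  and the toy_* functions are hypothetical names, and toy_sleepcnt() merely
  stands in for a sleepq_sleepcnt()-style query. Each function returns a
  bitmask of the queues the lock owner would wake.

```c
#include <assert.h>

/* Hypothetical queue indices and flag bits; not the kernel's definitions. */
enum { TOY_Q_EXCL = 0, TOY_Q_SHARED = 1, TOY_NQUEUES = 2 };
enum { TOY_EXCL_WAITERS = 0x1, TOY_SHARED_WAITERS = 0x2 };

struct toy_lock {
	unsigned flags;                 /* waiters flags, possibly stale */
	unsigned sleepers[TOY_NQUEUES]; /* threads really asleep per queue */
	unsigned excl_sleepfail;        /* exclusive waiters that will just fail */
};

/* Stand-in for a sleepq_sleepcnt()-style query on one queue. */
static unsigned
toy_sleepcnt(const struct toy_lock *lk, int queue)
{
	assert(queue >= 0 && queue < TOY_NQUEUES);
	return (lk->sleepers[queue]);
}

/* Old behaviour: trust the waiters flag alone when picking a queue. */
static unsigned
toy_pick_old(const struct toy_lock *lk)
{
	if (lk->flags & TOY_EXCL_WAITERS)
		return (1u << TOY_Q_EXCL);
	if (lk->flags & TOY_SHARED_WAITERS)
		return (1u << TOY_Q_SHARED);
	return (0);
}

/* New behaviour: prefer the exclusive queue only if it has real sleepers. */
static unsigned
toy_pick_new(const struct toy_lock *lk)
{
	unsigned excl = toy_sleepcnt(lk, TOY_Q_EXCL);
	unsigned wake = 0;

	if ((lk->flags & TOY_EXCL_WAITERS) != 0 && excl != 0) {
		wake |= 1u << TOY_Q_EXCL;
		/*
		 * If every exclusive sleeper will fail instead of taking
		 * the lock, wake the shared queue as well.
		 */
		if (lk->excl_sleepfail == excl)
			wake |= 1u << TOY_Q_SHARED;
		return (wake);
	}
	if (lk->flags & TOY_SHARED_WAITERS)
		wake |= 1u << TOY_Q_SHARED;
	return (wake);
}
```

  With a stale exclusive flag, an empty exclusive queue and two shared
  sleepers, toy_pick_old() selects only the empty exclusive queue (the
  "fake wakeup" case), while toy_pick_new() falls through to the shared
  queue. When all exclusive sleepers are counted in excl_sleepfail,
  toy_pick_new() selects both queues.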