From: John Baldwin
To: src-committers@freebsd.org
Cc: cvs-src@freebsd.org, cvs-all@freebsd.org
Date: Fri, 25 Jan 2008 07:39:41 -0500
Subject: Re: cvs commit: src/sys/kern subr_sleepqueue.c
Message-Id: <200801250739.41413.jhb@freebsd.org>
In-Reply-To: <200801250209.m0P29cjL050767@repoman.freebsd.org>
References: <200801250209.m0P29cjL050767@repoman.freebsd.org>

On Thursday 24 January 2008 09:09:38 pm John Baldwin wrote:
> jhb 2008-01-25 02:09:38 UTC
>
> FreeBSD src repository
>
> Modified files:
> sys/kern subr_sleepqueue.c
> Log:
> Fix a race in the sleepqueue timeout code that resulted in sleeps not
> being properly cancelled by a timeout. In general there is a race
> between the sleepq timeout handler firing while the thread is still
> in the process of going to sleep. In 6.x with sched_lock, the race was
> largely protected by sched_lock.
> The only place it was "exposed" and had
> to be handled was while checking for any pending signals in
> sleepq_catch_signals().
>
> With the thread lock changes, the thread lock is dropped in between
> sleepq_add() and sleepq_*wait*() opening up a new window for this race.
> Thus, if the timeout fired while the sleeping thread was in between
> sleepq_add() and sleepq_*wait*(), the thread would be marked as timed
> out, but the thread would not be dequeued and sleepq_switch() would
> still block the thread until it was awakened via some other means. In
> the case of pause(9) where there is no other wakeup, the thread would
> never be awakened.
>
> Fix this by teaching sleepq_switch() to check if the thread has had its
> sleep canceled before blocking by checking the TDF_TIMEOUT flag and
> aborting the sleep and dequeueing the thread if it is set.
>
> MFC after: 3 days
> Reported by: dwhite, peter

This should fix the "vmo_de" hangs some people have reported on 7.x+.

-- 
John Baldwin
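
The window described in the commit log can be modelled in user space. What follows is a minimal sketch under stated assumptions, not the code in subr_sleepqueue.c: struct thread_model, the model_* functions, the with_fix switch, and the value given to TDF_TIMEOUT here are all hypothetical stand-ins. It only replays the ordering sleepq_add(), timeout fires, sleepq_switch(), and shows how a TDF_TIMEOUT check before blocking changes the outcome.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in value; the real flag lives in the kernel headers. */
#define TDF_TIMEOUT 0x1

/* Hypothetical model of the per-thread state that matters for the race. */
struct thread_model {
    int  flags;     /* carries TDF_TIMEOUT in this model */
    bool on_queue;  /* true while the thread is linked on the sleep queue */
    bool blocked;   /* true once the thread has actually gone to sleep */
};

/* Models sleepq_add(): the thread is queued but has not blocked yet. */
static void model_sleepq_add(struct thread_model *td)
{
    td->flags = 0;
    td->on_queue = true;
    td->blocked = false;
}

/*
 * Models the timeout handler firing in the window between sleepq_add()
 * and sleepq_*wait*(): the thread is marked as timed out but, as the log
 * describes, it is not dequeued.
 */
static void model_timeout_fires(struct thread_model *td)
{
    td->flags |= TDF_TIMEOUT;
}

/*
 * Models sleepq_switch(). With with_fix set, the sleep is aborted and the
 * thread dequeued when TDF_TIMEOUT is already set; otherwise it blocks.
 */
static void model_sleepq_switch(struct thread_model *td, bool with_fix)
{
    if (with_fix && (td->flags & TDF_TIMEOUT) != 0) {
        td->on_queue = false;   /* dequeue the thread */
        return;                 /* abort the sleep instead of blocking */
    }
    td->blocked = true;         /* nothing else will wake a pause(9)-style sleep */
}

/* Replays the racy ordering; returns true if the thread ends up blocked. */
static bool model_race(bool with_fix)
{
    struct thread_model td;

    model_sleepq_add(&td);
    model_timeout_fires(&td);
    model_sleepq_switch(&td, with_fix);

    /* In this model a thread is blocked exactly when it is still queued. */
    assert(td.blocked == td.on_queue);
    return td.blocked;
}
```

Calling model_race(false) leaves the modelled thread blocked with its timeout already consumed, which corresponds to the hang on a pause(9)-style sleep; model_race(true) returns without blocking.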