From owner-freebsd-stable@FreeBSD.ORG  Mon Mar 17 20:11:29 2008
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E713A106564A;
	Mon, 17 Mar 2008 20:11:29 +0000 (UTC)
	(envelope-from bright@elvis.mu.org)
Received: from elvis.mu.org (elvis.mu.org [192.203.228.196])
	by mx1.freebsd.org (Postfix) with ESMTP id D68498FC1A;
	Mon, 17 Mar 2008 20:11:29 +0000 (UTC)
	(envelope-from bright@elvis.mu.org)
Received: by elvis.mu.org (Postfix, from userid 1192)
	id AE7C51A4D84; Mon, 17 Mar 2008 13:10:14 -0700 (PDT)
Date: Mon, 17 Mar 2008 13:10:14 -0700
From: Alfred Perlstein <alfred@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
Message-ID: <20080317201014.GA67856@elvis.mu.org>
References: <20080315024114.GD67856@elvis.mu.org>
	<200803171127.20561.jhb@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200803171127.20561.jhb@freebsd.org>
User-Agent: Mutt/1.4.2.3i
Cc: stable@freebsd.org, freebsd-smp@freebsd.org
Subject: Re: timeout/untimeout race conditions/crash [patch]

* John Baldwin <jhb@freebsd.org> [080317 09:43] wrote:
> 
> This is not a bug.  Don't use untimeout(9) as it is not guaranteed to be 
> reliable.  Instead, use callout_*().  Your patch doesn't solve any races as 
> the driver detach routine needs to use callout_drain() and not just 
> callout_stop/untimeout anyways.  Fix your broken drivers.

I understand that some old Giant-locked code can issue timeout/untimeout
without Giant held, which would certainly cause this issue and is
uncorrectable.  However, this is happening with code that is completely
Giant locked.

We are not trying to use timeout(9) for mpsafe code.  This is old
code that relies upon Giant.

Giant locked code should be timeout/untimeout safe.  As explained
in my email, there exists a condition where the Giant locked code
can have a timer fire even though proper Giant locking is observed.

For a Giant locked subsystem, one should be able to have the following
code work:

mtx_lock(&Giant);	/* formerly spl higher than softclock */
untimeout(&func, arg, sc->handle);	/* handle is passed by value */
free(sc);
mtx_unlock(&Giant);	/* formerly splx() */

Under the old spl model, raising to splsoftclock would completely block
the timeout from firing, so this sort of code was safe.  It is no longer
safe due to a BUG in the way that Giant is used.
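
To make the kind of interleaving concrete, here is a rough userland toy
model in C.  It is not kernel code: every model_* name is made up for
the sketch, and it assumes (without quoting the kernel source) that
softclock takes an expired callout off the wheel before it acquires
Giant.  Under that assumption, teardown code that holds Giant the whole
time can still have its handler run against freed state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; these are not kernel structures. */
struct model_callout {
	bool pending;	/* still on the wheel, so untimeout can cancel it */
	bool dequeued;	/* softclock removed it and intends to run it */
};

struct model_softc {
	bool freed;	/* stands in for free(sc) */
	struct model_callout co;
};

/*
 * softclock, step 1: take the expired callout off the wheel.
 * Assumption: this happens before softclock holds Giant.
 */
static void
model_softclock_dequeue(struct model_softc *sc)
{
	if (sc->co.pending) {
		sc->co.pending = false;
		sc->co.dequeued = true;
	}
}

/* Model of untimeout(): it can only cancel a still-pending callout. */
static bool
model_untimeout(struct model_softc *sc)
{
	if (!sc->co.pending)
		return (false);
	sc->co.pending = false;
	return (true);
}

/* Teardown done entirely with Giant held: untimeout, then free. */
static void
model_teardown(struct model_softc *sc)
{
	(void)model_untimeout(sc);
	sc->freed = true;
}

/*
 * softclock, step 2: Giant is finally acquired and the handler runs.
 * Returns true if the handler ran against freed state.
 */
static bool
model_softclock_fire(struct model_softc *sc)
{
	if (!sc->co.dequeued)
		return (false);
	sc->co.dequeued = false;
	return (sc->freed);
}

/* Run one interleaving; true means the handler fired on freed state. */
static bool
model_run(bool dequeue_before_teardown)
{
	struct model_softc sc = {
		.freed = false,
		.co = { .pending = true, .dequeued = false }
	};

	if (dequeue_before_teardown)
		model_softclock_dequeue(&sc);
	model_teardown(&sc);
	if (!dequeue_before_teardown)
		model_softclock_dequeue(&sc);
	return (model_softclock_fire(&sc));
}
```

In this model, model_run(true) reports the handler firing on freed
state, while model_run(false) does not, because untimeout still finds
the callout pending and cancels it.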

Please reread the original mail to better understand the synopsis
of the problem.

thank you,
-Alfred