From owner-freebsd-arch@FreeBSD.ORG  Fri Dec 28 10:30:15 2007
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8DBCD16A41A;
	Fri, 28 Dec 2007 10:30:15 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 4783813C4E3;
	Fri, 28 Dec 2007 10:30:15 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [209.31.154.41])
	by cyrus.watson.org (Postfix) with ESMTP id E705746EB6;
	Fri, 28 Dec 2007 05:30:14 -0500 (EST)
Date: Fri, 28 Dec 2007 10:30:14 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Hans Petter Selasky <hselasky@c2i.net>
In-Reply-To: <200712281003.52062.hselasky@c2i.net>
Message-ID: <20071228102544.J45653@fledge.watson.org>
References: <18378.1196596684@critter.freebsd.dk>
	<4752AABE.6090006@freebsd.org> <200712271805.40972.jhb@freebsd.org>
	<200712281003.52062.hselasky@c2i.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Andre Oppermann <andre@freebsd.org>, Attilio Rao <attilio@freebsd.org>,
	arch@freebsd.org, Poul-Henning Kamp <phk@phk.freebsd.dk>,
	freebsd-arch@freebsd.org
Subject: Re: New "timeout" api, to replace callout
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture <freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Dec 2007 10:30:15 -0000


On Fri, 28 Dec 2007, Hans Petter Selasky wrote:

>> The reason you need to do a drain is to allow for safe destroying of the
>> lock. Specifically, drivers tend to do this:
>>
>> 	FOO_LOCK(sc);
>> 	...
>> 	callout_stop(...);
>> 	FOO_UNLOCK(sc);
>> 	...
>> 	callout_drain(...);
>> 	...
>> 	mtx_destroy(&sc->foo_mtx);
>>
>> If you don't have the drain and softclock is trying to acquire the backing 
>> mutex while you have it held (before the callout_stop) then Bad Things can 
>> happen if you don't do the drain.  Having the lock just "give up" doesn't 
>> work either because if the memory containing the lock is free'd and 
>> reinitialized such that it looks enough like a valid lock then softclock 
>> (or its equivalent) will still try to obtain it.  Also, you need to do a 
>> drain so it is safe to free the callout structure to prevent it from being 
>> recycled and having weird races where it gets recycled and rescheduled but 
>> the timer code thinks it has a pending stop for that pointer and so it 
>> aborts the wrong instance of the timer, etc.
>
> I completely agree to what John Baldwin is writing. You need two 
> stop-functions:
>
> xxx_stop which is non-blocking and xxx_drain which can block i.e. sleep
>
> BTW: The USB code in P4 uses the same semantics, due to the same reasons:
>
> usbd_transfer_stop() and usbd_transfer_drain()
>
> The only difference is that I pass an error code to the callback which might 
> happen after that usbd_transfer_stop is called.
>
> I think that xxx_stop() and xxx_drain() is a generic approach that should be 
> applied to all callback systems. Whenever you have a callback you need to be 
> able to stop it and drain it.

I think the argument that Poul-Henning is making is not that you don't need 
something that behaves "like drain", but rather, we're like the wait for drain 
to be a short, mutex-length wait rather than a long, msleep-length wait. 
Remember that the bodies of callouts are expected to run in a very short 
period of time in order to not stall the timer system, in fact, in such a way 
that a mutex could be held over the entirely timeout call.  Given that this is 
the case, one might reasonably expect callout_stop() to perform the drain 
rather than having a separate call.  Such a model would be very advantageous 
in TCP, where rather than having to defer GC'ing the inpcb/tcpcb to a GC 
worker thread, which we don't do but should, we could use the stop call safely 
and eliminate a whole class of races from the stack.

Robert N M Watson
Computer Laboratory
University of Cambridge