From: Hans Petter Selasky
To: freebsd-arch@freebsd.org
Cc: Andre Oppermann, Attilio Rao, arch@freebsd.org, Poul-Henning Kamp, Robert Watson
Date: Fri, 28 Dec 2007 10:03:50 +0100
Subject: Re: New "timeout" api, to replace callout
Message-Id: <200712281003.52062.hselasky@c2i.net>
In-Reply-To: <200712271805.40972.jhb@freebsd.org>

On Friday 28 December 2007, John Baldwin wrote:
> On Sunday 02 December 2007 07:53:18 am Andre Oppermann wrote:
> > Poul-Henning Kamp wrote:
> > > In message <4752998A.9030007@freebsd.org>, Andre Oppermann writes:
> > >> o TCP puts the timer into an allocated structure and upon close of
> > >>   the session it has to be deallocated including stopping of all
> > >>   currently running timers.
> > >>   [...]
> > >> -> The timer facility should provide an atomic stop/remove call
> > >>    that prevent any further callbacks upon return. It should not
> > >>    do a 'drain' where the callback may be run anyway.
> > >>    Note: We hold the lock the callback would have to obtain.
> > >
> > > It is my intent, that the implementation behind the new API will
> > > only ever grab the specified lock when it calls the timeout function.
> >
> > This is the same for the current one and pretty much a given.
> >
> > > When you do a timeout_disable() or timeout_cleanup() you will be
> > > sleeping on a mutex internal to the implementation, if the timeout
> > > is currently executing.
> >
> > This is the problematic part. We can't sleep in TCP when cleaning up
> > the timer. We're not always called from userland but from interrupt
> > context. And when calling the cleanup we currently hold the lock the
> > callout wants to obtain. We can't drop it either as the race would
> > be back again. What you describe here is the equivalent of
> > callout_drain(). This is unfortunately unworkable in TCP's context.
> > The callout has to go away even if it is already pending and waiting
> > on the lock. Maybe that can only be solved by a flag in the lock
> > saying "give up and go away".
>
> The reason you need to do a drain is to allow for safe destroying of the
> lock. Specifically, drivers tend to do this:
>
>     FOO_LOCK(sc);
>     ...
>     callout_stop(...);
>     FOO_UNLOCK(sc);
>     ...
>     callout_drain(...);
>     ...
>     mtx_destroy(&sc->foo_mtx);
>
> If you don't have the drain and softclock is trying to acquire the backing
> mutex while you have it held (before the callout_stop) then Bad Things can
> happen if you don't do the drain. Having the lock just "give up" doesn't
> work either because if the memory containing the lock is free'd and
> reinitialized such that it looks enough like a valid lock then softclock
> (or its equivalent) will still try to obtain it.
> Also, you need to do a drain so it is safe to free the callout structure
> to prevent it from being recycled and having weird races where it gets
> recycled and rescheduled but the timer code thinks it has a pending stop
> for that pointer and so it aborts the wrong instance of the timer, etc.

Hi,

I completely agree with what John Baldwin wrote. You need two stop
functions: xxx_stop(), which is non-blocking, and xxx_drain(), which can
block, i.e. sleep.

BTW: The USB code in P4 uses the same semantics, for the same reasons:

usbd_transfer_stop() and usbd_transfer_drain()

The only difference is that I pass an error code to the callback, which
might still run after usbd_transfer_stop() has been called.

I think that xxx_stop() and xxx_drain() is a generic approach that should
be applied to all callback systems. Whenever you have a callback, you need
to be able to stop it and drain it.

--HPS
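
Addendum: to make the stop/drain split concrete, here is a minimal
userland sketch using POSIX threads. It is only an illustration under
stated assumptions: the names xtimer_start(), xtimer_stop(),
xtimer_drain() and run_demo() are hypothetical and are neither the
FreeBSD callout API nor the USB code, and a plain thread stands in for
softclock. The stop call only clears a flag, so it can be used while
holding the lock the callback wants; the drain call blocks until the
callback thread is gone, after which the lock can be destroyed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for a timer; not the FreeBSD callout API. */
struct xtimer {
	pthread_mutex_t  state_mtx;   /* internal to the timer code */
	pthread_mutex_t *client_mtx;  /* lock the callback runs under */
	void           (*fn)(void *);
	void            *arg;
	bool             pending;     /* scheduled and not yet cancelled */
	pthread_t        thr;         /* plays the role of softclock */
};

/* Take the client's lock first, then check whether we were stopped. */
static void *xtimer_softclock(void *p)
{
	struct xtimer *t = p;
	bool go;

	pthread_mutex_lock(t->client_mtx);   /* may wait for the client */
	pthread_mutex_lock(&t->state_mtx);
	go = t->pending;
	t->pending = false;
	pthread_mutex_unlock(&t->state_mtx);
	if (go)
		t->fn(t->arg);
	pthread_mutex_unlock(t->client_mtx);
	return NULL;
}

static int xtimer_start(struct xtimer *t, pthread_mutex_t *m,
    void (*fn)(void *), void *arg)
{
	pthread_mutex_init(&t->state_mtx, NULL);
	t->client_mtx = m;
	t->fn = fn;
	t->arg = arg;
	t->pending = true;
	return pthread_create(&t->thr, NULL, xtimer_softclock, t);
}

/* Non-blocking: safe to call with client_mtx held. Returns true if the
 * callback was still pending and has now been cancelled. */
static bool xtimer_stop(struct xtimer *t)
{
	bool was_pending;

	pthread_mutex_lock(&t->state_mtx);
	was_pending = t->pending;
	t->pending = false;
	pthread_mutex_unlock(&t->state_mtx);
	return was_pending;
}

/* Blocking: must be called WITHOUT client_mtx held, once per start.
 * On return the callback thread no longer touches the timer or the lock. */
static void xtimer_drain(struct xtimer *t)
{
	int rc;

	xtimer_stop(t);
	rc = pthread_join(t->thr, NULL);
	assert(rc == 0);
	(void)rc;
	pthread_mutex_destroy(&t->state_mtx);
}

static void bump(void *arg) { ++*(int *)arg; }

/* Same order as the quoted driver pattern: lock, stop, unlock, drain,
 * destroy. Returns 0 if the callback was cancelled and never ran. */
int run_demo(void)
{
	pthread_mutex_t foo_mtx;
	struct xtimer t;
	int fired = 0;
	bool cancelled;

	pthread_mutex_init(&foo_mtx, NULL);
	pthread_mutex_lock(&foo_mtx);            /* FOO_LOCK(sc) */
	if (xtimer_start(&t, &foo_mtx, bump, &fired) != 0) {
		pthread_mutex_unlock(&foo_mtx);
		pthread_mutex_destroy(&foo_mtx);
		return -1;
	}
	cancelled = xtimer_stop(&t);             /* does not sleep */
	pthread_mutex_unlock(&foo_mtx);          /* FOO_UNLOCK(sc) */
	xtimer_drain(&t);                        /* may sleep */
	pthread_mutex_destroy(&foo_mtx);         /* now safe */
	return (cancelled && fired == 0) ? 0 : 1;
}
```

Calling run_demo() from a main() exercises the sequence. Because the
demo holds foo_mtx from before the timer is started until after
xtimer_stop(), the callback thread cannot get past the lock before the
pending flag is cleared, so the callback is never invoked. Without the
xtimer_drain() step, that thread could still be blocked on foo_mtx when
the lock is destroyed, which is the failure John describes.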