From owner-freebsd-arch@FreeBSD.ORG Thu Dec 27 23:21:54 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B585D16A420; Thu, 27 Dec 2007 23:21:54 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from speedfactory.net (mail6.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id 2C29D13C45B; Thu, 27 Dec 2007 23:21:54 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8q) with ESMTP id 226289019-1834499 for multiple; Thu, 27 Dec 2007 18:24:05 -0500 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.13.8/8.13.8) with ESMTP id lBRNLf8H054109; Thu, 27 Dec 2007 18:21:47 -0500 (EST) (envelope-from jhb@FreeBSD.org) From: John Baldwin To: freebsd-arch@FreeBSD.org Date: Thu, 27 Dec 2007 18:05:40 -0500 User-Agent: KMail/1.9.6 References: <18378.1196596684@critter.freebsd.dk> <4752AABE.6090006@freebsd.org> In-Reply-To: <4752AABE.6090006@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200712271805.40972.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Thu, 27 Dec 2007 18:21:48 -0500 (EST) X-Virus-Scanned: ClamAV 0.91.2/5270/Thu Dec 27 12:48:18 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: Attilio Rao , arch@FreeBSD.org, Poul-Henning Kamp , Robert Watson , Andre Oppermann Subject: Re: New "timeout" api, to replace callout X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Dec 2007 23:21:54 -0000 On Sunday 02 December 2007 07:53:18 am Andre Oppermann wrote: > Poul-Henning Kamp wrote: > > In message <4752998A.9030007@freebsd.org>, Andre Oppermann writes: > >> o TCP puts the timer into an allocated structure and upon close of the > >> session it has to be deallocated including stopping of all currently > >> running timers. > >> [...] > >> -> The timer facility should provide an atomic stop/remove call > >> that prevent any further callbacks upon return. It should not > >> do a 'drain' where the callback may be run anyway. > >> Note: We hold the lock the callback would have to obtain. > > > > It is my intent, that the implementation behind the new API will > > only ever grab the specified lock when it calls the timeout function. > > This is the same for the current one and pretty much a given. > > > When you do a timeout_disable() or timeout_cleanup() you will be > > sleeping on a mutex internal to the implementation, if the timeout > > is currently executing. > > This is the problematic part. We can't sleep in TCP when cleaning up > the timer. We're not always called from userland but from interrupt > context. And when calling the cleanup we currently hold the lock the > callout wants to obtain. We can't drop it either as the race would > be back again. What you describe here is the equivalent of callout_ > drain(). This is unfortunately unworkable in TCP's context. The > callout has to go away even if it is already pending and waiting on > the lock. Maybe that can only be solved by a flag in the lock saying > "give up and go away". The reason you need to do a drain is to allow for safe destroying of the lock. Specifically, drivers tend to do this: FOO_LOCK(sc); ... callout_stop(...); FOO_UNLOCK(sc); ... callout_drain(...); ... mtx_destroy(&sc->foo_mtx); If you don't have the drain and softclock is trying to acquire the backing mutex while you have it held (before the callout_stop) then Bad Things can happen if you don't do the drain. Having the lock just "give up" doesn't work either because if the memory containing the lock is free'd and reinitialized such that it looks enough like a valid lock then softclock (or its equivalent) will still try to obtain it. Also, you need to do a drain so it is safe to free the callout structure to prevent it from being recycled and having weird races where it gets recycled and rescheduled but the timer code thinks it has a pending stop for that pointer and so it aborts the wrong instance of the timer, etc. -- John Baldwin