Date: Sun, 02 Dec 2007 12:39:54 +0100
From: Andre Oppermann <andre@freebsd.org>
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: Attilio Rao <attilio@FreeBSD.org>, arch@FreeBSD.org, Robert Watson <rwatson@FreeBSD.org>
Subject: Re: New "timeout" api, to replace callout
Message-ID: <4752998A.9030007@freebsd.org>
In-Reply-To: <18129.1196593277@critter.freebsd.dk>
References: <18129.1196593277@critter.freebsd.dk>

Poul-Henning Kamp wrote:
> In message <20071202103833.N74097@fledge.watson.org>, Robert Watson writes:
>> On Sun, 2 Dec 2007, Poul-Henning Kamp wrote:
>
>>> I have no idea what the answer to your question is, I'm focusing on
>>> providing the ability, how we subsequently decide to use it is up to others.
>> Well, I think there is an important question to be discussed regarding
>> combinatorics, context switching, and the ability to provide multiple callout
>> threads.
>
> I still have no way to answer those questions.
>
> My aim here is to provide and implement a client API that will let
> us play with all those things.
>
> There are 444 .c or .h files in my src/sys which contain the word
> "callout".
>
> Obviously, getting the API right, so that we will not have to walk
> all these files once again is a very important point, and the only
> one I am trying to focus on right now.

For TCP the following features/properties would make the implementation
much easier:

 o TCP maintains a number of concurrent, but hierarchical timers for
   each session.  What predominantly happens is a reschedule of an
   existing timer, that is, a timer that wasn't close to firing is
   moved out again.  This happens for every incoming segment.

   -> The timer facility should make it simple and efficient to move
      the deadline into the future.

 o TCP puts the timer into an allocated structure, and upon close of
   the session the structure has to be deallocated, which includes
   stopping all currently running timers.  At the moment this is not
   really possible, as callout_stop() is not atomic and the callout
   may already be blocked on a lock, waiting to run.  At the moment
   we just live with this race condition, apply some bandages and
   pray.  Since this only happens on close and deallocation, the
   operation may be more expensive than a normal timer stop call.
   Race conditions on normal timeout stops, like stopping the delack
   timer, are acceptable and can easily be handled within TCP.  If
   the callback shows up after the timer was stopped we see it and
   just return.

   -> The timer facility should provide an atomic stop/remove call
      that prevents any further callbacks upon return.  It should not
      do a 'drain' where the callback may be run anyway.
      Note: We hold the lock the callback would have to obtain.

 o TCP has hot and cold CPU/cache affinity.  For certain timers we
   want to stay on the same CPU, as it is very likely to still have
   the tcp control block in cache.  The delayed ACK timer is the
   prime example, running on a deadline of some 100ms.  On the other
   hand, timeouts farther away, like the keepalive timer, do not
   matter, as there is almost zero chance that any CPU still has the
   control block in cache.
   Note: When we get NIC->CPU affinity we may want to keep all
   timeouts of a particular session always on the same CPU.

   -> The timer facility should provide strong, weak and "don't care"
      CPU affinity.  The affinity should be selected for a timer as a
      whole, not upon each call.

 o TCP's data structure is exported to userspace and contains the
   timeout data structures.  This complicates timeout handling, as
   the timeout data structure is not known to userland and we have to
   do some hacks to prevent exposure.

   -> The timer facility should provide an opaque userland compat
      header definition.

--
Andre
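
The first two points (cheap rescheduling, and a callback that notices it is stale and just returns) can be illustrated with a minimal userland sketch. Everything here is an assumption made for illustration: the sketch_timer names are hypothetical, this is neither callout(9) nor the proposed "timeout" API, and it is single-threaded, so it leaves out the session lock that the real callback would have to hold while inspecting the timer state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-session timer state; not a FreeBSD structure. */
struct sketch_timer {
	uint64_t deadline;	/* absolute tick at which the timer is due */
	bool	 armed;		/* false once the timer has been stopped */
	int	 fired;		/* number of callbacks that did real work */
};

/*
 * Arm the timer or move its deadline out.  Rescheduling is a plain store,
 * which is the "simple and efficient" property asked for above.
 */
static void
sketch_timer_reschedule(struct sketch_timer *t, uint64_t now, uint64_t delta)
{
	assert(t != NULL);
	t->deadline = now + delta;
	t->armed = true;
}

/* Normal (non-atomic) stop: a callback may still show up afterwards. */
static void
sketch_timer_stop(struct sketch_timer *t)
{
	assert(t != NULL);
	t->armed = false;
}

/*
 * Callback as a timer facility might invoke it.  If the timer was stopped,
 * or its deadline was moved out in the meantime, the call is stale and we
 * just return.  Returns true only when the timer really expired.
 */
static bool
sketch_timer_callback(struct sketch_timer *t, uint64_t now)
{
	assert(t != NULL);
	if (!t->armed || now < t->deadline)
		return (false);
	t->armed = false;
	t->fired++;
	return (true);
}
```

This check only covers the normal stop case. It does not help on session close: once the structure holding the timer has been freed, a late callback has nothing safe to inspect, which is why the atomic stop/remove call is requested.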