Date: Fri, 11 Jul 2003 11:10:28 +0200 (CEST) From: Harti Brandt <brandt@fokus.fraunhofer.de> To: John Baldwin <jhb@FreeBSD.org> Cc: hackers@FreeBSD.org Subject: RE: Race in kevent Message-ID: <20030711110052.D38638@beagle.fokus.fraunhofer.de> In-Reply-To: <XFMail.20030710152212.jhb@FreeBSD.org> References: <XFMail.20030710152212.jhb@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 10 Jul 2003, John Baldwin wrote: JB> JB>On 10-Jul-2003 Harti Brandt wrote: JB>> On Wed, 9 Jul 2003, John Baldwin wrote: JB>> JB>> JB>On 09-Jul-2003 Harti Brandt wrote: JB>> JB>> JB>> JB>> Hi, JB>> JB>> JB>> JB>> I just had a crash while typing ^C to a program that has a kevent timer JB>> JB>> running. The crash was: JB>> JB>> JB>> JB>> callout_stop JB>> JB>> callout_reset JB>> JB>> filt_timerexpire JB>> JB>> softclock JB>> JB>> JB>> JB>> and callout_stop was accessing freed memory (0xdeadc0e2). After looking JB>> JB>> some time at the filt_timerdetach, callout_stop and softclock I think the JB>> JB>> following happened: JB>> JB>> JB>This is becoming a common race unfortunately. :( See the hacks in JB>> JB>msleep() that use TDF_TIMEOUT in coooperationg with endtsleep() and JB>> JB>the recent commit to the realtimer callout code for ways to work around JB>> JB>this race. JB>> JB>> In both places the thread just sleeps until the timeout has fired (when I JB>> understand this correctly). While this is a possible workaround also for JB>> kevent() (which only holds Giant as far as I can see) this is by no means JB>> a solution for other callers. While looking through the tree I have found JB>> several issues with timeouts which probably should be resolved or they JB>> will hit us with SMP: JB> JB>Yes, they sleep until the callout has finished executing. Note that the JB>callout has _already_ fired. The common case is that it is blocked on JB>the lock that the code trying to stop the callout is holding. Thus, you JB>are going to have to have special case code in your callout handler JB>_anyway_ to handle these edge cases, so there really isn't a super-duper JB>easy-clean solution. This is with MPSAFE callouts. !MPSAFE callouts will block before calling the handler when the try to acquire Giant. I'm assuming here, that the thread that calls callout_stop() must already have Giant (this is a valid assumption, isn't it?). In this case we can clear the cached copy (c_func) of the callout in callout_stop() and softclock will notice this when it has aquired Giant. We just need to make this cached copy global. For the MPSAFE case there is nothing we can do I suppose. JB>> - the CALLOUT_ACTIVE flag is not maintained correctly. softclock() fails JB>> to clear this flag after the timeout has fired. callout_stop() clears JB>> CALLOUT_ACTIVE if it finds the callout not PENDING. This is wrong if JB>> the callout is just about to be called (in this case it is !PENDING JB>> but ACTIVE). This makes callout_active() useless. JB> JB>The problem is in the API. One of the design goals is that a callout can JB>re-fire itself. Thus, softclock can't touch the callout once it has fired JB>it. This design goal is the reason for much of the confusion. JB> JB>> - using callout_active() on a callout_handle. Callouts for JB>> callout_handles (timeout(9)) are allocated from a common pool. So you may JB>> just check the wrong callout if the callout has already fired and has been JB>> reallocated to another user. Handles allocated with timeout(9) can only JB>> be passed to untimeout(9) JB> JB>The idea is that timeout(9) and untimeout(9) are a deprecated interface and JB>code should be using the callout(9) API instead. Note that timeout(9)'s can JB>never be marked MPSAFE. JB> JB>> I think we should try to make the callout interface usable without races JB>> for the !MPSAFE case (see mail from Eric Jacobs). For the MPSAFE case the JB>> caller should be responsible for this. And we should probably better JB>> document the interface. JB>> JB>> Going to think about this... JB> JB>Well, you need to consider the design goal above as it throws several JB>wrenches into the works. One possibility is that we could ditch the JB>design goal. Another possibility is that we could expand the callout JB>API to allow for periodict callouts and not just one-shot callouts. That would probably be a good idea. Scheduling the new callout before calling the callout routine would be a good idea. harti -- harti brandt, http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private brandt@fokus.fraunhofer.de, harti@freebsd.org
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20030711110052.D38638>