From owner-freebsd-hackers@FreeBSD.ORG Thu Jul 10 12:22:02 2003
X-Mailer: XFMail 1.5.4 on FreeBSD
In-Reply-To: <20030710103146.R30571@beagle.fokus.fraunhofer.de>
Date: Thu, 10 Jul 2003 15:22:12 -0400 (EDT)
From: John Baldwin
To: harti@FreeBSD.org
cc: hackers@FreeBSD.org
Subject: RE: Race in kevent
List-Id: Technical Discussions relating to FreeBSD

On 10-Jul-2003 Harti Brandt wrote:
> On Wed, 9 Jul 2003, John Baldwin wrote:
>
> JB>On 09-Jul-2003 Harti Brandt wrote:
> JB>>
> JB>> Hi,
> JB>>
> JB>> I just had a crash while typing ^C to a program that has a kevent timer
> JB>> running. The crash was:
> JB>>
> JB>>   callout_stop
> JB>>   callout_reset
> JB>>   filt_timerexpire
> JB>>   softclock
> JB>>
> JB>> and callout_stop was accessing freed memory (0xdeadc0e2).
> JB>> After looking some time at the filt_timerdetach, callout_stop and
> JB>> softclock I think the following happened:
>
> JB>This is becoming a common race unfortunately. :( See the hacks in
> JB>msleep() that use TDF_TIMEOUT in cooperation with endtsleep() and
> JB>the recent commit to the realtimer callout code for ways to work around
> JB>this race.
>
> In both places the thread just sleeps until the timeout has fired (when I
> understand this correctly). While this is a possible workaround also for
> kevent() (which only holds Giant as far as I can see) this is by no means
> a solution for other callers. While looking through the tree I have found
> several issues with timeouts which probably should be resolved or they
> will hit us with SMP:

Yes, they sleep until the callout has finished executing. Note that the
callout has _already_ fired. The common case is that it is blocked on the
lock that the code trying to stop the callout is holding. Thus, you are
going to have to have special case code in your callout handler _anyway_
to handle these edge cases, so there really isn't a super-duper easy-clean
solution.

> - the CALLOUT_ACTIVE flag is not maintained correctly. softclock() fails
>   to clear this flag after the timeout has fired. callout_stop() clears
>   CALLOUT_ACTIVE if it finds the callout not PENDING. This is wrong if
>   the callout is just about to be called (in this case it is !PENDING
>   but ACTIVE). This makes callout_active() useless.

The problem is in the API. One of the design goals is that a callout can
re-fire itself. Thus, softclock can't touch the callout once it has fired
it. This design goal is the reason for much of the confusion.

> - using callout_active() on a callout_handle. Callouts for
>   callout_handles (timeout(9)) are allocated from a common pool. So you may
>   just check the wrong callout if the callout has already fired and has been
>   reallocated to another user.
>   Handles allocated with timeout(9) can only
>   be passed to untimeout(9)

The idea is that timeout(9) and untimeout(9) are a deprecated interface and
code should be using the callout(9) API instead. Note that timeout(9)'s
can never be marked MPSAFE.

> I think we should try to make the callout interface usable without races
> for the !MPSAFE case (see mail from Eric Jacobs). For the MPSAFE case the
> caller should be responsible for this. And we should probably better
> document the interface.
>
> Going to think about this...

Well, you need to consider the design goal above as it throws several
wrenches into the works. One possibility is that we could ditch the design
goal. Another possibility is that we could expand the callout API to
allow for periodic callouts and not just one-shot callouts.

--

John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
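
To make the "special case code in your callout handler" point concrete,
here is a rough userland model in C of the stop-versus-fire race discussed
in this thread. It is only a sketch, not kernel code: the struct names, the
CO_PENDING/CO_ACTIVE flags, the `cancelled` field and every function below
are hypothetical stand-ins, and model_callout_stop() follows the behaviour
described in the quoted text (clearing the active flag even when the callout
is no longer pending) rather than any particular kernel source. Locking is
not modelled; the comments say where the shared lock would be held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags, loosely modelled on CALLOUT_PENDING / CALLOUT_ACTIVE. */
enum { CO_PENDING = 0x1, CO_ACTIVE = 0x2 };

struct model_callout {
    int flags;
};

/* Hypothetical consumer state. In a kernel this would be protected by the
 * lock that both the stopping code and the handler acquire. */
struct timer_state {
    struct model_callout co;
    bool cancelled;   /* set by the stopper when it could not dequeue */
    int expirations;  /* work performed by the handler */
};

void timer_init(struct timer_state *t)
{
    assert(t != NULL);
    t->co.flags = 0;
    t->cancelled = false;
    t->expirations = 0;
}

/* Arm the callout: it is queued (PENDING) and marked ACTIVE. */
void model_callout_reset(struct model_callout *c)
{
    c->flags |= CO_PENDING | CO_ACTIVE;
}

/* Model of softclock() taking the callout off its queue. Returns true if
 * the handler is about to be called. ACTIVE is deliberately left alone:
 * the handler may re-arm the callout, so softclock does not touch it
 * after firing. */
bool model_softclock_dequeue(struct model_callout *c)
{
    if (!(c->flags & CO_PENDING))
        return false;
    c->flags &= ~CO_PENDING;
    return true;
}

/* Model of the stop operation as described in the quoted text: it clears
 * ACTIVE whether or not the callout was still pending. Returns true only
 * if the callout was still queued and has been removed. */
bool model_callout_stop(struct model_callout *c)
{
    bool was_pending = (c->flags & CO_PENDING) != 0;
    c->flags &= ~(CO_PENDING | CO_ACTIVE);
    return was_pending;
}

/* Stopper, called with the shared lock held. Returns true if the callout
 * was dequeued, so the state may be freed at once. Returns false if the
 * handler may already be in flight (blocked on our lock); the caller must
 * then wait for the handler to finish before freeing anything. */
bool timer_detach(struct timer_state *t)
{
    if (model_callout_stop(&t->co))
        return true;
    t->cancelled = true;
    return false;
}

/* Handler, run once it has acquired the shared lock. Returns true if it
 * did its work and re-armed itself, false if it lost the race with
 * timer_detach() and backed out without touching the callout. */
bool timer_expire(struct timer_state *t)
{
    if (t->cancelled)
        return false;
    t->expirations++;
    model_callout_reset(&t->co);
    return true;
}
```

In the racy ordering (model_callout_reset(), then model_softclock_dequeue(),
then timer_detach(), then timer_expire()), the detach call reports that it
could not dequeue the callout, and the handler sees `cancelled` and returns
without re-arming. Note that after the dequeue the model's active flag is
still set, and the stop call clears it even though the handler is about to
run, which is why the sketch relies on a flag in the consumer's own state
instead of the callout's flags.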