From: Harti Brandt
Reply-To: harti@freebsd.org
To: hackers@freebsd.org
Date: Wed, 9 Jul 2003 15:28:38 +0200 (CEST)
Message-ID: <20030709150708.O30571@beagle.fokus.fraunhofer.de>
Subject: Race in kevent

Hi,

I just had a crash while typing ^C to a program that has a kevent timer
running. The crash was:

  callout_stop
  callout_reset
  filt_timerexpire
  softclock

and callout_stop was accessing freed memory (0xdeadc0e2). After looking
for some time at filt_timerdetach, callout_stop and softclock, I think the
following happened:

Proc 1                             Proc 2
------                             ------
filt_timerdetach                   softclock called
  called with Giant locked           lock_spin(callout_lock)
  ...
  calls callout_stop, which
  hangs on
  lock_spin(callout_lock)
                                     softclock finds the callout,
                                     removes it from its queue
                                     and clears PENDING
                                     unlock_spin(callout_lock)
                                     lock(&Giant) blocks
  callout_stop finds the
  callout to be not pending
  and returns
  filt_timerdetach frees
  the callout
  ...
  unlock(&Giant)
                                     softclock continues and calls
                                     the (stopped) callout
                                     KABOOM because the pointer used
                                     by filt_timerexpire is gone

The problem seems to be that there is a small window where the callout has
already been taken off the callout queue but not yet called, and where all
locks are unlocked. callout_stop may just slip into this window and
invalidate the callout that softclock() is about to call as soon as it gets
Giant (even with a non-MPSAFE callout the same problem exists, although the
window is much smaller).

What to do?

callout_stop already detects this situation and returns 0. As far as I
understand, the way to handle this is not to free the callout memory in
filt_timerdetach() when callout_stop() returns 0, but to let the callout be
called. filt_timerexpire() should detect this situation and simply free the
memory and return.

Is this a possible solution? (Actually this requires some work, because
the knote pointer that filt_timerexpire() gets is probably also gone.)

harti
-- 
harti brandt,
http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
brandt@fokus.fraunhofer.de, harti@freebsd.org
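
To make the proposed handling concrete, here is a minimal single-threaded userspace sketch in C. It is not kernel code: every name starting with sim_ is hypothetical, the "detached" flag is an assumed mechanism for telling the handler that its owner is gone, and the interleaving from the message is replayed by hand instead of with real locks. The sketch also assumes the handler receives the timer structure itself, which sidesteps the knote pointer problem mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a timer callout; not the kernel's struct callout. */
struct sim_timer {
    bool pending;   /* still on the callout queue */
    bool detached;  /* assumed flag: owner is gone, handler must clean up */
    int  fired;     /* how often the handler did real work */
};

int sim_frees;      /* counts frees so a caller can check for double frees */

void sim_free(struct sim_timer *t)
{
    sim_frees++;
    free(t);
}

/* Models callout_stop(): 1 if the callout was still pending, 0 if too late. */
int sim_callout_stop(struct sim_timer *t)
{
    if (!t->pending)
        return 0;
    t->pending = false;
    return 1;
}

/* Models the first half of softclock(): dequeue and clear PENDING. */
void sim_softclock_dequeue(struct sim_timer *t)
{
    t->pending = false;
}

/* Models filt_timerdetach() with the proposed change applied. */
void sim_detach(struct sim_timer *t)
{
    if (sim_callout_stop(t))
        sim_free(t);          /* never going to run: safe to free now */
    else
        t->detached = true;   /* about to run: leave the free to the handler */
}

/* Models filt_timerexpire() with the proposed change applied. */
void sim_expire(struct sim_timer *t)
{
    assert(!t->pending);      /* the handler only runs after the dequeue */
    if (t->detached) {
        sim_free(t);          /* owner is gone: just free and return */
        return;
    }
    t->fired++;
}

/* Replays the interleaving from the message; returns the number of frees. */
int sim_replay_race(void)
{
    struct sim_timer *t = calloc(1, sizeof(*t));
    if (t == NULL)
        return -1;
    t->pending = true;
    sim_frees = 0;
    sim_softclock_dequeue(t); /* Proc 2: dequeue, then block on Giant */
    sim_detach(t);            /* Proc 1: callout_stop returns 0 */
    sim_expire(t);            /* Proc 2: continues and calls the handler */
    return sim_frees;
}

/* The ordinary case: detach wins and the handler never runs. */
int sim_replay_no_race(void)
{
    struct sim_timer *t = calloc(1, sizeof(*t));
    if (t == NULL)
        return -1;
    t->pending = true;
    sim_frees = 0;
    sim_detach(t);            /* callout_stop returns 1, memory freed here */
    return sim_frees;
}
```

In both replays the timer is freed exactly once, and in the racing case the handler never touches memory that detach has already released. In a real kernel the flag would need to be protected by whatever lock serializes detach and the handler (Giant in the scenario above).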