From owner-freebsd-hackers@FreeBSD.ORG  Wed Jul  9 14:21:08 2003
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9069037B401
	for <freebsd-hackers@freebsd.org>;
	Wed,  9 Jul 2003 14:21:08 -0700 (PDT)
Received: from smtp.goamerica.net (ny-mx-01.goamerica.net [208.200.67.108])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A6DC943F75
	for <freebsd-hackers@freebsd.org>;
	Wed,  9 Jul 2003 14:21:07 -0700 (PDT)	(envelope-from eaja@erols.com)
Received: from localhost (165.sub-166-141-30.myvzw.com [166.141.30.165])
	by smtp.goamerica.net (8.12.8/8.12.8) with SMTP id h69LKFNf007124;
	Wed, 9 Jul 2003 17:20:17 -0400 (EDT)
Date: Wed, 9 Jul 2003 17:19:40 -0400
From: Eric Jacobs <eaja@erols.com>
To: freebsd-hackers@freebsd.org
Message-Id: <20030709171940.1366084b.eaja@erols.com>
In-Reply-To: <20030709150708.O30571@beagle.fokus.fraunhofer.de>
References: <20030709150708.O30571@beagle.fokus.fraunhofer.de>
X-Mailer: Sylpheed version 0.8.5 (GTK+ 1.2.10; i386-portbld-freebsd4.2)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: Race in kevent
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 09 Jul 2003 21:21:09 -0000

On Wed, 9 Jul 2003 15:28:38 +0200 (CEST)
Harti Brandt <brandt@fokus.fraunhofer.de> wrote:

> Hi,
> 
> I just had a crash while typing ^C to a program that has a kevent timer
> running. The crash was:
> 
> callout_stop
> callout_reset
> filt_timerexpire
> softclock
> 
> and callout_stop was accessing freed memory (0xdeadc0e2). After looking
> some time at the filt_timerdetach, callout_stop and softclock I think the
> following happened:

I've had a very similar problem with the timeouts being stopped incorrectly
when my xl-based Cardbus NIC is removed. I wrote a post about it, but didn't
get any responses.

http://lists.freebsd.org/pipermail/freebsd-hackers/2003-June/001531.html

My suspicion is that this is not a problem with the xl or kevent code which
are clients of the timeout interface, but a deficiency in the timeout
mechanism itself.

> 
> 
> Proc 1                                  Proc 2
> ------                                  ------
> filt_timerdetach			softclock called
> call with Giant locked
> 
> 					lock_spin(callout_lock)
> 					...
> call callout_stop which hangs on
> lock_spin(callout_lock)
> 
> 					sofclock finds the callout,
> 					removes it from its queue and
> 					clears PENDING
> 
> 					unlock_spin(callout_lock)
> 					lock(&Giant) blocks
> 
> callout_stop finds the callout to
> be not pending and returns
> 
> filt_timerdetach frees the callout
> 
> ...
> 
> unlock(&Giant)
> 					softclock continues and calls
> 					the (stopped) callout
> 
> 					KABOOM because the pointer used
> 					by filt_timerexpire is gone
> 
> 
> The problem seems to be that there is a small window where the callout is
> already taken off from the callout queue, but not yet called and where all
> locks are unlocked. callout_stop may just slip into this window and
> invalidate the callout softclock() is about to call as soon as it gets
> Giant

Yup, that was my analysis of what happened with my problem as well.

> (event with an non-MPSAFE callout the same problem exists although
> the window is much smaller).

Actually, I'm not certain that improves anything at all. The timeout
routine would probably utilize a finer-grained lock which can cause
the context-switch and the same problem.

> What to do?
> 
> callout_stop already detects this situation and returns 0. As far as I
> understand the way to handle this is not to free the callout memory in
> filt_timerdetach() when callout_stop() returns 0, but let the callout be
> called. filt_timerexpire() should detect this situation and simply free
> the memory and return. Is this a possible solution? (Actually this
> requires some work, because the knote pointer that the filt_timerexpire()
> gets is probably also gone).

Unfortunately I do not think this is simply a memory-management issue.
The essential logical problem is that the timeout function is being
called after it has been removed. It may try to reference data structures
that have been freed with the expectation that the timeout function will
no longer be called.

I didn't think of it in my original post, but perhaps we need a 
"thissoftcheck" pointer that works analogously to "nextsoftcheck",
except that instead of being advanced to the next entry in the queue,
it is simply zeroed out when the entry is removed. softclock() could
detect this pointer being zeroed out just before it goes to call the
callout or timeout function and skip that invocation if that is the
case.

This is definitely not a solution in the CALLOUT_MPSAFE case, however,
because it would make no sense to try to verify this pointer in the
unprotected area between the spin lock being dropped and the sleep
lock being picked up. In the !CALLOUT_MPSAFE case, we know what the
sleep mutex would be -- it is always Giant -- and so we can test the
pointer after that point.

I am still not certain I am thinking clearly about this.

Thoughts?

Eric