From: Sam
Date: Mon, 11 Oct 2004 15:34:58 -0500 (EST)
To: Andre Oppermann
Cc: freebsd-current@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: sys/net/netisr.c
In-Reply-To: <4165ADC2.70D2F0C1@freebsd.org>

> Sam wrote:
>>
>> Hello -
>>
>> I think I've found a bug in -- and have a question about
>> the overall stability of -- sys/net/netisr.c (5.2.1 branch).
>
> This particular bug has already been fixed in 5.3 and 6.0-current.

This bug is not fixed in 5.3-beta7 or -current.  Unloading a module
that unregisters a netisr leaves the queue pointer for that netisr
structure page fault ready.

> You should do your development on either RELENG_5 or MAIN.  The
> 5.2.1 branch was only a technology demo and the areas you are
> concerned with have changed significantly (read: really a lot).
>
> RELENG_5 (5.3) will be the next stable branch which features
> future binary compatibility within further
> 5.x releases.
>
> --
> Andre
>
>
>> My AoE module calls netisr_register on load, netisr_unregister
>> on unload.  Netisr_unregister fails to clear the ni->ni_queue
>> pointer and the next received frame gets queued up to a page
>> fault.  Pretty easy to fix:
>>
>> --- src/sys/net/netisr.c        Sat Nov  8 17:28:39 2003
>> +++ src2/sys/net/netisr.c       Thu Oct  7 15:03:39 2004
>> @@ -103,6 +103,7 @@
>>          ni->ni_handler = NULL;
>>          if (ni->ni_queue != NULL)
>>                  IF_DRAIN(ni->ni_queue);
>> +        ni->ni_queue = NULL;
>>  }
>>
>>  struct isrstat {
>>
>> Looking at the code, though, I don't see why I can't
>> cause something just as ugly to happen anyway.  Suppose
>> the following: cpu0 starts processing an inbound frame
>> while cpu1 unloads module (calling netisr_unregister).
>> It *seems* possible for cpu0 to get a pointer to the
>> queue, then cpu1 unload the module completely, causing
>> cpu0 to page fault on the queue address.
>>
>> I don't claim to understand the context in which
>> netisr_dispatch is called, so perhaps I'm off base,
>> but shouldn't there be a mutex protecting against this?
>>
>> Please prove me wrong.
>>
>> Sam
>>
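
To make the cpu0/cpu1 scenario above concrete, here is a rough
userland sketch of the kind of locking being asked about.  It is not
the kernel code and not a proposed patch: struct isr_slot,
struct frame_queue, slot_register, slot_unregister, slot_dispatch and
discard_frame are all made-up names, a pthread mutex stands in for
whatever primitive the kernel would actually use, and setting len to 0
stands in for IF_DRAIN.  The only point is that the unregister path
and the dispatch path take the same lock, so a dispatcher cannot be
holding the queue pointer once unregister has returned.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's interface queue. */
struct frame_queue {
    int len;                        /* frames currently queued */
};

/* Hypothetical stand-in for one netisr slot. */
struct isr_slot {
    pthread_mutex_t lock;           /* guards handler and queue */
    void (*handler)(int frame);
    struct frame_queue *queue;
};

static struct isr_slot slot = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Example handler that ignores its frame; illustration only. */
void discard_frame(int frame)
{
    (void)frame;
}

void slot_register(void (*handler)(int), struct frame_queue *q)
{
    assert(handler != NULL && q != NULL);
    pthread_mutex_lock(&slot.lock);
    slot.handler = handler;
    slot.queue = q;
    pthread_mutex_unlock(&slot.lock);
}

/* Same shape as the patched unregister: drain, then clear the pointer. */
void slot_unregister(void)
{
    pthread_mutex_lock(&slot.lock);
    slot.handler = NULL;
    if (slot.queue != NULL)
        slot.queue->len = 0;        /* stand-in for IF_DRAIN */
    slot.queue = NULL;
    pthread_mutex_unlock(&slot.lock);
}

/* Returns 1 if the frame was queued, 0 if nothing is registered. */
int slot_dispatch(int frame)
{
    int queued = 0;

    (void)frame;                    /* payload is irrelevant to the sketch */
    pthread_mutex_lock(&slot.lock);
    if (slot.queue != NULL) {       /* pointer is only read under the lock */
        slot.queue->len++;
        queued = 1;
    }
    pthread_mutex_unlock(&slot.lock);
    return queued;
}
```

With this shape, a module could free its queue after slot_unregister
returns, because any dispatcher that saw the old pointer has already
dropped the lock.  Whether taking a lock on every inbound frame is
acceptable in the real netisr path is a separate question that this
sketch does not try to answer.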