Date:      Mon, 11 Oct 2004 15:34:58 -0500 (EST)
From:      Sam <sah@softcardsystems.com>
To:        Andre Oppermann <andre@freebsd.org>
Cc:        freebsd-arch@freebsd.org
Subject:   Re: sys/net/netisr.c
Message-ID:  <Pine.LNX.4.60.0410111456320.24074@athena>
In-Reply-To: <4165ADC2.70D2F0C1@freebsd.org>
References:  <Pine.LNX.4.60.0410071517280.27949@athena> <4165ADC2.70D2F0C1@freebsd.org>

> Sam wrote:
>>
>> Hello -
>>
>> I think I've found a bug in -- and have a question about
>> the overall stability of -- sys/net/netisr.c (5.2.1 branch).
>
> This particular bug has already been fixed in 5.3 and 6.0-current.

This bug is not fixed in 5.3-beta7 or -current.  Unloading a
module that unregisters a netisr leaves the queue pointer in that
netisr structure dangling, so the next received frame is queued
through it and page faults.
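
To make the failure mode concrete, here is a minimal userland sketch of
the register/unregister/enqueue sequence.  It is a model only, not the
kernel code: struct pktqueue, struct isr_slot, slot_register,
slot_unregister, slot_enqueue and noop_handler are all hypothetical
stand-ins, and it assumes the enqueue path drops a frame when it finds
a NULL queue pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real netisr.c. */
struct pktqueue {
    int len;                      /* number of frames currently queued */
};

struct isr_slot {
    void (*handler)(void *);      /* protocol handler, NULL when unused */
    struct pktqueue *queue;       /* queue owned by the registering module */
};

#define NSLOTS 32
static struct isr_slot slots[NSLOTS];

/* Trivial handler so callers have something to register. */
static void noop_handler(void *frame)
{
    (void)frame;
}

static void slot_register(int num, void (*h)(void *), struct pktqueue *q)
{
    slots[num].handler = h;
    slots[num].queue = q;
}

static void slot_unregister(int num)
{
    slots[num].handler = NULL;
    if (slots[num].queue != NULL)
        slots[num].queue->len = 0;   /* stands in for IF_DRAIN() */
    slots[num].queue = NULL;         /* the step the patch adds */
}

/* Returns 1 if the frame was queued, 0 if it was dropped. */
static int slot_enqueue(int num)
{
    struct pktqueue *q = slots[num].queue;

    if (q == NULL)                   /* assumed drop-on-NULL behaviour */
        return 0;
    q->len++;
    return 1;
}
```

Without the final assignment in slot_unregister, slot_enqueue would
still follow the old pointer after the module's memory is gone, which
is the page fault described above.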

> You should do your development on either RELENG_5 or MAIN.  The
> 5.2.1 branch was only a technology demo and the areas you are
> concerned with have changed significantly (read: really a lot).
>
> RELENG_5 (5.3) will be the next stable branch which features
> future binary compatibility within further 5.x releases.
>
> --
> Andre
>
>
>> My AoE module calls netisr_register on load, netisr_unregister
>> on unload.  Netisr_unregister fails to clear the ni->ni_queue
>> pointer and the next received frame gets queued up to a page
>> fault.  Pretty easy to fix:
>>
>> --- src/sys/net/netisr.c        Sat Nov  8 17:28:39 2003
>> +++ src2/sys/net/netisr.c       Thu Oct  7 15:03:39 2004
>> @@ -103,6 +103,7 @@
>>          ni->ni_handler = NULL;
>>          if (ni->ni_queue != NULL)
>>                  IF_DRAIN(ni->ni_queue);
>> +       ni->ni_queue = NULL;
>>   }
>>
>>   struct isrstat {
>>
>> Looking at the code, though, I don't see why I can't
>> cause something just as ugly to happen anyway.  Suppose
>> the following: cpu0 starts processing an inbound frame
>> while cpu1 unloads module (calling netisr_unregister).
>> It *seems* possible for cpu0 to get a pointer to the
>> queue, then cpu1 unload the module completely, causing
>> cpu0 to page fault on the queue address.
>>
>> I don't claim to understand the context in which
>> netisr_dispatch is called, so perhaps I'm off base,
>> but shouldn't there be a mutex protecting against this?
>>
>> Please prove me wrong.
>>
>> Sam
>>
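
The race raised in the quoted message can be modelled in userland with
a pthread mutex.  This is only a sketch of the idea that one lock
covers both the pointer update and every use of the pointer; it is not
how netisr.c is written, a real kernel fix might well use a different
primitive, and isr_lock, isr_queue, struct frameq and the guarded_*
functions are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical queue type standing in for the module's frame queue. */
struct frameq {
    int len;
};

static pthread_mutex_t isr_lock = PTHREAD_MUTEX_INITIALIZER;
static struct frameq *isr_queue;     /* NULL when no module is registered */

static void guarded_register(struct frameq *q)
{
    pthread_mutex_lock(&isr_lock);
    isr_queue = q;
    pthread_mutex_unlock(&isr_lock);
}

/*
 * Once this returns, no dispatcher can still be using the old queue,
 * because dispatchers only touch it while holding isr_lock.
 */
static void guarded_unregister(void)
{
    pthread_mutex_lock(&isr_lock);
    isr_queue = NULL;
    pthread_mutex_unlock(&isr_lock);
}

/* Returns 1 if a frame was queued, 0 if it was dropped. */
static int guarded_dispatch(void)
{
    int queued = 0;

    pthread_mutex_lock(&isr_lock);
    if (isr_queue != NULL) {         /* pointer is read and used under the lock */
        isr_queue->len++;
        queued = 1;
    }
    pthread_mutex_unlock(&isr_lock);
    return queued;
}
```

In this model the unloading thread calls guarded_unregister before it
frees the queue, so a dispatcher either finishes with the queue first
or sees NULL and drops the frame.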



