Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 18 Jul 2000 16:00:37 -0700
From:      Alfred Perlstein <bright@wintelcom.net>
To:        Bosko Milekic <bmilekic@dsuper.net>
Cc:        David Malone <dwmalone@maths.tcd.ie>, freebsd-bugs@freebsd.org, iedowse@maths.tcd.ie, sheldonh@freebsd.org, jlemon@freebsd.org, wollman@freebsd.org
Subject:   Re: kern/19866: The mbuf subsystem refcount stuff.
Message-ID:  <20000718160033.H13979@fw.wintelcom.net>
In-Reply-To: <Pine.BSF.4.21.0007181551160.36727-100000@jehovah.technokratis.com>; from bmilekic@dsuper.net on Tue, Jul 18, 2000 at 04:05:38PM -0400
References:  <200007182016.aa47290@salmon.maths.tcd.ie> <Pine.BSF.4.21.0007181551160.36727-100000@jehovah.technokratis.com>

next in thread | previous in thread | raw e-mail | index | archive | help
* Bosko Milekic <bmilekic@dsuper.net> [000718 13:03] wrote:
> 
> On Tue, 18 Jul 2000, David Malone wrote:
> 
> > As far as I can make out, using the linked list to track mbuf
> > references is not an unreasonable idea. NetBSD have been using it
> > since 1997, see revision 1.23 of:
> > 
> > http://www.FreeBSD.org/cgi/cvsweb.cgi/syssrc/sys/sys/mbuf.h?cvsroot=netbsd
> > 
> > for details of when it was introduced. That's not to say there
> > isn't a better system, but we have evidence this system can work.
> 
> 	Yes, there is absolutely NO problem with it as far as I can tell.
>   Alfred's argument to not introducing it is that you would need to
>   splimp() during MEXT_ADD_REF and MEXT_REM_REF; which is obviously true
>   but not significant, as removing and adding references -- even with the
>   old counter -- required that. The reason is that if you look at it from a
>   general side, and consider that you are incrementing the reference
>   counter for an isolated mbuf's ext_obj, and that ext_obj is referenced
>   by another (un-related) mbuf, you may be breaking things if you are not
>   under splimp(), even if the incrementing is an atomic operation. The
>   reason is that while computing the index into the counter array an
>   interrupt may occur which may end up dereferencing the counter by 1 and
>   if previously that counter was at 1, then the ext_obj/ext_buf/whatever
>   you want to call it will be freed and attached to the free list and you
>   will end up incrementing the ref count to 1 again even though the buffer
>   is now sitting on the free list.

That never happens.  If you have an mbuf it's _yours_ no one will
deref it out from under you otherwise it wouldn't work at all. You
can NOT pass an mbuf you some subsystem and then free it, you must
pass a _copy_ of it if you want to use it again later (attach
another mbuf header and increase the ref count), therefor the
refcount will be 1 higher and even if it's free()'d at intr you'll
never go to 0.

>   	Conclusion: you have to splimp() while adding or removing a
>   reference. Maybe not in the GENERAL case, but it doesn't matter, as it's
>   not always the GENERAL case that will break things. And even if you
>   isolate the calculation of the index and store it into a variable, thus
>   probably making the incrementing more or less atomic, you'll still have
>   to make sure that the buffer isn't being freed when you are doing that,
>   because if you end up incrementing from 0 to 1 for an ext_buf which *you*
>   haven't allocated (i.e. there has to be another referrer), then you're
>   bound to run into serious trouble.
> 
>   	Basically, as usual (sigh), I am open to more modifications, but the
>   impression that I've gotten from recent discussion with Alfred is that
>   this is worthless, "will have to be changed anyway" (which I doubt, and
>   if which is the case, should be done ONLY to accomodate the SMP stuff),
>   and that it should not be committed at this point.
>   	I personally don't have a problem with that, as I value Alfred's
>   opinion (despite the fact that I disagree with it) and would think that
>   he probably has good reason for it, as he's on top of SMP things (as far
>   as I can tell), and not probably, but certainly, knows better. :-)

The reason is to hopefully avoid locking the entire pool or locking
3 objects to make a single cluster copy.  I'm not saying that you 
have to make 3 locks, but I can't see it any other way:
  myself, backward, forward -> manip pointers

>  
>  	Anyway, the other issue is that, obviously, I have some other work in
>   progress in this area of the tree and I am forced to halt it not only
>   because I have to deal with work already done but not yet committed, but
>   also because I'm trying to understand the philosophy behind what is
>   wanted and what isn't.
> 
> > If this seems like a reasonable plan, I can work with Bosko to extract
> > step 1 from his patches.
> 
> 	It would be my pleasure to (again) work with you.

Actually, I would like to see it go in, the macro approach will
make it easy to move it over the pointer to refcount memory idea,
I'd just hoped that you would have done it my way.  Your work
will take us halfway there and be very benificial in the meanwhile
by reducing the amount of callbacks needed when sending clusters
as well as simplifying a lot of code and special cases.

Lastly, I'd like to apologize for my behavior on IRC, it was uncalled
for, this is some highly needed and quality work and I wanted to 
thank you for taking the time to do it.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20000718160033.H13979>