Date: Wed, 19 Jul 2000 09:59:22 -0400 (EDT)
From: Bosko Milekic <bmilekic@dsuper.net>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: net@FreeBSD.ORG
Subject: Re: kern/19866: The mbuf subsystem refcount stuff.
Message-ID: <Pine.BSF.4.21.0007190941260.40059-100000@jehovah.technokratis.com>
In-Reply-To: <20000718184511.O13979@fw.wintelcom.net>

On Tue, 18 Jul 2000, Alfred Perlstein wrote:

> moved over to -net.
[...]
> Ok here's an idea that I was tossing around:
>
> you have a freelist of these guys, one for each mbuf header,
> you can make less and deal with failure by sleeping or failing
> as you want, but for the sake of example one is available for
> each mbuf at any given time.
>
> struct mclrefcnt {
>     struct mclrefcnt *mextr_next; /* XXX: use list macros :) */
>     atomic_t mextr_refcnt;
> };

  Okay, following our discussion yesterday, I can understand why your
  suggestion is valid, especially given the brief SMP outline and the
  future changes that are to come.

  There is something I thought of recently, though, after starting to
  modify and implement this: the mclrefcnt/mextrefcnt/whatever we want
  to call it sits on a singly linked list, and additions and removals
  are made at the head. Therefore, one will have to splimp(), or lock
  out the other CPUs in the SMP case, whenever tampering with this list,
  and that will happen during certain reference increments.

  It's not as bad as the linked-list approach, which has to do this for
  every single additional reference. Here it is only needed for the first
  reference, which will probably come from the allocation macro, and that
  will be under splimp() (or the SMP equivalent multiple-CPU lock) anyway.
  Correct?

  Also, I'm curious at this point as to how some of this is done in
  BSD/OS, but I don't have access to the source. Any pointers,
  explanations, or brief outlines? I'm mainly curious about the SMP side
  versus the mbuf pools: whether they are per CPU, etc.

  Alfred, now that I think about it, I would eventually like to try the
  per-CPU list idea, to see how well it scales on an SMP machine under
  relatively heavy network load, and I'll probably want to do this before
  I finalize the outline for freeing the resources of mbuf-used pages.

> and a pointer to one in each mbuf.
>
> when you want to make a copy your code looks something like this:
>
> /* copying m into n */
>
> if (m->rp == NULL && n->rp == NULL) {
>     m->rp = n->rp = mclrefcnt_alloc();
> } else if (m->rp == NULL) {
>     m->rp = n->rp;
> } else if (n->rp == NULL) {
>     n->rp = m->rp;
> } else {
>     mclrefcnt_free(n->rp);
>     n->rp = m->rp;
> }
> atomic_inc(m->rp.mextr_refcnt);
>
>
> /* freeing m */
>
> /* x must get the value I reduced it to */
> x = atomic_dec_and_fetch(m->rp.mextr_refcnt);
> if (x == 0) {
>     /* do extfree callback */
> } else {
>     m->rp = NULL;
> }
> /* free mbuf header */
>
> Are there problems you see with that?

  Not by just glancing at it, although the mbuf copying code above can be
  more concise: you know you're copying, say, m->m_ext into n->m_ext and
  incrementing the ref count, so it's not really that tough, as far as I
  can see right now. To "copy m into n" you just do something like:

      n->m_ext = m->m_ext;
      n->m_ext->mextrefcnt++;

  ...and you're on your way. As we discussed, you don't have to worry
  about something else dropping a reference and consequently freeing the
  mextrefcnt node and the ext_buf, because here you have control of both
  m and n, both of which refer to the same ext_buf, so the refcnt cannot
  drop below 1 while you atomically increment it.

  You mentioned a race condition in the free code above yesterday, but I
  don't recall exactly what it was. If the refcnt reaches zero there,
  though, you'll have to free the mextrefcnt/mclrefcnt/whatever structure
  too, at least back to its respective free list.

  Note that if, in the SMP case, we decide to split up the free lists,
  this is one list that will have to remain global: mbufs allocated from
  one CPU's free list may refer to the same ext_buf as mbufs allocated
  from another CPU's free list.

> --
> -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
> "I have the heart of a child; I keep it in a jar on my desk."

  Cheers,
  Bosko.
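
  The scheme discussed above can be sketched roughly as follows. This is
  only an illustration under stated assumptions, not the FreeBSD mbuf
  code: every name here (ext_ref, mbuf_sketch, ref_alloc, ext_attach,
  ext_share, ext_release, list_lock) is hypothetical, C11 <stdatomic.h>
  stands in for atomic_t, and list_lock()/list_unlock() are no-op
  placeholders for splimp()/splx() or an SMP lock. It shows the free list
  being touched only on the first reference and the last release, while
  additional references are plain atomic increments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical reference-count node, one per shared external buffer. */
struct ext_ref {
    struct ext_ref *next;   /* free-list link, meaningful only while free */
    atomic_int      refcnt; /* number of mbufs sharing the buffer */
};

/* Hypothetical, heavily reduced stand-in for an mbuf header. */
struct mbuf_sketch {
    struct ext_ref *rp;      /* shared counter, NULL if no external storage */
    void           *ext_buf; /* shared external buffer */
};

#define NREFS 4
static struct ext_ref  ref_pool[NREFS];
static struct ext_ref *ref_free_head; /* one global free list, not per CPU */
static int             ext_frees;     /* counts simulated extfree callbacks */

/* Placeholders for splimp()/splx() or an SMP lock; no-ops in this sketch. */
static void list_lock(void) {}
static void list_unlock(void) {}

/* Return a node to the head of the global free list. */
static void ref_free(struct ext_ref *r)
{
    list_lock();
    r->next = ref_free_head;
    ref_free_head = r;
    list_unlock();
}

/* Put every node of the static pool on the free list. */
static void ref_pool_init(void)
{
    for (size_t i = 0; i < NREFS; i++)
        ref_free(&ref_pool[i]);
}

/* Take a node from the head of the free list; NULL if the list is empty. */
static struct ext_ref *ref_alloc(void)
{
    list_lock();
    struct ext_ref *r = ref_free_head;
    if (r != NULL)
        ref_free_head = r->next;
    list_unlock();
    if (r != NULL)
        atomic_store(&r->refcnt, 1); /* the first reference */
    return r;
}

/* First reference: the only increment path that needs the list lock. */
static int ext_attach(struct mbuf_sketch *m, void *buf)
{
    m->rp = ref_alloc();
    if (m->rp == NULL)
        return -1;
    m->ext_buf = buf;
    return 0;
}

/* "Copy m into n": the caller holds m, so refcnt cannot reach 0 here. */
static void ext_share(struct mbuf_sketch *n, const struct mbuf_sketch *m)
{
    n->ext_buf = m->ext_buf;
    n->rp = m->rp;
    atomic_fetch_add(&n->rp->refcnt, 1);
}

/* Drop m's reference; the last holder frees the buffer and the node. */
static void ext_release(struct mbuf_sketch *m)
{
    struct ext_ref *r = m->rp;

    m->rp = NULL;
    m->ext_buf = NULL;
    /* atomic_fetch_sub returns the previous value: 1 means we were last. */
    if (atomic_fetch_sub(&r->refcnt, 1) == 1) {
        ext_frees++; /* the extfree callback would run here */
        ref_free(r);
    }
}
```

  A caller would run ref_pool_init() once, ext_attach() when external
  storage is first attached, ext_share() for each copy, and ext_release()
  when an mbuf is freed. Deciding "last holder" from the value returned
  by the atomic decrement itself, rather than from a separate read, is
  what keeps two concurrent releases from both (or neither) freeing the
  node.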

--
 Bosko Milekic * Voice/Mobile: 514.865.7738 * Pager: 514.921.0237
    bmilekic@technokratis.com * http://www.technokratis.com/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message