Date: Tue, 2 Jul 2002 00:10:50 -0700 From: Luigi Rizzo <rizzo@icir.org> To: net@freebsd.org Subject: Mbuf allocator performance (was Should we keep a cache of mbuf+cluster ready for use ?) Message-ID: <20020702001050.B2250@iguana.icir.org>
next in thread | raw e-mail | index | archive | help
[Bcc to -current as relevant there]
As a followup to my question below, i did some simple experiment
on a -current box acting as a router, using two "em" cards and
DEVICE_POLLING (just for stability).
The whole of the code is basically below -- the receiving
side tries to grab the mbuf+cluster from the free pool first,
and falls back to the standard method on failure; the sending
side tries to attach the freed buffer to the free pool
if it is the case.
There is a simplification here with respect to what could
go into an m_freem()/mgethdr_cluster() pair because the driver
is already holding a lock, on Giant in this case, so you do not need
further locking. Still, to come to the data points:
CURRENT STABLE (*)
fastfwd=1, pool_max=0 276kpps 365 kpps
fastfwd=1, pool_max=50 355kpps 383 kpps
fastfwd=0, pool_max=0 195 pps 142 kpps
fastfwd=0, pool_max=50 227 kpps 146 kpps
(*) This version of STABLE that I am using for comparison has some
proprietary optimizations which make it a bit faster than normal.
However it still uses the old ipfw code, which is invoked when
fastfwd=0, and is significantly slower than the new one.
Now this really seems to call for adding this interface into the
mbuf subsystem. I believe we have to find a name for the allocator
(the deallocator might well go into m_freem(), depending on how we
implement the locking) and whether it makes sense to lock
mgethdr_cluster() under per-cpu locks or under Giant, or even let
the caller make sure that it holds the proper lock before trying
to invoke the procedure (as i expect the "producers" or "consumers"
of these pairs to be located in the network stack, chances are that
they already hold a lock on Giant).
cheers
luigi
The code:
struct mbuf *em_pool;
static int em_pool_max = 50;
SYSCTL_INT(_hw, OID_AUTO, em_pool_max, CTLFLAG_RW,
&em_pool_max,0,"max size of mbuf pool");
static int em_pool_now;
SYSCTL_INT(_hw, OID_AUTO, em_pool_now, CTLFLAG_RD,
&em_pool_now,0,"Current size of mbuf pool");
... in em_get_buf() ..
if (em_pool) {
mp = em_pool;
em_pool = mp->m_nextpkt;
em_pool_now--;
goto have_it;
}
... in em_clean_transmit_interrupts() ...
if ((m = tx_buffer->m_head)) {
if (em_pool_now < em_pool_max &&
m->m_next == NULL &&
m->m_flags & M_EXT &&
M_WRITABLE(m) ) {
m->m_nextpkt = em_pool;
em_pool = m;
em_pool_now++;
} else
m_freem(m);
tx_buffer->m_head = NULL;
}
On Sat, Jun 29, 2002 at 02:53:03PM -0700, Luigi Rizzo wrote:
> Hi,
> during some experiments i was doing recently, i noticed that there
> is a significant improvement in the forwarding speed (especially
> at very high speeds) if we keep a small pool of mbuf+cluster
> ready for use. This is because most network drivers do something
> like this
>
> MGETHDR(m_new, M_DONTWAIT, MT_DATA);
> if (m_new == NULL)
> return(ENOBUFS);
>
> MCLGET(m_new, M_DONTWAIT);
> if (!(m_new->m_flags & M_EXT)) {
> m_freem(m_new);
> return(ENOBUFS);
> }
>
> when replenishing the receive buffers, and both macros are quite
> long even if there are available blocks in the free lists. We can
> store buffers of this form when/if they are released with some code
> like this:
>
> if (my_pool_count < my_pool_max && m->m_next == NULL &&
> (m->m_flags & M_EXT) && M_EXT_WRITABLE(m) ) {
> m->m_nextpkt = my_pool;
> m->m_data = ->m_ext.ext_buf;
> m->m_len = m->m_pkthdr.len = MCLBYTES;
> my_pool = m;
> my_pool_now++;
> } else {
> ... rest of m_freem() ...
> }
>
> and save a lot of overhead (we just need to reset m_data and
> m_len and m_pkthdr.len) when someone wants to allocate them.
>
> Is there interest in committing some code like this to
> mbuf.h and maybe uipc_mbuf*.c ?
>
> cheers
> luigi
>
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20020702001050.B2250>
