Date:      Mon, 17 Mar 2003 09:44:16 +0200
From:      "Petri Helenius" <pete@he.iki.fi>
To:        "Terry Lambert" <tlambert2@mindspring.com>
Cc:        <freebsd-current@FreeBSD.ORG>
Subject:   Re: mbuf cache
Message-ID:  <048601c2ec59$0696dd30$932a40c1@PHE>
References:  <0ded01c2e295$cbef0940$932a40c1@PHE> <20030304164449.A10136@unixdaemons.com> <0e1b01c2e29c$d1fefdc0$932a40c1@PHE> <20030304173809.A10373@unixdaemons.com> <0e2b01c2e2a3$96fd3b40$932a40c1@PHE> <20030304182133.A10561@unixdaemons.com> <0e3701c2e2a7$aaa2b180$932a40c1@PHE> <20030304190851.A10853@unixdaemons.com> <001201c2e2ee$54eedfb0$932a40c1@PHE> <20030307093736.A18611@unixdaemons.com> <008101c2e4ba$53d875a0$932a40c1@PHE> <3E68ECBF.E7648DE8@mindspring.com> <3E70813B.7040504@he.iki.fi> <3E750D52.FFA28DA2@mindspring.com>

> You can get to this same point in -CURRENT, if you are using up to
> date sources, by enabling direct dispatch, which disables NETISR.
> This will help somewhat more than polling, since it will remove the
> normal timer latency between receipt of a packet, and processing of
> the packet through the network stack.  This should reduce overall
> pool retention time for individual mbufs that don't end up on a
> socket so_rcv queue.  Because interrupts on the card are not
> acknowledged until the code runs to completion, this also tends to
> regulate interrupt load.
> 
My sources seem to be a few days older than when this went in;
I will update and try it out.
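
To make the retention argument above concrete, here is a toy model, taking
the "timer latency" description at face value. Everything in it is an
assumption for illustration: the function name, the 1000 us tick, the 5 us
processing cost and the arrival times are made up, and it ignores queueing
behind other packets. It is not how the kernel is structured.

```python
# Toy model: how long an mbuf stays allocated before the stack is done
# with it. All numbers are invented; nothing here is measured.

def retention_times(arrivals_us, tick_us=None, proc_us=5):
    """Return per-packet pool retention in microseconds.

    tick_us=None stands in for direct dispatch: the stack runs right
    after the interrupt. Otherwise a packet waits until the next tick
    boundary at or after its arrival, standing in for the deferred
    NETISR run.
    """
    out = []
    for t in arrivals_us:
        if tick_us is None:
            start = t
        else:
            # Round the arrival time up to the next tick boundary.
            start = -(-t // tick_us) * tick_us
        out.append(start + proc_us - t)
    return out

arrivals = [100, 350, 900, 1000]   # hypothetical arrival times (us)
queued = retention_times(arrivals, tick_us=1000)
direct = retention_times(arrivals)
print("queued:", queued)
print("direct:", direct)
```

In this model the deferred case holds each mbuf for the wait plus the
processing time, while direct dispatch holds it only for the processing
time, which is the pool retention effect described in the quoted text.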

> This also has the desirable side effect that stack processing will
> occur on the same CPU as the interrupt processing occurred.  This
> avoids inter-CPU memory bus arbitration cycles, and ensures that
> you won't engage in a lot of unnecessary L1 cache busting.  Hence
> I prefer this method to polling.
> 
Is there anywhere I could read up on the associated overhead, and on how
this works out in the worst case, where data is DMAed into memory,
read by CPU1, then by CPU2, and then discarded? And are there any
approaches that could be taken to optimize this?
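
To put rough numbers on the worst case I mean, a back-of-the-envelope
sketch. The 64-byte cache line, the 128 bytes of headers touched and the
100000 packets per second are all assumptions, and the model counts only
one line fill per CPU per line; it ignores prefetching, write-backs and the
details of the coherency protocol.

```python
import math

CACHE_LINE = 64  # bytes; assumed line size, varies by CPU

def line_fills(bytes_touched, cpus_touching):
    """Cache line fills per packet if each CPU that touches the data
    has to pull every touched line across the bus once."""
    lines = math.ceil(bytes_touched / CACHE_LINE)
    return lines * cpus_touching

def fills_per_second(pps, bytes_touched, cpus_touching):
    """Line fills per second at a given packet rate."""
    return pps * line_fills(bytes_touched, cpus_touching)

# Hypothetical load: 100000 packets/s, 128 bytes of headers examined.
same_cpu = fills_per_second(100_000, 128, cpus_touching=1)
cross_cpu = fills_per_second(100_000, 128, cpus_touching=2)
print("same CPU: ", same_cpu, "fills/s")
print("cross CPU:", cross_cpu, "fills/s")
```

Under these assumptions the packet that is read by CPU1 and then by CPU2
before being discarded costs twice the line fills of one handled on a
single CPU; the real penalty depends on the hardware.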
> 
> You will get much better load capacity scaling out of two cheaper
> boxes, if you implement correctly, IMO.

Synchronization of the unformatted data can probably never be as good as
what you get by optimizing the system for your particular case. I agree it
should be better than it is now, but it does not really seem to be improving
(unless you consider the EV7 and Opteron approaches better than the current
Intel approach).

Pete

