Date: Thu, 4 Jul 2002 16:59:02 -0400 (EDT)
From: Andrew Gallatin <gallatin@cs.duke.edu>
To: Bosko Milekic <bmilekic@unixdaemons.com>
Cc: "Kenneth D. Merry" <ken@kdm.org>, current@FreeBSD.ORG, net@FreeBSD.ORG
Subject: virtually contig jumbo mbufs (was Re: new zero copy sockets snapshot)
Message-ID: <15652.46870.463359.853754@grasshopper.cs.duke.edu>
In-Reply-To: <20020620134723.A22954@unixdaemons.com>
References: <20020618223635.A98350@panzer.kdm.org> <xzpelf3ida1.fsf@flood.ping.uio.no> <20020619090046.A2063@panzer.kdm.org> <20020619120641.A18434@unixdaemons.com> <15633.17238.109126.952673@grasshopper.cs.duke.edu> <20020619233721.A30669@unixdaemons.com> <15633.62357.79381.405511@grasshopper.cs.duke.edu> <20020620114511.A22413@unixdaemons.com> <15634.534.696063.241224@grasshopper.cs.duke.edu> <20020620134723.A22954@unixdaemons.com>
Bosko Milekic writes:
 > > One question.  I've observed some really anomalous behaviour under
 > > -stable with my Myricom GM driver (2Gb/s + 2Gb/s link speed, dual
 > > 1GHz PIII).  When I use 4K mbufs for receives, the best speed I see
 > > is about 1300Mb/sec.  However, if I use private 9K physically
 > > contiguous buffers I see 1850Mb/sec (iperf TCP).
 > >
 > > The obvious conclusion is that there's a lot of overhead in setting
 > > up the DMA engines, but that's not the case; we have a fairly quick
 > > chain DMA engine.  I've provided a "control" by breaking my
 > > contiguous buffers down into 4K chunks so that I do the same number
 > > of DMAs in both cases, and I still see ~1850Mb/sec for the 9K
 > > buffers.
 > >
 > > A coworker suggested that the problem was that when doing copyouts
 > > to userspace, the PIII was doing speculative reads and loading the
 > > cache with the next page.  However, we then start copying from a
 > > totally different address using discontiguous buffers, so we
 > > effectively take 2x the number of cache misses we'd need to.  Does
 > > that sound reasonable to you?  I need to try malloc'ing virtually
 > > contiguous and physically discontiguous buffers & see if I get the
 > > same (good) performance...
 >
 >   I believe that the Intel chips do "virtual page caching" and that
 > the logic that does the virtual -> physical address translation sits
 > between the L2 cache and RAM.  If that is indeed the case, then your
 > idea of testing with virtually contiguous pages is a good one.
 >   Unfortunately, I don't know if the PIII is doing speculative
 > cache-loads, but it could very well be the case.  If it is, and if in
 > fact the chip does caching based on virtual addresses, then providing
 > it with virtually contiguous address space may yield better results.
 > If you try this, please let me know.  I'm extremely interested in
 > seeing the results!

contigmalloc'ed private jumbo mbufs (same as bge, if_ti, etc.):

% iperf -c ugly-my -l 32k -fm
------------------------------------------------------------
Client connecting to ugly-my, TCP port 5001
TCP window size: 0.2 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.3 port 1031 connected with 192.168.1.4 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  2137 MBytes  1792 Mbits/sec

malloc'ed, physically discontiguous private jumbo mbufs:

% iperf -c ugly-my -l 32k -fm
------------------------------------------------------------
Client connecting to ugly-my, TCP port 5001
TCP window size: 0.2 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.3 port 1029 connected with 192.168.1.4 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  2131 MBytes  1788 Mbits/sec

So I'd be willing to believe that the 4Mb/sec loss was due to the extra
overhead of setting up the two additional DMAs.

So it looks like this idea would work.

Drew

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message
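
For reference, a minimal sketch of the two allocation strategies being
compared above.  This is not taken from the GM driver: the function names
(jumbo_alloc_contig, jumbo_alloc_virt), the JUMBO_LEN value, and the use of
M_DEVBUF are illustrative assumptions, and the step that attaches the buffer
to an mbuf as external storage (MEXTADD) is omitted because its exact
signature differs between FreeBSD branches.

/*
 * Hedged sketch, not driver code: the two ways of getting a 9K receive
 * buffer discussed above.  The contigmalloc(9) prototype and the header
 * it lives in vary between FreeBSD branches, so treat this as a sketch
 * of the idea rather than a drop-in patch.
 */
#include <sys/param.h>
#include <sys/malloc.h>

#define JUMBO_LEN	9018	/* 9K jumbo frame; illustrative value */

/* Physically contiguous jumbo buffer (bge/if_ti style). */
static void *
jumbo_alloc_contig(void)
{
	/*
	 * One physically contiguous chunk: a single DMA segment, but the
	 * allocation can fail once physical memory becomes fragmented.
	 */
	return (contigmalloc(JUMBO_LEN, M_DEVBUF, M_NOWAIT,
	    0,			/* lowest acceptable physical address */
	    0xffffffff,		/* highest acceptable physical address */
	    PAGE_SIZE,		/* alignment */
	    0));		/* no boundary restriction */
}

/* Virtually contiguous, physically discontiguous jumbo buffer. */
static void *
jumbo_alloc_virt(void)
{
	/*
	 * A kernel malloc(9) of more than a page returns KVA that is
	 * virtually contiguous while the backing pages need not be
	 * adjacent: the NIC gets one DMA descriptor per physical
	 * segment, but copyout() to userland still walks one linear
	 * region, which is the case Drew's numbers favour.
	 */
	return (malloc(JUMBO_LEN, M_DEVBUF, M_NOWAIT));
}

Per the iperf numbers above, the second form costs almost nothing on the
wire (1788 vs. 1792 Mbits/sec) while avoiding the need for physically
contiguous allocations at run time.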