Date: Mon, 26 Aug 2013 18:06:36 +0200
From: Stefan Esser <se@freebsd.org>
To: Harald Schmalzbauer <h.schmalzbauer@omnilan.de>
Cc: FreeBSD Stable Mailing List <freebsd-stable@freebsd.org>
Subject: Re: if_em, legacy nic and GbE saturation
Message-ID: <521B7D0C.2010109@freebsd.org>
In-Reply-To: <521B1FD4.4050702@omnilan.de>
References: <521AFE7E.2040705@omnilan.de> <CAJ-VmokRNbDXC1Er6pxOWSLJs=DvCmbCfcViZ3K4Twxb9V5BKw@mail.gmail.com> <521B1FD4.4050702@omnilan.de>
On 26.08.2013 11:28, Harald Schmalzbauer wrote:
> Regarding Adrian Chadd's message of 26.08.2013 10:34 (localtime):
>> Hi,
>>
>> There are bus limits on how much data you can push over a PCI bus.
>> You can look around online to see what the 32/64-bit, 33/66 MHz PCI
>> throughput estimates are.
>>
>> It also changes massively if you use small versus large frames.
>>
>> The last time I tried it I couldn't hit GigE on PCI; I only managed
>> to get to around 350 Mbit/s doing TCP tests.
>
> Thanks, I'm roughly aware of the PCI bus limit, but I guess it should
> be good for almost GbE: 33*10^6*32 = 1056 Mbit/s, so allowing for
> overhead and other bus-blocking activity (nothing of significance is
> active on the PCI bus in this case), I'd expect at least 800 Mbit/s,
> which is what I get with jumbo frames.

But PCI bus throughput may be much lower than expected:

- The arbitration overhead is quite high, on the order of 0.2 to 0.3 us.
- Depending on device capabilities and chip-set configuration and
  features, there may be many more arbitration phases than one might
  expect.
- A cache-line flush is requested for data held in the CPU, unless the
  bus master uses special transfer commands to indicate that the full
  cache line will be invalidated within the requested transfer.

These overheads combined may reduce the effective PCI throughput to a
fraction of the nominal performance (1/3 to 1/4 for bursts of 16 bytes).

The "minimum grant" (MINGNT) value is the minimum burst length the
device wants (to avoid a buffer underrun/overrun due to too low
effective bandwidth); the "maximum latency" (MAXLAT) value corresponds
to the number of PCI clocks the device is willing to wait for the bus to
be granted (to avoid a buffer underrun/overrun while waiting for bus
access). The MAXLAT value can be used to calculate the maximum
arbitration unit for which no device is stalled longer than it allows.
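The effect of per-burst arbitration overhead can be sketched with a
little arithmetic. This is a rough model only, assuming the mid-point
0.25 us overhead per burst from the figures above on a 32-bit/33 MHz
bus; the exact numbers depend on the arbiter and chipset:

```python
# Rough model: each burst pays a fixed arbitration/address overhead
# (assumed ~0.25 us, the middle of the 0.2-0.3 us range above) on top
# of its data clocks on a 32-bit, 33 MHz PCI bus.

CLOCK_HZ = 33e6      # 33 MHz PCI clock
BUS_BYTES = 4        # 32-bit bus -> 4 bytes per data clock
OVERHEAD_US = 0.25   # assumed per-burst arbitration + address overhead

def effective_mbit_per_s(burst_bytes):
    """Effective throughput in Mbit/s for a given burst size."""
    data_us = (burst_bytes / BUS_BYTES) / CLOCK_HZ * 1e6
    return burst_bytes * 8 / (data_us + OVERHEAD_US)

peak = CLOCK_HZ * BUS_BYTES * 8 / 1e6   # nominal 1056 Mbit/s
for burst in (16, 64, 256):
    eff = effective_mbit_per_s(burst)
    print(f"{burst:4d}-byte bursts: {eff:6.0f} Mbit/s "
          f"({eff / peak:.0%} of nominal)")
```

For 16-byte bursts this works out to roughly 345 Mbit/s, about 1/3 of
the nominal 1056 Mbit/s, which matches both the fraction quoted above
and the ~350 Mbit/s Adrian reported in his TCP tests.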
MINGNT and MAXLAT of a device can be displayed with pciconf:

# pciconf -r -b pci0:1:0 0x3e:0x3f

(e.g., for bus 0, device 1, function 0.)

The PCI bus is "lost" whenever another device gets access to it, whether
the CPU or another PCI (or PCIe) device. Especially when simultaneously
sending and receiving packets with two Ethernet controllers, bus
arbitration will occur every 16 to 32 transfers (depending on bus
arbiter settings and the programmed MINGNT).

Regards, STefan
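The two bytes that pciconf prints are in units of 250 ns each (per the
PCI specification). A small helper like the following (the function name
and the register values are hypothetical) can turn them into
microseconds and PCI clocks at 33 MHz:

```python
# Interpret the raw MIN_GNT (offset 0x3e) and MAX_LAT (offset 0x3f)
# bytes read with: pciconf -r -b pci0:1:0 0x3e:0x3f
# Per the PCI spec both registers count in units of 250 ns; a value of
# 0 means the device states no requirement.

def decode_mingnt_maxlat(min_gnt, max_lat, clock_hz=33e6):
    """Convert raw register bytes to microseconds and PCI clocks."""
    unit_us = 0.25                              # one count = 250 ns
    us_to_clocks = lambda us: us * clock_hz / 1e6
    return {
        "min_gnt_us": min_gnt * unit_us,        # desired burst period
        "min_gnt_clocks": us_to_clocks(min_gnt * unit_us),
        "max_lat_us": max_lat * unit_us,        # tolerable grant latency
        "max_lat_clocks": us_to_clocks(max_lat * unit_us),
    }

# Hypothetical register contents, purely for illustration:
info = decode_mingnt_maxlat(0x08, 0x38)
print(info)
```

A MAXLAT of 0x38 here would mean the device tolerates waiting 14 us
(462 PCI clocks) for a grant; the arbiter's round must stay below that
to avoid the buffer underruns/overruns described above.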