Date:      Mon, 26 Aug 2013 18:06:36 +0200
From:      Stefan Esser <se@freebsd.org>
To:        Harald Schmalzbauer <h.schmalzbauer@omnilan.de>
Cc:        FreeBSD Stable Mailing List <freebsd-stable@freebsd.org>
Subject:   Re: if_em, legacy nic and GbE saturation
Message-ID:  <521B7D0C.2010109@freebsd.org>
In-Reply-To: <521B1FD4.4050702@omnilan.de>
References:  <521AFE7E.2040705@omnilan.de> <CAJ-VmokRNbDXC1Er6pxOWSLJs=DvCmbCfcViZ3K4Twxb9V5BKw@mail.gmail.com> <521B1FD4.4050702@omnilan.de>

On 26.08.2013 11:28, Harald Schmalzbauer wrote:
> Regarding Adrian Chadd's message of 26.08.2013 10:34
> (localtime):
>> Hi,
>> 
>> There's bus limits on how much data you can push over a PCI bus.
>> You can look around online to see what 32/64 bit, 33/66MHz PCI
>> throughput estimates are.
>> 
>> It changes massively if you use small versus large frames as
>> well.
>> 
>> The last time I tried it i couldn't hit gige on PCI; I only
>> managed to get to around 350mbit doing TCP tests.
> 
> Thanks, I'm roughly aware of the PCI bus limit, but I guess it
> should be good for almost GbE: 33*10^6 * 32 bits = 1056 Mbit/s, so
> if one considers overhead and other bus-blocking things (nothing of
> significance is active on the PCI bus in this case), I'd expect at
> least 800 Mbit/s, which is what I get with jumbo frames.

But PCI bus throughput might be much lower than expected:

- The arbitration overhead is quite high, on the order of 0.2 to 0.3us.

- Depending on device capabilities and chip-set configuration and
  features there may be many more arbitration phases than one might
  expect.

- A cache line flush is requested for data held in the CPU, unless the
  bus-master uses special transfer commands to indicate that the full
  cache line will be invalidated within the requested transfer.

These overheads combined may reduce the effective PCI throughput to a
fraction of the nominal performance (1/3 to 1/4 for bursts of 16 bytes).

The "minimum grant" (MINGNT) value is the minimum burst length the
device wants, to avoid a buffer underrun/overrun caused by too low an
effective bandwidth. The "maximum latency" (MAXLAT) value corresponds
to the number of PCI clocks the device is willing to wait for the bus
to be granted, to avoid a buffer underrun/overrun while waiting for
bus access. The maximum latency value is useful to calculate the
maximum arbitration unit for which no device is stalled longer than
allowed by MAXLAT.

MINGNT and MAXLAT of a device can be displayed with pciconf:

# pciconf -r -b pci0:1:0 0x3e:0x3f (e.g., for bus 0 device 1 function 0)

The PCI bus will be "lost" whenever another device gets access to
the bus, whether that is the CPU or another PCI (or PCIe) device.

Especially when simultaneously sending and receiving packets with
two Ethernet controllers, bus arbitration will occur for every 16
to 32 transfers (depending on bus arbiter settings and the programmed
MINGNT).

Regards, Stefan


