Date:      Fri, 12 Mar 2010 20:41:11 +0400
From:      rihad <rihad@mail.ru>
To:        Jack Vogel <jfvogel@gmail.com>
Cc:        freebsd-net@freebsd.org
Subject:   Re: ixgbe input errors at high data rates
Message-ID:  <4B9A6EA7.3040603@mail.ru>
In-Reply-To: <2a41acea1003110942u717e2222hd984bd2859c3e477@mail.gmail.com>
References:  <4B99114E.7060909@mail.ru> <2a41acea1003110942u717e2222hd984bd2859c3e477@mail.gmail.com>

Jack Vogel wrote:
> The 1.3.3 driver is two years old, and your OS is older. I would
> respectfully suggest that you update to 8.0, where a lot of effort was
> put into making 10G hardware perform up to its capabilities.

OK, I've finished source-upgrading kernel+world to 8.0-RELEASE-p2 with its
stock ixgbe driver.
Things got much worse: the net traffic load is now about 2/3 of what it
was before. Both interfaces' settings were left at their defaults:

options=5bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO>

systat -ip shows an input packet rate of around 120-130K, and 1800-2000
"fragmentation failed" errors per refresh. Is this an MTU issue? It's at
the default of 1500.
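
A guess on my part: LRO is enabled by default, and if the card coalesces
received packets into segments larger than the 1500-byte MTU, a box that
forwards them would then have to fragment them on the way out; maybe
that's exactly what "fragmentation failed" is counting. If so, disabling
LRO (and maybe TSO) on both cards might be worth a try, something like:

    ifconfig ix0 -lro -tso
    ifconfig ix1 -lro -tso

(Untested here yet; ix0 is our input card, ix1 the output one.)
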
systat -if sometimes shows weird numbers:
             ix1  in    630.404 Mb/s        637.332 Mb/s          502.670 GB
                  out  6826.954 Mb/s       6826.954 Mb/s           40.144 MB

             ix0  in    638.440 Mb/s        643.392 Mb/s          574.943 GB
                  out  6861.143 Mb/s       6861.143 Mb/s            5.811 KB

See the 6861 Mb/s figures? Those should be practically 0. Also, ix1's
~630 Mb/s should be showing on output, not input. And ix0/in and ix1/out
sometimes differ by as much as 100-150 Mb/s.
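
To cross-check what systat reports, I'll also sample the interfaces
directly with netstat (if I'm reading netstat(1) right, -b adds byte
counts to the one-second interval display):

    netstat -bw 1 -I ix0
    netstat -bw 1 -I ix1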

Turning dummynet off ("pipe tablearg", ~6000 table entries) improves
things a bit: the data rate grows to what it should be and stays at that
level for a couple of minutes, but then it suddenly drops to about 4/5 of
that and stays there for a couple of minutes.
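
For reference, the shaping setup is the usual tablearg pattern; sketched
below with made-up addresses and bandwidths (the real table holds ~6000
entries):

    ipfw pipe 1 config bw 1Mbit/s
    ipfw pipe 2 config bw 2Mbit/s
    ipfw table 1 add 10.0.0.1/32 1
    ipfw table 1 add 10.0.0.2/32 2
    ipfw add 100 pipe tablearg ip from any to 'table(1)'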

There's a new option for dummynet in "man ipfw": burst
      burst size
              If the data to be sent exceeds the pipe's bandwidth limit (and
              the pipe was previously idle), up to size bytes of data are
              allowed to bypass the dummynet scheduler, and will be sent as
              fast as the physical link allows.  Any additional data will be
              transmitted at the rate specified by the pipe bandwidth.  The
              burst size depends on how long the pipe has been idle; the
              effective burst size is calculated as follows: MAX( size , bw *
              pipe_idle_time).

Is this worthwhile?
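
If it is, I suppose each pipe would simply gain a burst parameter, e.g.
(the size here is picked arbitrarily, 128 KB expressed in bytes):

    ipfw pipe 1 config bw 1Mbit/s burst 131072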

Things have gotten much worse.  Please help.

> Similarly, I have done lots of work on the ixgbe driver in the past two
> years; I would even suggest that once you have 8 installed you get the
> driver from HEAD.
> 


> Regards,
> 
> Jack
> 
> On Thu, Mar 11, 2010 at 7:50 AM, rihad <rihad@mail.ru> wrote:
> 
> Hi, our Intel 10 GigE cards are finally here, identified as <Intel(R)
> PRO/10GbE PCI-Express Network Driver, Version - 1.3.3> with the
> driver ixgbe-1.3.3 off the CD-ROM. One card is used for input, the
> other for output, doing traffic limiting (dummynet) and accounting in
> between. At data rates of about 700-1000 Mbit/s, netstat -i shows many
> input errors on ix0, at a rate of 10-20K per second :(
> 
> top -HS:
> CPU:  1.3% user,  0.0% nice, 25.2% system, 14.1% interrupt, 59.3% idle
> Mem: 1047M Active, 2058M Inact, 466M Wired, 126M Cache, 214M Buf, 239M Free
> Swap: 2048M Total, 2048M Free
>
>  PID USERNAME   PRI NICE   SIZE    RES STATE  C   TIME    WCPU COMMAND
>   32 root       -68    -     0K    16K CPU3   3 460:56 100.00% irq258: ix0
>   33 root       -68    -     0K    16K CPU7   7 143:14 100.00% ix0 rxq
>   13 root       171 ki31    0K    16K RUN    5 574:39  93.65% idle: cpu5
>   12 root       171 ki31    0K    16K RUN    6 507:08  88.33% idle: cpu6
>   14 root       171 ki31    0K    16K CPU4   4 424:04  80.37% idle: cpu4
>   18 root       171 ki31    0K    16K CPU0   0 395:34  75.00% idle: cpu0
>   16 root       171 ki31    0K    16K RUN    2 433:10  70.21% idle: cpu2
>  700 root       -68    -     0K    16K -      2 292:19  56.64% dummynet
>   17 root       171 ki31    0K    16K CPU1   1 399:02  50.39% idle: cpu1
>   37 root       -68    -     0K    16K CPU1   1 196:19  39.50% ix1 rxq
>   11 root       171 ki31    0K    16K RUN    7 510:39  14.79% idle: cpu7
>   36 root       -68    -     0K    16K WAIT   5  36:36   8.64% irq260: ix1
>   19 root       -32    -     0K    16K CPU6   6  36:52   5.08% swi4: clock sio
> 
> 
> Turning dummynet off (by short-circuiting it with an IPFW rule "allow
> ip from any to any" before the "pipe tablearg" rule) doesn't eliminate
> the input errors. Turning ip.fastforwarding off (see below) doesn't
> help either (why would it); only this time "swi" is chewing up the CPU
> time instead of "irq". Are we hitting the CPU core limits here? It's a
> dual-CPU quad-core Intel(R) Xeon(R) E5410 @ 2.33GHz (Dell PowerEdge
> 2950). Shouldn't this $2.5K card have decently sized hardware buffers
> to prevent any overruns?
> 
> Some custom settings:
> kern.hz=4000
> net.inet.ip.fastforwarding=1
> kern.ipc.nmbclusters=111111
> net.inet.ip.dummynet.io_fast=1
> net.isr.direct=0
> net.inet.ip.intr_queue_maxlen=5000
> hw.intr_storm_threshold=8000  # as suggested by the ixgbe-1.3.3 docs
> 
> The FreeBSD 7.1 kernel was built with DEVICE_POLLING, even though
> polling isn't used. Should I nonetheless recompile without it? I've
> heard the mere presence of DEVICE_POLLING affects some cards'
> performance.
> 
> Thanks for any tips.
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"