Date:      Mon, 17 Nov 2008 02:40:29 -0800 (PST)
From:      Won De Erick <won.derick@yahoo.com>
To:        freebsd-net@freebsd.org
Subject:   Re: does freebsd support so called Scalable I/O on intel NIC ?
Message-ID:  <351447.99591.qm@web45806.mail.sp1.yahoo.com>

Hello,

Regarding Scalable I/O on Intel NIC:

Using an Intel PRO/1000 NIC (82571), I downloaded the patches found at the following link:

http://people.yandex-team.ru/~wawa/

I compiled and applied them against FreeBSD 7.1-BETA2, and then made some changes to the default settings.
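
In case it helps anyone reproduce this, the steps amount to something like the following (the patch file name and kernel config name are placeholders; use whichever diff is published in the directory above, and adjust the patch -p level to match how it was generated):

# fetch http://people.yandex-team.ru/~wawa/<em-rx-kthreads>.diff
# cd /usr/src
# patch -p0 < /path/to/<em-rx-kthreads>.diff
# make buildkernel KERNCONF=MYKERNEL
# make installkernel KERNCONF=MYKERNEL
# shutdown -r now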

With net.isr.direct=1, I also changed the number of rx kthreads (default 2) for em0 and em1:

dev.em.0.rx_kthreads: 6
....
dev.em.1.rx_kthreads: 6
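
Roughly, those settings amount to the following (I am assuming the dev.em.N.rx_kthreads OIDs added by the patch are writable at runtime; if not, they can presumably be set as tunables in /boot/loader.conf instead):

# sysctl net.isr.direct=1
# sysctl dev.em.0.rx_kthreads=6
# sysctl dev.em.1.rx_kthreads=6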

The result:

last pid:  1690;  load averages: 10.83,  8.92,  8.56              up 0+02:28:24  18:22:54
107 processes: 28 running, 61 sleeping, 18 waiting
CPU:  0.0% user,  0.0% nice, 74.0% system,  1.7% interrupt, 24.3% idle
Mem: 17M Active, 7040K Inact, 161M Wired, 76K Cache, 21M Buf, 31G Free
Swap: 4096M Total, 4096M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   56 root        1  43    -     0K    16K CPU0   0  54:04 100.00% em1_rx_kthread_1
   55 root        1  43    -     0K    16K CPU13  d  53:37 100.00% em1_rx_kthread_0
 1280 root        1  43    -     0K    16K CPU3   3  52:23 100.00% em1_rx_kthread_3
 1279 root        1  43    -     0K    16K CPU7   7  51:57 100.00% em1_rx_kthread_2
 1347 root        1  43    -     0K    16K CPU2   2  45:51 100.00% em1_rx_kthread_5
 1346 root        1  43    -     0K    16K CPU5   5  45:40 100.00% em1_rx_kthread_4
   50 root        1 -68    -     0K    16K CPU12  c  24:17 100.00% em0_txcleaner
   11 root        1 171 ki31     0K    16K RUN    f 105:35 94.38% idle: cpu15
   25 root        1 171 ki31     0K    16K CPU1   1  76:32 81.40% idle: cpu1
 1282 root        1  43    -     0K    16K WAIT   6  92:14 76.07% em0_rx_kthread_2
   51 root        1  43    -     0K    16K CPU9   9  95:00 75.59% em0_rx_kthread_0
 1344 root        1  43    -     0K    16K CPU11  b  79:18 75.49% em0_rx_kthread_5
 1343 root        1  43    -     0K    16K WAIT   8  79:12 75.39% em0_rx_kthread_4
   52 root        1  43    -     0K    16K CPU14  e  95:00 74.37% em0_rx_kthread_1
 1283 root        1  43    -     0K    16K CPU10  a  92:24 68.65% em0_rx_kthread_3
   22 root        1 171 ki31     0K    16K CPU4   4  58:31 60.35% idle: cpu4
   54 root        1 -68    -     0K    16K WAIT   4  88:44 39.06% em1_txcleaner
   20 root        1 171 ki31     0K    16K RUN    6  88:32 32.67% idle: cpu6
   16 root        1 171 ki31     0K    16K RUN    a  85:10 31.49% idle: cpu10
   17 root        1 171 ki31     0K    16K RUN    9  76:45 28.96% idle: cpu9
   15 root        1 171 ki31     0K    16K RUN    b  92:25 28.86% idle: cpu11
   18 root        1 171 ki31     0K    16K RUN    8  91:58 28.66% idle: cpu8
   12 root        1 171 ki31     0K    16K RUN    e 104:36 28.08% idle: cpu14
   28 root        1 -32    -     0K    16K WAIT   1  74:01 20.75% swi4: clock sio
   23 root        1 171 ki31     0K    16K RUN    3  69:43  6.59% idle: cpu3
   26 root        1 171 ki31     0K    16K RUN    0  72:57  3.37% idle: cpu0
   13 root        1 171 ki31     0K    16K RUN    d  86:15  0.00% idle: cpu13
   24 root        1 171 ki31     0K    16K RUN    2  86:08  0.00% idle: cpu2
   14 root        1 171 ki31     0K    16K RUN    c  83:32  0.00% idle: cpu12
   19 root        1 171 ki31     0K    16K RUN    7  80:47  0.00% idle: cpu7
   21 root        1 171 ki31     0K    16K RUN    5  74:22  0.00% idle: cpu5
   27 root        1 -44    -     0K    16K WAIT   2   3:04  0.00% swi1: net

I am happy to see that the threads are distributed among the CPUs, but I observed packet errors and drops on the LAN side (em1):

# netstat -I em1 -w 1 -d
            input          (em1)           output
   packets  errs      bytes    packets  errs      bytes colls drops
     32494   483   23083087      15681     0   23719154     0    82
     30547   330   23104447      16062     0   23077442     0    44

In addition to the above, I noticed errors on the WAN side (em0), but without packet drops.

# netstat -I em0 -w 1 -d
            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls drops
     19889   640   24144754      21307     0    8719922     0     0
     18071  2436   25966238      21088     0    8766995     0     0

Is there any tool I can use to trace where the errors and drops are occurring or coming from [internally]?
I would like to see the specific process/task/thread that is causing them.
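
Aside from the interface counters above, the only other things I know to look at are the generic system-wide counters, none of which seem to tie a drop to a specific thread, for example:

# netstat -s -p ip     (IP input/output error and drop counters)
# netstat -m           (mbuf/cluster usage and allocation failures)
# vmstat -i            (per-device interrupt rates)
# sysctl dev.em.1      (everything the em(4) driver exposes for this interface)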

-----------------------------------------------------
Original message from Robert Watson <rwatson@xxxxxxxxxxx>, Sun, 26 Oct 2008 13:43:01 +0000 (GMT):

> On Fri, 24 Oct 2008, Kip Macy wrote:
>
> > It is simply a knob to adjust on all new server network cards. You could
> > benefit from it on a predominantly UDP workload. I believe that tcp_input
> > is still sufficiently serialized that it would not make sense for TCP
> > workloads.
>
> In principle we can benefit on the basis that we drop the global lock fairly
> quickly for steady-state workloads (i.e., few SYN/FIN/RST packets), but there
> should be lots of contention on tcbinfo.
>
> If anyone is interested in doing some benchmarks, I have some patches that
> should apply fairly easily against 8.x or 7.x as of 7.1 to move to optimistic
> read-locking of the global lock for steady-state packets, but once in a while
> we have to upgrade or drop and re-acquire to get an exclusive lock when it
> turns out something that looked like a steady-state packet did require the
> global lock exclusively, such as the ACK transitioning to or from
> established.

I am interested in running a benchmark. I am currently using FreeBSD 7.1-BETA2 on an HP DL585 with 16 CPUs. I would be happy if you could send me the patches so I can do some benchmarks.


> I've not had a chance to do much benchmarking on them, and theorize that they
> probably help quite a lot for lots of steady-state connections, but as
> connection length gets shorter the optimistic assumption becomes less true and
> begins to hurt performance.
>
> The long-term plan is to move to some more aggressive decomposition of the
> tcbinfo lock, but I've not started on that yet as I'm waiting for the rwlock
> changes to settle, and need to evaluate the above tcbinfo rwlock patch.
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge

Thanks,

Won