Date: Mon, 17 Nov 2008 02:40:29 -0800 (PST)
From: Won De Erick
To: freebsd-net@freebsd.org
Message-ID: <351447.99591.qm@web45806.mail.sp1.yahoo.com>
Subject: Re: does freebsd support so called
Scalable I/O on intel NIC ?

Hello,

Regarding Scalable I/O on Intel NICs: I am using an Intel PRO NIC (82571)
and downloaded the patches found at the following link:

http://people.yandex-team.ru/~wawa/

I applied them to FreeBSD 7.1 Beta2, compiled, and changed some of the
default settings. With net.isr.direct=1, I changed the number of rx
kthreads (default 2) for em0 and em1:

dev.em.0.rx_kthreads: 6
....
dev.em.1.rx_kthreads: 6

The result:

last pid:  1690;  load averages: 10.83,  8.92,  8.56   up 0+02:28:24  18:22:54
107 processes: 28 running, 61 sleeping, 18 waiting
CPU:  0.0% user,  0.0% nice, 74.0% system,  1.7% interrupt, 24.3% idle
Mem: 17M Active, 7040K Inact, 161M Wired, 76K Cache, 21M Buf, 31G Free
Swap: 4096M Total, 4096M Free

  PID USERNAME THR PRI NICE SIZE  RES STATE  C   TIME    WCPU COMMAND
   56 root       1  43    -   0K  16K CPU0   0  54:04 100.00% em1_rx_kthread_1
   55 root       1  43    -   0K  16K CPU13  d  53:37 100.00% em1_rx_kthread_0
 1280 root       1  43    -   0K  16K CPU3   3  52:23 100.00% em1_rx_kthread_3
 1279 root       1  43    -   0K  16K CPU7   7  51:57 100.00% em1_rx_kthread_2
 1347 root       1  43    -   0K  16K CPU2   2  45:51 100.00% em1_rx_kthread_5
 1346 root       1  43    -   0K  16K CPU5   5  45:40 100.00% em1_rx_kthread_4
   50 root       1 -68    -   0K  16K CPU12  c  24:17 100.00% em0_txcleaner
   11 root       1 171 ki31   0K  16K RUN    f 105:35  94.38% idle: cpu15
   25 root       1 171 ki31   0K  16K CPU1   1  76:32  81.40% idle: cpu1
 1282 root       1  43    -   0K  16K WAIT   6  92:14  76.07% em0_rx_kthread_2
   51 root       1  43    -   0K  16K CPU9   9  95:00  75.59% em0_rx_kthread_0
 1344 root       1  43    -   0K  16K CPU11  b  79:18  75.49% em0_rx_kthread_5
 1343 root       1  43    -   0K  16K WAIT   8  79:12  75.39% em0_rx_kthread_4
   52 root       1  43    -   0K  16K CPU14  e  95:00  74.37% em0_rx_kthread_1
 1283 root       1  43    -   0K  16K CPU10  a  92:24  68.65% em0_rx_kthread_3
   22 root       1 171 ki31   0K  16K CPU4   4  58:31  60.35% idle: cpu4
   54 root       1 -68    -   0K  16K WAIT   4  88:44  39.06% em1_txcleaner
   20 root       1 171 ki31   0K  16K RUN    6  88:32  32.67% idle: cpu6
   16 root       1 171 ki31   0K  16K RUN    a  85:10  31.49% idle: cpu10
   17 root       1 171 ki31   0K  16K RUN    9  76:45  28.96% idle: cpu9
   15 root       1 171 ki31   0K  16K RUN    b  92:25  28.86% idle: cpu11
   18 root       1 171 ki31   0K  16K RUN    8  91:58  28.66% idle: cpu8
   12 root       1 171 ki31   0K  16K RUN    e 104:36  28.08% idle: cpu14
   28 root       1 -32    -   0K  16K WAIT   1  74:01  20.75% swi4: clock sio
   23 root       1 171 ki31   0K  16K RUN    3  69:43   6.59% idle: cpu3
   26 root       1 171 ki31   0K  16K RUN    0  72:57   3.37% idle: cpu0
   13 root       1 171 ki31   0K  16K RUN    d  86:15   0.00% idle: cpu13
   24 root       1 171 ki31   0K  16K RUN    2  86:08   0.00% idle: cpu2
   14 root       1 171 ki31   0K  16K RUN    c  83:32   0.00% idle: cpu12
   19 root       1 171 ki31   0K  16K RUN    7  80:47   0.00% idle: cpu7
   21 root       1 171 ki31   0K  16K RUN    5  74:22   0.00% idle: cpu5
   27 root       1 -44    -   0K  16K WAIT   2   3:04   0.00% swi1: net

I am happy to see that the threads are distributed among the CPUs, but I
observed packet errors and drops on the LAN side (em1):

# netstat -I em1 -w 1 -d
            input          (em1)           output
   packets  errs      bytes    packets  errs      bytes colls drops
     32494   483   23083087      15681     0   23719154     0    82
     30547   330   23104447      16062     0   23077442     0    44

In addition to the above result, I noticed errors on the WAN side (em0),
but without packet drops:

# netstat -I em0 -w 1 -d
            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls drops
     19889   640   24144754      21307     0    8719922     0     0
     18071  2436   25966238      21088     0    8766995     0     0

Is there any tool that I can use to trace where the errors and drops are
occurring or coming from [internally]? I would like to see the specific
process, task, or thread that is causing this.

-----------------------------------------------------
Original Message from Robert Watson
Date: Sun, 26 Oct 2008 13:43:01 +0000 (GMT)
________________________________

> On Fri, 24 Oct 2008, Kip Macy wrote:
>
> > It is simply a knob to adjust on all new server network cards.
> > You could benefit from it on a predominantly UDP workload. I believe
> > that tcp_input is still sufficiently serialized that it would not make
> > sense for TCP workloads.
>
> In principle we can benefit on the basis that we drop the global lock
> fairly quickly for steady-state workloads (i.e., few SYN/FIN/RST
> packets), but there should be lots of contention on tcbinfo.
>
> If anyone is interested in doing some benchmarks, I have some patches
> that should apply fairly easily against 8.x or 7.x as of 7.1 to move to
> optimistic read-locking of the global lock for steady state packets, but
> once in a while we have to upgrade or drop and re-acquire to get an
> exclusive lock when it turns out something that looked like a steady
> state packet did require the global lock exclusively, such as the ACK
> transitioning to or from established.
>

I am interested in conducting a benchmark. Currently I am using FreeBSD 7.1
Beta2 running on an HP DL585 w/ 16 CPUs. I would be happy if you could send
me the patch so that I can run some benchmarks.

> I've not had a chance to do much benchmarking on them, and theorize that
> they probably help quite a lot for lots of steady-state connections, but
> as connection length gets shorter the optimistic assumption becomes less
> true and begins to hurt performance.
>
> The long-term plan is to move to some more aggressive decomposition of
> the tcbinfo lock, but I've not started on that yet as I'm waiting for the
> rwlock changes to settle, and need to evaluate the above tcbinfo rwlock
> patch.
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge

Thanks,

Won
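
For anyone comparing the two interfaces, the netstat samples quoted in this message can be turned into per-interval error ratios. The following is only a rough sketch: the helper names (parse_sample, error_ratio) are made up for illustration, and it assumes the eight-column row layout shown in the `netstat -I <if> -w 1 -d` output above. It quantifies the counters; it does not identify which thread is responsible.

```python
# Illustrative sketch only: parse_sample and error_ratio are hypothetical names.
# Assumed column order (as in the netstat rows quoted in this message):
# input packets, errs, bytes; output packets, errs, bytes, colls, drops.
from typing import NamedTuple


class Sample(NamedTuple):
    in_packets: int
    in_errs: int
    in_bytes: int
    out_packets: int
    out_errs: int
    out_bytes: int
    colls: int
    drops: int


def parse_sample(line: str) -> Sample:
    """Turn one data row of the interval output into a Sample."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, got {len(fields)}: {line!r}")
    return Sample(*(int(f) for f in fields))


def error_ratio(s: Sample) -> float:
    """Input errors divided by input packets counted in the same interval."""
    return s.in_errs / s.in_packets if s.in_packets else 0.0


# Rows copied from the message (em1 = LAN side, em0 = WAN side).
SAMPLES = {
    "em1": ["32494 483 23083087 15681 0 23719154 0 82",
            "30547 330 23104447 16062 0 23077442 0 44"],
    "em0": ["19889 640 24144754 21307 0 8719922 0 0",
            "18071 2436 25966238 21088 0 8766995 0 0"],
}

if __name__ == "__main__":
    for ifname, rows in SAMPLES.items():
        for row in rows:
            s = parse_sample(row)
            print(f"{ifname}: errs/packets = {error_ratio(s):.2%}, "
                  f"drops = {s.drops}")
```

Feeding it live data would mean piping the data rows of the netstat output (not the two header lines) through parse_sample one line at a time.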