From owner-freebsd-current@FreeBSD.ORG Thu Nov 19 16:42:47 2009
Date: Thu, 19 Nov 2009 09:42:46 -0700
From: Elliot Finley <efinley.lists@gmail.com>
To: Robert Watson, freebsd-current@freebsd.org
Message-ID: <54e63c320911190842n352cd860q460684376065cd3a@mail.gmail.com>
References: <54e63c320911181807m4ddb770br1281d1163ae3cf5f@mail.gmail.com>
Subject: Re: 8.0-RC3 network performance regression
List-Id: Discussions about the use of FreeBSD-current

On Thu, Nov 19, 2009 at 2:11 AM, Robert Watson wrote:
>
> On Wed, 18 Nov 2009, Elliot Finley wrote:
>
>> I have several boxes running 8.0-RC3 with pretty dismal network
>> performance. I also have some 7.2 boxes with great performance. Using
>> iperf I did some tests:
>>
>> server(8.0) <- client (8.0) == 420Mbps
>> server(7.2) <- client (7.2) == 950Mbps
>> server(7.2) <- client (8.0) == 920Mbps
>> server(8.0) <- client (7.2) == 420Mbps
>>
>> So when the server is 7.2, I have good performance regardless of whether
>> the client is 8.0 or 7.2. When the server is 8.0, I have poor performance
>> regardless of whether the client is 8.0 or 7.2.
>>
>> Has anyone else noticed this? Am I missing something simple?
>
> I've generally not measured regressions along these lines, but TCP
> performance can be quite sensitive to the specific driver version and
> hardware configuration. So far, I've generally measured significant TCP
> scalability improvements in 8, and moderate raw TCP performance
> improvements over real interfaces. On the other hand, I've seen decreased
> TCP performance on the loopback due to scheduling interactions with ULE on
> some systems (but not all -- disabling checksum generate/verify has
> improved loopback on other systems).
>
> The first thing to establish is whether other similar benchmarks give the
> same result, which might help us narrow the issue down a bit. Could you
> try using netperf+netserver with the TCP_STREAM test and see if that
> differs using the otherwise identical configuration?
>
> Could you compare the ifconfig link configuration of 7.2 and 8.0 to make
> sure there's not a problem with the driver negotiating, for example, half
> duplex instead of full duplex? Also confirm that the same blend of
> LRO/TSO/checksum offloading/etc is present.
>
> Could you do "procstat -at | grep ifname" (where ifname is your interface
> name) and send that to me?
>
> Another thing to keep an eye on is interrupt rates and pin sharing, which
> are both sensitive to driver changes and ACPI changes. It wouldn't hurt to
> compare vmstat -i rates not just on your network interface, but also on
> other devices, to make sure there's no new aliasing. With a new USB stack
> and plenty of other changes, additional driver code running when your NIC
> interrupt fires would be highly measurable.
>
> Finally, two TCP tweaks to try:
>
> (1) Try disabling in-flight bandwidth estimation by setting
>     net.inet.tcp.inflight.enable to 0. This often hurts low-latency,
>     high-bandwidth local ethernet links, and is sensitive to many other
>     issues including time-keeping. It may not be the "cause", but it's a
>     useful thing to try.
>
> (2) Try setting net.inet.tcp.read_locking to 0, which disables the
>     read-write locking strategy on global TCP locks. This setting, when
>     enabled, significantly improves TCP scalability when dealing with
>     multiple NICs or input queues, but is one of the non-trivial
>     functional changes in TCP.

Thanks for the reply. Here is some more info:

netperf results:

storage-price-3 root:~#>netperf -H 10.20.10.20
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.20.10.20 (10.20.10.20) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

4194304 4194304 4194304    10.04     460.10

The interface on both boxes is em1. Both boxes (8.0RC3) have two 4-port PCIe
NICs in them. Trying the two TCP tweaks didn't change anything.
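For the archives, this is how I applied the two tweaks above (a sketch; the
sysctl names are the ones from Robert's mail, and the settings revert on
reboot unless also added to /etc/sysctl.conf):

```shell
# Disable in-flight bandwidth estimation (tweak 1 above)
sysctl net.inet.tcp.inflight.enable=0

# Disable the read-write locking strategy on global TCP locks (tweak 2 above)
sysctl net.inet.tcp.read_locking=0
```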
While running iperf I did the procstat and vmstat:

SERVER:

storage-price-2 root:~#>ifconfig em1
em1: flags=8843 metric 0 mtu 1500
        options=19b
        ether 00:15:17:b2:31:3d
        inet 10.20.10.20 netmask 0xffffff00 broadcast 10.20.10.255
        media: Ethernet autoselect (1000baseT )
        status: active

storage-price-2 root:~#>procstat -at | grep em1
    0 100040 kernel           em1 taskq        3   16 run     -

storage-price-2 root:~#>vmstat -i
interrupt                          total       rate
irq14: ata0                        22979          0
irq15: ata1                        23157          0
irq16: aac0 uhci0*                  1552          0
irq17: uhci2+                         37          0
irq18: ehci0 uhci+                    43          0
cpu0: timer                    108455076       2000
irq257: em1                      2039287         37
cpu2: timer                    108446955       1999
cpu1: timer                    108447018       1999
cpu3: timer                    108447039       1999
cpu7: timer                    108447061       1999
cpu5: timer                    108447061       1999
cpu6: timer                    108447054       1999
cpu4: timer                    108447061       1999
Total                          869671380      16037

CLIENT:

storage-price-3 root:~#>ifconfig em1
em1: flags=8843 metric 0 mtu 1500
        options=19b
        ether 00:15:17:b2:31:49
        inet 10.20.10.30 netmask 0xffffff00 broadcast 10.20.10.255
        media: Ethernet autoselect (1000baseT )
        status: active

storage-price-3 root:~#>procstat -at | grep em1
    0 100040 kernel           em1 taskq        3   16 run     -

storage-price-3 root:~#>vmstat -i
interrupt                          total       rate
irq1: atkbd0                           2          0
irq14: ata0                        22501          0
irq15: ata1                        22395          0
irq16: aac0 uhci0*                  5091          0
irq17: uhci2+                        125          0
irq18: ehci0 uhci+                    43          0
cpu0: timer                    108421132       1999
irq257: em1                      1100465         20
cpu3: timer                    108412973       1999
cpu1: timer                    108412987       1999
cpu2: timer                    108413010       1999
cpu7: timer                    108413048       1999
cpu6: timer                    108413048       1999
cpu5: timer                    108413031       1999
cpu4: timer                    108413045       1999
Total                          868462896      16020

7.2 BOX:

dns1 root:~#>ifconfig em0
em0: flags=8843 metric 0 mtu 1500
        options=9b
        ether 00:13:72:5a:ff:48
        inet X.Y.Z.7 netmask 0xffffffc0 broadcast X.Y.Z.63
        media: Ethernet autoselect (1000baseTX )
        status: active

The 8.0RC3 boxes are being used for testing right now (production 2nd week of
December). If you want access to them, that wouldn't be a problem.

TIA
Elliot
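A quick way to eyeball the vmstat -i output above for the pin sharing Robert
mentioned (this helper is not from the thread, just an illustration: any
"irqN:" row that names more than one device before the counters is a shared
line, e.g. "irq16: aac0 uhci0*" above):

```shell
# Sample rows in the same shape as the vmstat -i output above
vmstat_i_sample='irq16: aac0 uhci0*                  1552          0
irq17: uhci2+                         37          0
irq257: em1                      2039287         37'

# An irq row normally has 4 fields (name, device, total, rate);
# more than 4 means several devices share the interrupt line.
echo "$vmstat_i_sample" | awk '/^irq/ && NF > 4 { print $1, "shared" }'
```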