From: Michael VInce
Date: Sat, 22 Oct 2005 19:01:23 +1000
To: Gleb Smirnoff
Cc: net@FreeBSD.org
Subject: Re: em(4) patch for test
Message-ID: <4359FFE3.7060001@roq.com>
In-Reply-To: <20051020140200.GL59364@cell.sick.ru>

Gleb Smirnoff wrote:

> Colleagues,
>
> since the if_em problem was taken as a late showstopper for 6.0-RELEASE,
> I am asking you to help with testing of the fixes made in HEAD.
>
> Does your em(4) interface wedge for some time?
> Do you see a lot of errors in 'netstat -i' output?
> Does these errors increase not monotonously but they have peaks?
>
> If the answer is yes, then the attached patch is likely to fix your problem.
> If the answer is no, then you are still encouraged to help with testing
> and install the patch to check that no regressions are introduced. If you
> skip this, then you may encounter regressions after release, so you have
> been warned.
>
> So, in short: please test! Thanks in advance!
>
> The patch is against fresh RELENG_6.

Here are some results from my A-B-C test network. This round of tests was
largely focused on my gateway (B), which has 4 built-in em gigabit Ethernet
devices with polling and packet filter enabled.

Servers A and B are Dell 1850s running 6.0RC1 AMD64; server C is a Dell 2850
running 6.0RC1 i386.

I have pushed many gigs of traffic through B's 2 interfaces using netperf
and 'fetch' tests of large files. I have yet to see any errors on the
gateway, either before or after patching.

But after I finished testing I checked netstat -i and found errors on my
client (A) and server (C) machines, which appear to have been caused by
earlier testing I was doing with the Apache benchmark 'ab', so I am going to
start testing them. The netperf and 'fetch' tests never appear to have
budged the Oerrs or Ierrs stats on any of the machines.

After patch.

B> netstat -i
Name    Mtu Network Address              Ipkts Ierrs    Opkts Oerrs  Coll
em0*   1500         00:0e:0c:f3:d4:f2        0     0        0     0     0
em1*   1500         00:0e:0c:f3:d4:f3        0     0        0     0     0
em2    1500         00:14:22:f5:ff:4b 39887301     0 24145946     0     0
em3    1500         00:14:22:f5:ff:4c 24144573     0 39885777     0     0

Before patch.
B> netstat -i | grep "Link"
em0*   1500         00:0e:0c:f3:d4:f2        0     0        0     0     0
em1*   1500         00:0e:0c:f3:d4:f3        0     0        0     0     0
em2    1500         00:14:22:f5:ff:4b 67671693     0 51036449     0     0
em3    1500         00:14:22:f5:ff:4c 49964821     0 65868324     0     0

The client test machine (A) and the server machine (C), both yet to be
patched, do have Ierrs, but not many. The other thing to note is that the
netperf and 'fetch' benchmarks listed below (run repeatedly) never made
these error stats budge at all. It turns out the errors were caused by some
tests I was doing on these machines a day earlier using Apache 'ab', which
has been beating up the servers quite well, this being
'A> ab -k -n 19000 -c 1500 http://server-c/133kbyte_file'.
I will post some results from that test after this email.

A> netstat -i | egrep 'em2.*Link|Name'
Name    Mtu Network Address               Ipkts Ierrs     Opkts Oerrs  Coll
em2    1500         00:14:ff:15:ff:8e 225763513  5025 311375238     0     0

C> netstat -i | egrep 'em0.*Link|Name'
Name    Mtu Network Address               Ipkts Ierrs     Opkts Oerrs  Coll
em0    1500         00:14:ff:12:4c:03 347060671 13317 251444959     0     0

Tests used and benchmark results for the netperf and fetch tests.
Interestingly, it appears to be benchmarking a tiny bit faster after
patching, but I believe it's just been luck of the draw.
A> fetch -o - > /dev/null http://server-C/file1gig.iso
A> /usr/local/netperf/netperf -l 60 -H server-C -t TCP_STREAM -i 10,2 -I 99,5 -- -m 4096 -s 57344 -S 57344

Before patch
Fetch test: 84 MBps
Netperf Elapsed Throughput - 10^6bits/sec: 905.77

After patch
Fetch test: 85 MBps
Netperf Elapsed Throughput - 10^6bits/sec: 909.60

Before patch

B> netstat -s
tcp:
    1070840 packets sent
        55362 data packets (7666434 bytes)
        183 data packets (189863 bytes) retransmitted
        3 data packets unnecessarily retransmitted
        0 resends initiated by MTU discovery
        733560 ack-only packets (178 delayed)
        0 URG only packets
        0 window probe packets
        281674 window update packets
        62 control packets
    1791219 packets received
        53340 acks (for 7668053 bytes)
        512 duplicate acks
        0 acks for unsent data
        1738233 packets (2511401694 bytes) received in-sequence
        102 completely duplicate packets (83167 bytes)
        0 old duplicate packets
        9 packets with some dup. data (5184 bytes duped)
        564 out-of-order packets (610315 bytes)
        1 packet (0 bytes) of data after window
        0 window probes
        186 window update packets
        0 packets received after close
        0 discarded for bad checksums
        0 discarded for bad header offset fields
        0 discarded because packet too short
    13 connection requests
    32 connection accepts
    0 bad connection attempts
    0 listen queue overflows
    0 ignored RSTs in the windows
    40 connections established (including accepts)
    65 connections closed (including 9 drops)
        29 connections updated cached RTT on close
        29 connections updated cached RTT variance on close
        10 connections updated cached ssthresh on close
    5 embryonic connections dropped
    53197 segments updated rtt (of 52242 attempts)
    113 retransmit timeouts
        0 connections dropped by rexmit timeout
    0 persist timeouts
        0 connections dropped by persist timeout
    32 keepalive timeouts
        29 keepalive probes sent
        3 connections dropped by keepalive
    434 correct ACK header predictions
    1735955 correct data packet header predictions
    35 syncache entries added
        11 retransmitted
        0 dupsyn
        0 dropped
        32 completed
        0 bucket overflow
        0 cache overflow
        0 reset
        3 stale
        0 aborted
        0 badack
        0 unreach
        0 zone failures
    0 cookies sent
    0 cookies received
    42 SACK recovery episodes
    0 segment rexmits in SACK recovery episodes
    0 byte rexmits in SACK recovery episodes
    945 SACK options (SACK blocks) received
    666 SACK options (SACK blocks) sent
    0 SACK scoreboard overflow
udp:
    293 datagrams received
    0 with incomplete header
    0 with bad data length field
    0 with bad checksum
    0 with no checksum
    9 dropped due to no socket
    0 broadcast/multicast datagrams dropped due to no socket
    0 dropped due to full socket buffers
    0 not for hashed pcb
    284 delivered
    293 datagrams output
ip:
    117631875 total packets received
    0 bad header checksums
    0 with size smaller than minimum
    0 with data size < data length
    0 with ip length > max ip packet size
    0 with header length < data size
    0 with data length < header length
    0 with bad options
    0 with incorrect version number
    0 fragments received
    0 fragments dropped (dup or out of space)
    0 fragments dropped after timeout
    0 packets reassembled ok
    1791512 packets for this host
    5 packets for unknown/unsupported protocol
    115833063 packets forwarded (115833063 packets fast forwarded)
    0 packets not forwardable
    0 packets received for unknown multicast group
    0 redirects sent
    1071192 packets sent from this host
    0 packets sent with fabricated ip header
    0 output packets dropped due to no bufs, etc.
    0 output packets discarded due to no route
    0 output datagrams fragmented
    0 fragments created
    0 datagrams that can't be fragmented
    0 tunneling packets that can't find gif
    0 datagrams with bad address in header

Netstat -s results: After patch

netstat -s
tcp:
    3262 packets sent
        2299 data packets (2248857 bytes)
        215 data packets (239291 bytes) retransmitted
        0 data packets unnecessarily retransmitted
        0 resends initiated by MTU discovery
        637 ack-only packets (123 delayed)
        0 URG only packets
        0 window probe packets
        106 window update packets
        5 control packets
    3046 packets received
        1409 acks (for 2253418 bytes)
        291 duplicate acks
        0 acks for unsent data
        1251 packets (877028 bytes) received in-sequence
        81 completely duplicate packets (60076 bytes)
        2 old duplicate packets
        10 packets with some dup. data (7621 bytes duped)
        475 out-of-order packets (505701 bytes)
        0 packets (0 bytes) of data after window
        0 window probes
        135 window update packets
        0 packets received after close
        0 discarded for bad checksums
        0 discarded for bad header offset fields
        0 discarded because packet too short
    3 connection requests
    1 connection accept
    0 bad connection attempts
    0 listen queue overflows
    0 ignored RSTs in the windows
    4 connections established (including accepts)
    4 connections closed (including 0 drops)
        1 connection updated cached RTT on close
        1 connection updated cached RTT variance on close
        1 connection updated cached ssthresh on close
    0 embryonic connections dropped
    1409 segments updated rtt (of 1211 attempts)
    84 retransmit timeouts
        0 connections dropped by rexmit timeout
    0 persist timeouts
        0 connections dropped by persist timeout
    0 keepalive timeouts
        0 keepalive probes sent
        0 connections dropped by keepalive
    65 correct ACK header predictions
    595 correct data packet header predictions
    1 syncache entrie added
        0 retransmitted
        0 dupsyn
        0 dropped
        1 completed
        0 bucket overflow
        0 cache overflow
        0 reset
        0 stale
        0 aborted
        0 badack
        0 unreach
        0 zone failures
    0 cookies sent
    0 cookies received
    33 SACK recovery episodes
    0 segment rexmits in SACK recovery episodes
    0 byte rexmits in SACK recovery episodes
    623 SACK options (SACK blocks) received
    663 SACK options (SACK blocks) sent
    0 SACK scoreboard overflow
udp:
    33 datagrams received
    0 with incomplete header
    0 with bad data length field
    0 with bad checksum
    0 with no checksum
    0 dropped due to no socket
    0 broadcast/multicast datagrams dropped due to no socket
    0 dropped due to full socket buffers
    0 not for hashed pcb
    33 delivered
    33 datagrams output
ip:
    64033566 total packets received
    0 bad header checksums
    0 with size smaller than minimum
    0 with data size < data length
    0 with ip length > max ip packet size
    0 with header length < data size
    0 with data length < header length
    0 with bad options
    0 with incorrect version number
    0 fragments received
    0 fragments dropped (dup or out of space)
    0 fragments dropped after timeout
    0 packets reassembled ok
    3079 packets for this host
    0 packets for unknown/unsupported protocol
    64030433 packets forwarded (64030433 packets fast forwarded)
    0 packets not forwardable
    0 packets received for unknown multicast group
    0 redirects sent
    3295 packets sent from this host
    0 packets sent with fabricated ip header
    0 output packets dropped due to no bufs, etc.
    0 output packets discarded due to no route
    0 output datagrams fragmented
    0 fragments created
    0 datagrams that can't be fragmented
    0 tunneling packets that can't find gif
    0 datagrams with bad address in header
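
To put the netperf difference and the Ierrs counts on A and C in proportion, here is a rough Python sketch. It was not part of the tests themselves: the figures are simply copied from the results in this mail, and the helper names (pct_change, errors_per_million) are made up for illustration.

```python
# Rough sketch only: figures are copied from the results in this mail,
# and the helper names below are illustrative, not from any tool.

# netperf TCP_STREAM throughput in 10^6 bits/sec
netperf_mbit = {"before": 905.77, "after": 909.60}

# (Ierrs, Ipkts) from 'netstat -i' on the unpatched client and server
ierrs = {
    "A em2": (5025, 225763513),
    "C em0": (13317, 347060671),
}


def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def errors_per_million(errs, pkts):
    """Input errors per million received packets."""
    return errs / pkts * 1_000_000


delta = pct_change(netperf_mbit["before"], netperf_mbit["after"])
print(f"netperf change after patch: {delta:+.2f}%")

for name, (errs, pkts) in ierrs.items():
    rate = errors_per_million(errs, pkts)
    print(f"{name}: {rate:.1f} Ierrs per million packets")
```

The netperf change works out to well under one percent, which fits with it being luck of the draw, and the Ierrs on A and C amount to a few tens per million packets received.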