From owner-freebsd-bugs Wed Jun 24 13:10:37 1998
Return-Path:
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id NAA18075
          for freebsd-bugs-outgoing; Wed, 24 Jun 1998 13:10:37 -0700 (PDT)
          (envelope-from owner-freebsd-bugs@FreeBSD.ORG)
Received: from freefall.freebsd.org (freefall.FreeBSD.ORG [204.216.27.21])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id NAA18009
          for ; Wed, 24 Jun 1998 13:10:19 -0700 (PDT)
          (envelope-from gnats@FreeBSD.org)
Received: (from gnats@localhost)
          by freefall.freebsd.org (8.8.8/8.8.5) id NAA24224;
          Wed, 24 Jun 1998 13:10:03 -0700 (PDT)
Received: from ms.ha.md.us (ha.ha.md.us [192.55.203.244])
          by hub.freebsd.org (8.8.8/8.8.8) with SMTP id NAA17089
          for ; Wed, 24 Jun 1998 13:04:29 -0700 (PDT)
          (envelope-from mike@ms.ha.md.us)
Message-Id: <9806242201.aa01253@ms.ms.ha.md.us>
Date: Wed, 24 Jun 98 16:01:09 EDT
From: mike@ms.ha.md.us
Reply-To: mike@ms.ha.md.us
To: FreeBSD-gnats-submit@FreeBSD.ORG, mike@ms.ha.md.us
X-Send-Pr-Version: 3.2
Subject: i386/7057: 3Com3C509: lockups & high packet latency
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

>Number:         7057
>Category:       i386
>Synopsis:       3Com 3C509 locks up, or has >1000ms rtt under 100pps load of RDUMP.
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jun 24 13:10:03 PDT 1998
>Last-Modified:
>Originator:     Mike Muuss
>Organization:   Home
>Release:        FreeBSD 3.0-980222-SNAP i386
>Environment:

AMD 486DX4/100, Asus SP3G PCI motherboard, 64 MBytes of RAM.

1 3C5x9 board(s) on ISA found at 0x300
ep0 at 0x300-0x30f irq 11 on isa
ep0: aui/utp/bnc[*BNC*] address 00:60:08:27:e4:95

Only other 2 hosts on this ethernet are
ms.ha.md.us: a P233 running BSD/OS 2.1 unix, and
nt.ha.md.us: a P150 dual-booting Win95 & Linux.
The FreeBSD machine is temporarily named ha.ha.md.us.

>Description:

While running RDUMP from the FreeBSD machine over the local ethernet
to the BSD/OS machine's Exabyte-8200 tape drive, two different
(related?) problems are observed:

(1) The ethernet interface on the FreeBSD system "locks up".
Total ping loss; the kernel queue limit of 50 is exceeded, resulting in
"No more buffer space" errors.
ifconfig down followed by ifconfig up successfully restarted the interface.

(2) Very high round-trip times are observed between the two machines.
The BSD/OS and Win95 machines can ping each other just fine, but the
problem is observed both between the BSD/OS and FreeBSD systems and
between the Win95 and FreeBSD systems, pointing to trouble in the
FreeBSD system.

Here is the evidence I collected:

1 ms> ping ha
PING ha.ha.md.us (192.55.203.244): 56 data bytes
64 bytes from 192.55.203.244: icmp_seq=0 ttl=255 time=3.332 ms
64 bytes from 192.55.203.244: icmp_seq=1 ttl=255 time=60.674 ms
64 bytes from 192.55.203.244: icmp_seq=2 ttl=255 time=60.558 ms
64 bytes from 192.55.203.244: icmp_seq=3 ttl=255 time=5.72 ms
64 bytes from 192.55.203.244: icmp_seq=4 ttl=255 time=1007.02 ms
64 bytes from 192.55.203.244: icmp_seq=5 ttl=255 time=7.803 ms
64 bytes from 192.55.203.244: icmp_seq=6 ttl=255 time=1010.03 ms
64 bytes from 192.55.203.244: icmp_seq=7 ttl=255 time=11.579 ms
64 bytes from 192.55.203.244: icmp_seq=8 ttl=255 time=1005.51 ms
64 bytes from 192.55.203.244: icmp_seq=9 ttl=255 time=6.384 ms
64 bytes from 192.55.203.244: icmp_seq=10 ttl=255 time=1007.31 ms
64 bytes from 192.55.203.244: icmp_seq=11 ttl=255 time=8.188 ms
64 bytes from 192.55.203.244: icmp_seq=12 ttl=255 time=0.617 ms
64 bytes from 192.55.203.244: icmp_seq=13 ttl=255 time=10.679 ms
64 bytes from 192.55.203.244: icmp_seq=14 ttl=255 time=13.146 ms
64 bytes from 192.55.203.244: icmp_seq=15 ttl=255 time=0.629 ms
^C
--- ha.ha.md.us ping statistics ---
16 packets transmitted, 16 packets received, 0% packet loss
round-trip min/avg/max = 0.617/263.698/1010.03 ms

Note the high variation of round trip
times. Packets are getting stuck for a full second, and then
kicked loose somehow.

You can see the consequences of this problem on the data flow of the RDUMP.
This is from the point of view of the receiving (BSD/OS) system:

4 ms> netstat -i -I ef0 1
     input    (ef0)    output             input   (Total)   output
 packets errs  packets errs colls   packets errs  packets errs colls
  251315    3   310068    0 39103   5504076   50  5794021    0 39103
      43    0       23    0    35        43    0       23    0    35
     104    0       55    0    12       104    0       55    0    12
       0    0        1    0    26         2    0        2    0    26
     168    0       86    0    10       171    0       89    0    10
       0    0        1    0    45         1    0        3    0    45
      72    0       38    0    11        76    0       42    0    11
       6    0        3    0    12        12    0       10    0    12
      18    0       10    0     8        26    0       17    0     8
     103    0       55    0    14       109    0       58    0    14
       6    0        3    0    30         9    0        9    0    30
      22    0       12    0     8        22    0       12    0     8
     117    0       60    0    12       119    0       63    0    12

And here is the view from the sending (FreeBSD) system:

2 ha ENC> netstat -i -I ep0 1
            input    (ep0)           output
 packets errs    bytes  packets errs    bytes colls
      12    0      721       21    0    34311     0
       7    0      421       12    0    25125     0
       0    0        0        0    0     1652     0
      47    0     2822       79    0    94018     0
       5    0      301        8    0    19069     0
       3    0      180       18    0    13764     0
      27    0     1621       33    0    47835     0
      14    0      841       25    0    34209     0
       5    0      301        8    0    19069     0
      19    0     1182       30    0    27624     0
      52    0     3222       85    0   125010     0
      27    0     1718       42    0    28784     0
      75    0     4631      125    0   170565     0
       9    0      624       15    0     1358     0
     112    0     6882      207    0   273682     0
      55    0     3388       96    0   136218     0
      75    0     4520      147    0   204692     0
      31    0     1946       44    0    68778     0
      14    0      966        9    0      912     0
      39    0     2385       96    0   103189     0

I own a half dozen of these 3C509 cards, and they are rock solid and
fast performers on all my other systems. I'll do some hardware swapping
and other experimenting tomorrow, but this looks like a driver bug.

>How-To-Repeat:

Run RDUMP out a 3C509 card, then run some pings.
I'll try to reproduce using TTCP as well.

>Fix:

>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message