Date:      Wed, 24 Jun 98 16:01:09 EDT
From:      mike@ms.ha.md.us
To:        FreeBSD-gnats-submit@FreeBSD.ORG, mike@ms.ha.md.us
Subject:   i386/7056: 3Com3C509: lockups & high packet latency
Message-ID:  <9806242201.aa01253@ms.ms.ha.md.us>

>Number:         7056
>Category:       i386
>Synopsis:       3Com 3C509 locks up, or has >1000ms rtt under 100pps load of RDUMP.
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jun 24 13:10:01 PDT 1998
>Last-Modified:
>Originator:     Mike Muuss <mike>
>Organization:
Home
>Release:        FreeBSD 3.0-980222-SNAP i386
>Environment:

AMD 486DX4/100, Asus SP3G PCI motherboard, 64 MBytes of RAM.

1 3C5x9 board(s) on ISA found at 0x300
ep0 at 0x300-0x30f irq 11 on isa
ep0: aui/utp/bnc[*BNC*] address 00:60:08:27:e4:95

The only other two hosts on this ethernet are ms.ha.md.us, a P233 running
BSD/OS 2.1, and nt.ha.md.us, a P150 dual-booting Win95 & Linux.
The FreeBSD machine is temporarily named ha.ha.md.us.

>Description:

While running RDUMP from the FreeBSD machine over the local ethernet to
the BSD/OS machine's Exabyte-8200 tape drive, two different (related?)
problems are observed:

(1)  The ethernet interface on the FreeBSD system "locks up".  Pings are
lost entirely, and the kernel queue limit of 50 is exceeded, resulting in
"No more buffer space" errors.  Running ifconfig down followed by
ifconfig up successfully restarted the interface.

(2)  Very high round-trip times are observed to and from the FreeBSD
machine.  The BSD/OS and Win95 machines can ping each other just fine,
but the problem appears both between the BSD/OS and FreeBSD systems and
between the Win95 and FreeBSD systems, pointing to trouble in the
FreeBSD system.

Here is the evidence I collected:

1 ms> ping ha
PING ha.ha.md.us (192.55.203.244): 56 data bytes
64 bytes from 192.55.203.244: icmp_seq=0 ttl=255 time=3.332 ms
64 bytes from 192.55.203.244: icmp_seq=1 ttl=255 time=60.674 ms
64 bytes from 192.55.203.244: icmp_seq=2 ttl=255 time=60.558 ms
64 bytes from 192.55.203.244: icmp_seq=3 ttl=255 time=5.72 ms
64 bytes from 192.55.203.244: icmp_seq=4 ttl=255 time=1007.02 ms
64 bytes from 192.55.203.244: icmp_seq=5 ttl=255 time=7.803 ms
64 bytes from 192.55.203.244: icmp_seq=6 ttl=255 time=1010.03 ms
64 bytes from 192.55.203.244: icmp_seq=7 ttl=255 time=11.579 ms
64 bytes from 192.55.203.244: icmp_seq=8 ttl=255 time=1005.51 ms
64 bytes from 192.55.203.244: icmp_seq=9 ttl=255 time=6.384 ms
64 bytes from 192.55.203.244: icmp_seq=10 ttl=255 time=1007.31 ms
64 bytes from 192.55.203.244: icmp_seq=11 ttl=255 time=8.188 ms
64 bytes from 192.55.203.244: icmp_seq=12 ttl=255 time=0.617 ms
64 bytes from 192.55.203.244: icmp_seq=13 ttl=255 time=10.679 ms
64 bytes from 192.55.203.244: icmp_seq=14 ttl=255 time=13.146 ms
64 bytes from 192.55.203.244: icmp_seq=15 ttl=255 time=0.629 ms
^C
--- ha.ha.md.us ping statistics ---
16 packets transmitted, 16 packets received, 0% packet loss
round-trip min/avg/max = 0.617/263.698/1010.03 ms

Note the high variation of round trip times.  Packets are getting stuck
for a full second, and then kicked loose somehow.
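
The pattern in the ping output above can be summarized with a short script.
This is only a sketch for illustration, not part of the original
measurements: the function name parse_rtts and the 1000 ms "stalled"
threshold are assumptions chosen to match the roughly one-second delays
seen in the log.

```python
import re
import statistics

# Reply lines copied from the ping output in this report.
PING_OUTPUT = """\
64 bytes from 192.55.203.244: icmp_seq=0 ttl=255 time=3.332 ms
64 bytes from 192.55.203.244: icmp_seq=1 ttl=255 time=60.674 ms
64 bytes from 192.55.203.244: icmp_seq=2 ttl=255 time=60.558 ms
64 bytes from 192.55.203.244: icmp_seq=3 ttl=255 time=5.72 ms
64 bytes from 192.55.203.244: icmp_seq=4 ttl=255 time=1007.02 ms
64 bytes from 192.55.203.244: icmp_seq=5 ttl=255 time=7.803 ms
64 bytes from 192.55.203.244: icmp_seq=6 ttl=255 time=1010.03 ms
64 bytes from 192.55.203.244: icmp_seq=7 ttl=255 time=11.579 ms
64 bytes from 192.55.203.244: icmp_seq=8 ttl=255 time=1005.51 ms
64 bytes from 192.55.203.244: icmp_seq=9 ttl=255 time=6.384 ms
64 bytes from 192.55.203.244: icmp_seq=10 ttl=255 time=1007.31 ms
64 bytes from 192.55.203.244: icmp_seq=11 ttl=255 time=8.188 ms
64 bytes from 192.55.203.244: icmp_seq=12 ttl=255 time=0.617 ms
64 bytes from 192.55.203.244: icmp_seq=13 ttl=255 time=10.679 ms
64 bytes from 192.55.203.244: icmp_seq=14 ttl=255 time=13.146 ms
64 bytes from 192.55.203.244: icmp_seq=15 ttl=255 time=0.629 ms
"""

# Assumed threshold: replies delayed by about a full second count as stalled.
STALL_THRESHOLD_MS = 1000.0


def parse_rtts(text):
    """Extract the round-trip times (in ms) from ping reply lines."""
    return [float(value) for value in re.findall(r"time=([\d.]+) ms", text)]


rtts = parse_rtts(PING_OUTPUT)
stalled = [rtt for rtt in rtts if rtt >= STALL_THRESHOLD_MS]

print(f"replies: {len(rtts)}")
print(f"min/avg/max: {min(rtts)}/{statistics.mean(rtts):.3f}/{max(rtts)} ms")
print(f"median: {statistics.median(rtts):.3f} ms")
print(f"replies delayed >= {STALL_THRESHOLD_MS:.0f} ms: {len(stalled)}")
```

The median stays near 11 ms while the mean is pulled above 260 ms by four
replies that each took about one second, which is consistent with packets
being held and then released rather than with a uniformly slow link.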

You can see the consequences of this problem on the data flow of the
RDUMP.  This is from the point of view of the receiving (BSD/OS) system:

4 ms> netstat -i -I ef0 1
   input    (ef0)     output            input   (Total)    output
  packets  errs   packets  errs colls    packets  errs   packets  errs colls
  251315     3   310068     0 39103   5504076    50  5794021     0 39103
      43     0       23     0    35        43     0       23     0    35
     104     0       55     0    12       104     0       55     0    12
       0     0        1     0    26         2     0        2     0    26
     168     0       86     0    10       171     0       89     0    10
       0     0        1     0    45         1     0        3     0    45
      72     0       38     0    11        76     0       42     0    11
       6     0        3     0    12        12     0       10     0    12
      18     0       10     0     8        26     0       17     0     8
     103     0       55     0    14       109     0       58     0    14
       6     0        3     0    30         9     0        9     0    30
      22     0       12     0     8        22     0       12     0     8
     117     0       60     0    12       119     0       63     0    12

And here is the view from the sending (FreeBSD) system:

2 ha ENC> netstat -i -I ep0 1
            input          (ep0)           output
   packets  errs      bytes    packets  errs      bytes colls
        12     0        721         21     0      34311     0
         7     0        421         12     0      25125     0
         0     0          0          0     0       1652     0
        47     0       2822         79     0      94018     0
         5     0        301          8     0      19069     0
         3     0        180         18     0      13764     0
        27     0       1621         33     0      47835     0
        14     0        841         25     0      34209     0
         5     0        301          8     0      19069     0
        19     0       1182         30     0      27624     0
        52     0       3222         85     0     125010     0
        27     0       1718         42     0      28784     0
        75     0       4631        125     0     170565     0
         9     0        624         15     0       1358     0
       112     0       6882        207     0     273682     0
        55     0       3388         96     0     136218     0
        75     0       4520        147     0     204692     0
        31     0       1946         44     0      68778     0
        14     0        966          9     0        912     0
        39     0       2385         96     0     103189     0
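
To put the sender-side numbers above in perspective, the per-second output
byte counts can be converted to an average rate.  This is a rough sketch:
it assumes each row covers exactly one second (the interval given to
netstat) and compares against a nominal 10 Mbit/s ethernet; the names
used here are illustrative only.

```python
# Rows copied from the "netstat -i -I ep0 1" output on the FreeBSD machine.
# Columns: in-packets, in-errs, in-bytes, out-packets, out-errs, out-bytes, colls
NETSTAT_ROWS = """\
12 0 721 21 0 34311 0
7 0 421 12 0 25125 0
0 0 0 0 0 1652 0
47 0 2822 79 0 94018 0
5 0 301 8 0 19069 0
3 0 180 18 0 13764 0
27 0 1621 33 0 47835 0
14 0 841 25 0 34209 0
5 0 301 8 0 19069 0
19 0 1182 30 0 27624 0
52 0 3222 85 0 125010 0
27 0 1718 42 0 28784 0
75 0 4631 125 0 170565 0
9 0 624 15 0 1358 0
112 0 6882 207 0 273682 0
55 0 3388 96 0 136218 0
75 0 4520 147 0 204692 0
31 0 1946 44 0 68778 0
14 0 966 9 0 912 0
39 0 2385 96 0 103189 0
"""

OUT_BYTES_COLUMN = 5          # zero-based index of the output byte count
LINK_BITS_PER_SEC = 10_000_000  # assumption: nominal 10 Mbit/s ethernet

out_bytes = [int(line.split()[OUT_BYTES_COLUMN])
             for line in NETSTAT_ROWS.splitlines()]

total_bytes = sum(out_bytes)
avg_bytes_per_sec = total_bytes / len(out_bytes)   # one row per second (assumed)
utilization = avg_bytes_per_sec * 8 / LINK_BITS_PER_SEC

print(f"samples: {len(out_bytes)}")
print(f"output bytes/s min/avg/max: "
      f"{min(out_bytes)}/{avg_bytes_per_sec:.1f}/{max(out_bytes)}")
print(f"approximate link utilization: {utilization:.1%}")
```

Under those assumptions the sender averages roughly 70 kbytes per second,
a small fraction of the nominal link rate, with individual seconds ranging
from under 1 kbyte to over 270 kbytes.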

I own a half dozen of these 3C509 cards, and they are rock solid and
fast performers on all my other systems.  I'll do some hardware swapping
and other experimenting tomorrow, but this looks like a driver bug.

>How-To-Repeat:

Run RDUMP out a 3C509 card, then run some pings.  I'll try to reproduce
using TTCP as well.

>Fix:
>Audit-Trail:
>Unformatted:
