Date:      Thu, 23 Dec 1999 22:32:43 -0500 (EST)
From:      Bill Paul <wpaul@skynet.ctr.columbia.edu>
To:        dillon@apollo.backplane.com (Matthew Dillon)
Cc:        julian@whistle.com, scottm@cs.ucla.edu, jlemon@americantv.com, brad@shub-internet.org, jabley@patho.gen.nz, phk@critter.freebsd.dk, wollman@khavrinen.lcs.mit.edu, current@freebsd.org
Subject:   Re: Woa!  May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)
Message-ID:  <199912240332.WAA12081@skynet.ctr.columbia.edu>
In-Reply-To: <199912232103.NAA07664@apollo.backplane.com> from "Matthew Dillon" at Dec 23, 99 01:03:30 pm

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:

>     Heh heh.  Sorry about this, I believe I have further information on
>     another older problem.  Bill, remember those ethernet lockups I was 
>     having with the 'xl' driver all those months ago that we could never
>     track down?

And remember how I kept telling you that I could never duplicate the
problem here?

>     Well, they happen with the 'dc' driver too.  But this time I'm not getting
>     a complete lockup.  The network actually continues to work well enough,
>     well, just barely well enough, that I can still use it.  slowly.
> 
>     It appears that the 'dc' driver continues to take receive interrupts
>     (see the systat -vm snapshot at the end), but winds up not processing 
>     any of the packets.  Except when 64 packets accumulate then suddenly all
>     64 get processed all at once!  Then nothing again until the next 64
>     accumulate.

Uh. That's... strange. First of all, you haven't said if this is the
same machine that experienced the problems with the xl driver. Second,
the number 64 sticks out in this case. If you look at if_dc.c (uh...
you did actually look at the code, right?), you'll see that dc_encap()
will only ask for a "TX done" interrupt every 64 packets. Why? Well,
reclaiming transmit buffers is a fairly unimportant task and I wanted to 
cut down on the number of interrupts that were generated, and when the
tulip reaches the last descriptor in a transmit chain, it's supposed
to generate a "no more buffers in TX ring" interrupt, which will also
trigger a TX buffer reclamation (i.e. dc_txeof() will be called for
either interrupt).
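
To make the every-64-packets behavior concrete, here is a toy user-space model of it. This is only a sketch and not the code in if_dc.c: the toy_* names, the struct layout and the TOY_TX_INTR_INTERVAL macro are all made up for illustration. It shows an encap routine that asks for a "TX done" interrupt on every 64th packet, and a txeof-style routine that reclaims whatever has been queued since the last reclamation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: names and layout are hypothetical, not taken from if_dc.c. */
#define TOY_TX_INTR_INTERVAL 64   /* request a "TX done" interrupt every 64th packet */

struct toy_tx_ring {
    unsigned queued;      /* packets handed to the chip so far */
    unsigned reclaimed;   /* packets whose buffers have been reclaimed */
    unsigned interrupts;  /* "TX done" interrupts requested so far */
};

/* Queue one packet. Returns 1 if this packet requests a "TX done" interrupt. */
static int toy_encap(struct toy_tx_ring *r)
{
    assert(r != NULL);
    r->queued++;
    if (r->queued % TOY_TX_INTR_INTERVAL == 0) {
        r->interrupts++;
        return 1;
    }
    return 0;
}

/* Reclaim every buffer queued since the last call. Returns how many were reclaimed. */
static unsigned toy_txeof(struct toy_tx_ring *r)
{
    unsigned n;

    assert(r != NULL);
    n = r->queued - r->reclaimed;
    r->reclaimed = r->queued;
    return n;
}
```

In this model, queuing 200 packets and calling toy_txeof() whenever toy_encap() returns 1 reclaims buffers at packets 64, 128 and 192, and leaves 8 buffers unreclaimed until some other event triggers reclamation.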

This behavior is controlled by the DC_TX_USE_TX_INTR flag, which
is set for the PNIC II chip. I also use the DC_TX_POLL flag, which
means that the chip is programmed to poll the TX ring and start
transmission itself rather than having the driver write to the
TX DMA start register. This means no register accesses on transmit,
which is always nice. You can ask for a "TX done" interrupt to be
scheduled for each transmitted packet by using the DC_TX_INTR_ALWAYS
flag, which is currently only used for the PNIC I (82c168/82c169)
because it blows goats.

Anyway. I *never* see this behavior on any of my test machines. I
have a LinkSys LNE100TX V2.0 card with the 82c115 chip, as well
as a couple of Macronix cards, a Davicom card, several Intel/DEC
21143 cards, ASIX cards and ADMtek cards, and PNIC I-based LinkSys
cards. None of them exhibit this behavior when I test them.

>     This netstat is on the machine with the 'dc' driver that locked up, when
>     I ping it from another machine.  The 'dc' driver still works--- barely.
>     It doesn't process any packets until 64 have been received, then it
>     processes them all at once.  The transmit side appears to work fine and
>     the receive side appears to get interrupts but does not appear to process
>     incoming packets.  Yet, obviously, the packets are being accumulated 
>     somewhere because I don't have any packet loss, just incredibly long and
>     odd ping times.

No no no. You can't say "the receive side appears to get interrupts."
That's speculation. You can stare at the machine and theorize about
what appears to be happening all you want: it won't do a damn bit of good 
until you actually test your theory. You know that an "RX done" interrupt
has been delivered if dc_rxeof() is called. So do something to verify
that it's being called: stick a printf() in dc_rxeof() that tells you
when it trips. Then duplicate the behavior and watch what happens.
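
The kind of instrumentation meant here can be sketched in user space as follows. This is not the real dc_rxeof(), which has a different signature and body; toy_rxeof(), its argument and the call counter are hypothetical stand-ins. The only point is the trace line at the top of the handler, which records that the handler ran and how much work it found.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for an RX handler; the real dc_rxeof() differs. */
static unsigned long toy_rxeof_calls;   /* how many times the handler has run */

/* Pretend to process `pending` received packets. Returns the number processed. */
static int toy_rxeof(int pending)
{
    assert(pending >= 0);
    toy_rxeof_calls++;
    /* The trace line: shows that the handler was entered and what it found. */
    printf("toy_rxeof: call %lu, %d packet(s) pending\n",
        toy_rxeof_calls, pending);
    return pending;   /* in this toy model every pending packet is processed */
}
```

If a trace like this fires for each incoming ping, the handler is being called and the problem lies further along. If it fires only in bursts, that would suggest the handler is not being invoked per packet in the first place.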

>     This occurs when I am running netscape on the same box over a remote X
>     connection (read:  Lots of packets going over the network plus lots of
>     local PCI activity talking to the graphics card).  Same problem occurs 
>     with different graphics adapters but I believe this same problem also
>     occurred with the 'xl' driver on the card I had in before I put this
>     card in.

Yes, but the one vital fact you keep leaving out is: does this always
happen with the same machine? If so, then describe this machine. What
PCI chipset does it have? And more to the point, what cards have you
used in this machine that *didn't* exhibit this problem?

No wait, let me guess: Intel fxp. Right? Grrrr.

I'm very puzzled by the fact that nobody else has *ever* reported
any problem even remotely like this. Of course, with the level of
feedback I get, it's possible that 50 people are having the same
problem and simply never bothered to tell me.

>     And watch what happens after I managed to 'ifconfig dc0 media auto',
>     it goes back to normal... suddenly everything is working properly
>     again.

And what happens if instead of auto, you use "ifconfig dc0 media 100baseTX
mediaopt full-duplex" to lock the media setting down? Or what happens if
you shut down and restart the X server?

-Bill

-- 
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================

