Date: Thu, 23 Dec 1999 22:32:43 -0500 (EST)
From: Bill Paul <wpaul@skynet.ctr.columbia.edu>
To: dillon@apollo.backplane.com (Matthew Dillon)
Cc: julian@whistle.com, scottm@cs.ucla.edu, jlemon@americantv.com,
    brad@shub-internet.org, jabley@patho.gen.nz, phk@critter.freebsd.dk,
    wollman@khavrinen.lcs.mit.edu, current@freebsd.org
Subject: Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)
Message-ID: <199912240332.WAA12081@skynet.ctr.columbia.edu>
In-Reply-To: <199912232103.NAA07664@apollo.backplane.com> from "Matthew Dillon" at Dec 23, 99 01:03:30 pm

Of all the gin joints in all the towns in all the world, Matthew Dillon
had to walk into mine and say:

> Heh heh. Sorry about this, I believe I have further information on
> another older problem. Bill, remember those ethernet lockups I was
> having with the 'xl' driver all those months ago that we could never
> track down?

And remember how I kept telling you that I could never duplicate the
problem here?

> Well, they happen with the 'dc' driver too. But this time I'm not getting
> a complete lockup. The network actually continues to work well enough,
> well, just barely well enough, that I can still use it. slowly.
>
> It appears that the 'dc' driver continues to take receive interrupts
> (see the systat -vm snapshot at the end), but winds up not processing
> any of the packets. Except when 64 packets accumulate then suddenly all
> 64 get processed all at once! Then nothing again until the next 64
> accumulate.

Uh. That's... strange. First of all, you haven't said if this is the
same machine that experienced the problems with the xl driver.

Second, the number 64 sticks out in this case. If you look at if_dc.c
(uh... you did actually look at the code, right?), you'll see that
dc_encap() will only ask for a "TX done" interrupt every 64 packets.
Why? Well, reclaiming transmit buffers is a fairly unimportant task and
I wanted to cut down on the number of interrupts that were generated,
and when the tulip reaches the last descriptor in a transmit chain, it's
supposed to generate a "no more buffers in TX ring" interrupt, which
will also trigger a TX buffer reclamation (i.e. dc_txeof() will be
called for either interrupt).

This behavior is controlled by the DC_TX_USE_TX_INTR flag, which is set
for the PNIC II chip. I also use the DC_TX_POLL flag, which means that
the chip is programmed to poll the TX ring and start transmission itself
rather than having the driver write to the TX DMA start register. This
means no register accesses on transmit, which is always nice.

You can ask for a "TX done" interrupt to be scheduled for each
transmitted packet by using the DC_TX_INTR_ALWAYS flag, which is
currently only used for the PNIC I (82c168/82c169) because it blows
goats.

Anyway. I *never* see this behavior on any of my test machines. I have a
LinkSys LNE100TX V2.0 card with the 82c115 chip, as well as a couple of
Macronix cards, a Davicom card, several Intel/DEC 21143 cards, ASIX
cards and ADMtek cards, and PNIC I-based LinkSys cards. None of them
exhibit this behavior when I test them.

> This netstat is on the machine with the 'dc' driver that locked up, when
> I ping it from another machine. The 'dc' driver still works--- barely.
> It doesn't process any packets until 64 have been received, then it
> processes them all at once. The transmit side appears to work fine and
> the receive side appears to get interrupts but does not appear to process
> incoming packets. Yet, obviously, the packets are being accumulated
> somewhere because I don't have any packet loss, just incredibly long and
> odd ping times.

No no no. You can't say "the receive side appears to get interrupts."
That's speculation. You can stare at the machine and theorize about what
appears to be happening all you want: it won't do a damn bit of good
until you actually test your theory. You know that an "RX done"
interrupt has been delivered if dc_rxeof() is called. So do something to
verify that it's being called: stick a printf() in dc_rxeof() that tells
you when it trips. Then duplicate the behavior and watch what happens.

> This occurs when I am running netscape on the same box over a remote X
> connection (read: Lots of packets going over the network plus lots of
> local PCI activity talking to the graphics card). Same problem occurs
> with different graphics adapters but I believe this same problem also
> occurred with the 'xl' driver on the card I had in before I put this
> card in.

Yes, but the one vital fact you keep leaving out is: does this always
happen with the same machine? If so, then describe this machine. What
PCI chipset does it have? And more to the point, what cards have you
used in this machine that *didn't* exhibit this problem? No wait, let me
guess: Intel fxp. Right? Grrrr.

I'm very puzzled by the fact that nobody else has *ever* reported any
problem even remotely like this. Of course, with the level of feedback I
get, it's possible that 50 people are having the same problem and simply
never bothered to tell me.

> And watch what happens after I managed to 'ifconfig dc0 media auto',
> it goes back to normal... suddenly everything is working properly
> again.

And what happens if instead of auto, you use "ifconfig dc0 media
100baseTX mediaopt full-duplex" to lock the media setting down? Or what
happens if you shut down and restart the X server?

-Bill

--
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message