From owner-freebsd-current Thu Dec 23 23:52:25 1999
From: Bill Paul
Message-Id: <199912240756.CAA12607@skynet.ctr.columbia.edu>
Subject: Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)
To: dillon@apollo.backplane.com (Matthew Dillon)
Date: Fri, 24 Dec 1999 02:56:17 -0500 (EST)
Cc: current@freebsd.org
In-Reply-To: <199912240716.XAA10120@apollo.backplane.com> from "Matthew Dillon" at Dec 23, 99 11:16:46 pm

Of all the gin joints in all the towns in all the world,
Matthew Dillon had to walk into mine and say:

> I'm trying to narrow down the area enough that I can mess with the
> driver myself and hopefully locate the problem, since it can't be
> reproduced easily. I was hoping the magic number 64 could be
> related to something - and you have apparently been able to do that,
> which gives me a place to start anyway. netstat shows the trigger
> to be the reception of 64 packets rather than the transmission, though.
> Is there anything at all about the number 64 that could be related to
> the receiver?

64 is also the number of descriptors/buffers in the RX ring. When you
fill up the RX ring, the chip is supposed to generate a 'no RX buffer
available' interrupt. The driver will check the RX ring for packets when
either an 'RX OK' or 'no RX buffers available' interrupt is delivered,
but you should be getting an 'RX OK' interrupt on every received packet.
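
The interaction between the two interrupts and the 64-entry ring can be
sketched as a toy model. This is only an illustration under stated
assumptions, not code from the dc driver: the names sim_rxeof, sim_intr,
sim_receive, sim_run, ISR_RX_OK, ISR_RX_NOBUF and rx_ok_works are
hypothetical, the status bits do not reflect the PNIC II register layout,
and the 'no RX buffer available' interrupt is assumed to fire at the
moment the last free descriptor is consumed. Only the ring size of 64 and
the if_ipackets counter come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

#define RX_RING_SIZE 64       /* ring size given in the message */

/* Hypothetical status bits, not the real chip register layout. */
#define ISR_RX_OK    0x1u     /* a packet was received */
#define ISR_RX_NOBUF 0x2u     /* no free RX descriptor is left */

struct sim {
    int filled;               /* descriptors holding unprocessed packets */
    unsigned long ipackets;   /* stands in for ifp->if_ipackets */
    bool rx_ok_works;         /* false models lost 'RX OK' interrupts */
};

/* Drain every completed descriptor and count the packets. */
static void sim_rxeof(struct sim *s)
{
    assert(s->filled >= 0 && s->filled <= RX_RING_SIZE);
    s->ipackets += (unsigned long)s->filled;
    s->filled = 0;
}

/* Interrupt handler: either RX interrupt causes a ring scan. */
static void sim_intr(struct sim *s, unsigned status)
{
    if (status & (ISR_RX_OK | ISR_RX_NOBUF))
        sim_rxeof(s);
}

/* The chip receives one packet and raises whatever interrupts apply. */
static void sim_receive(struct sim *s)
{
    unsigned status = 0;

    s->filled++;
    if (s->rx_ok_works)
        status |= ISR_RX_OK;
    if (s->filled == RX_RING_SIZE)   /* assumed trigger point */
        status |= ISR_RX_NOBUF;
    if (status != 0)
        sim_intr(s, status);
}

/* Feed npackets into a fresh ring and report the packet counter. */
static unsigned long sim_run(bool rx_ok_works, int npackets)
{
    struct sim s = { 0, 0, rx_ok_works };

    for (int i = 0; i < npackets; i++)
        sim_receive(&s);
    return s.ipackets;
}
```

Under this model sim_run(true, 10) returns 10, whereas sim_run(false, 63)
returns 0 and sim_run(false, 64) returns 64. That is the pattern netstat
would show if only the 'no RX buffer available' interrupt were being
serviced.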

The datasheet for the PNIC II is at:

http://www.freebsd.org/~wpaul/Macronix/PNIC_II.PDF

This is the datasheet LinkSys gave me when they first came out with the
LNE100TX v2.0 board. It's very similar to the Macronix 98715A datasheet.

> I'm pretty sure that the box was getting receive interrupts because
> every time I sent a packet to it from the outside systat -vm showed
> a PCI interrupt for the network device. However 'netstat -in 1' did
> not show the statistics for the received packets until 64 had
> accumulated. It could be that the statistics are not being accumulated
> on a per-reception basis and that the receive packets are actually
> getting through, and that it's the transmit side which is broken. I don't
> know the code well enough yet to make the determination.

The dc_rxeof() routine is what increments ifp->if_ipackets, so if
netstat -in doesn't show any change until after 64 packets have arrived,
then it isn't getting the 'RX OK' interrupts. But I promise you that I
have never seen a condition where 'RX OK' interrupts failed to arrive
even though 'no RX buffer available' interrupts did. The interrupt
handler re-enables interrupts just before it exits, so there should
never be a case where interrupts are turned off and never turned back
on again.

-Bill

> I'll try that next time the problem occurs but I doubt it will have
> any effect. Changing the duplex mode does not appear to reset the port
> whereas forcing the media to 'auto' does appear to reset the port. This
> is actually another problem (switches don't appear to pick up the duplex
> change if the port isn't reset), but not one I'm concerned with.

In general what you want to do is a) switch modes and b) reset the link
so that the guy on the other side re-senses the media.
However, both sides can only agree on the duplex setting as the result of
an NWAY autoneg session: if you manually select 100baseTX full duplex,
the link partner can only sense the link speed (100Mbps as opposed to 10)
but not the duplex mode. The rule is that if you don't have NWAY but can
sense the link speed, you default to half duplex and let the operator
manually fix things if necessary (that's what operators are for). Of
course this only works if the switch has a management interface that
allows you to configure things like that. Some don't, which can make
your life tough.

I'm pretty sure the speed and duplex settings don't really have anything
to do with this particular problem though. I was just wondering why
renegotiating the media would have any effect. It's possible that
dc_init() may be called in there somewhere, which could be resetting
all of the driver state.

-Bill

-- 
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================