Date:      Fri, 24 Dec 1999 02:56:17 -0500 (EST)
From:      Bill Paul <wpaul@skynet.ctr.columbia.edu>
To:        dillon@apollo.backplane.com (Matthew Dillon)
Cc:        current@freebsd.org
Subject:   Re: Woa!  May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)
Message-ID:  <199912240756.CAA12607@skynet.ctr.columbia.edu>
In-Reply-To: <199912240716.XAA10120@apollo.backplane.com> from "Matthew Dillon" at Dec 23, 99 11:16:46 pm

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:
 
>     I'm trying to narrow down the area enough that I can mess with the 
>     driver myself and hopefully locate the problem, since it can't be
>     reproduced easily.   I was hoping the magic number 64 could be
>     related to something - and you have apparently been able to do that,
>     which gives me a place to start anyway.   netstat shows the trigger
>     to be the reception of 64 packets rather than the transmission, though.
>     Is there anything at all about the number 64 that could be related to
>     the receiver?

64 is also the number of descriptors/buffers in the RX ring. When you
fill up the RX ring, the chip is supposed to generate a 'no RX buffer
available' interrupt. The driver will check the RX ring for packets
when either an 'RX OK' or 'no RX buffers available' interrupt is
delivered, but you should be getting an 'RX OK' interrupt on every
received packet.
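
To make the ring behaviour concrete, here is a minimal user-space sketch
of the receive path described above. It is a simulation, not the driver
source: the sim_* names, the status bit values and the rx_ok_works flag
are all made up for illustration, and only the ring size of 64 comes
from the message. Passing rx_ok_works = 0 models the reported symptom
(no 'RX OK' interrupts) without saying anything about its cause.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_RING_CNT  64      /* ring size given in the message */
#define ISR_RX_OK    0x01u   /* hypothetical bit: frame received */
#define ISR_RX_NOBUF 0x02u   /* hypothetical bit: no RX buffer available */

struct rx_desc {
    int owned_by_chip;       /* 1: free for the chip, 0: holds a frame */
};

struct sim_nic {
    struct rx_desc ring[RX_RING_CNT];
    size_t chip_idx;         /* next descriptor the chip fills */
    size_t host_idx;         /* next descriptor the host examines */
    uint32_t isr;            /* pending interrupt status bits */
    unsigned long ipackets;  /* stands in for ifp->if_ipackets */
};

static void sim_init(struct sim_nic *n)
{
    size_t i;

    for (i = 0; i < RX_RING_CNT; i++)
        n->ring[i].owned_by_chip = 1;
    n->chip_idx = n->host_idx = 0;
    n->isr = 0;
    n->ipackets = 0;
}

/* Chip side: store one frame. Returns 0 if the frame had to be dropped. */
static int sim_chip_receive(struct sim_nic *n, int rx_ok_works)
{
    struct rx_desc *d = &n->ring[n->chip_idx];

    if (!d->owned_by_chip) {              /* ring already full */
        n->isr |= ISR_RX_NOBUF;
        return 0;
    }
    d->owned_by_chip = 0;
    n->chip_idx = (n->chip_idx + 1) % RX_RING_CNT;
    if (rx_ok_works)
        n->isr |= ISR_RX_OK;
    if (!n->ring[n->chip_idx].owned_by_chip)  /* last free buffer used */
        n->isr |= ISR_RX_NOBUF;
    return 1;
}

/* Host side: count each completed frame and return its descriptor. */
static void sim_rx_drain(struct sim_nic *n)
{
    while (!n->ring[n->host_idx].owned_by_chip) {
        n->ring[n->host_idx].owned_by_chip = 1;
        n->host_idx = (n->host_idx + 1) % RX_RING_CNT;
        n->ipackets++;
    }
}

/* Interrupt handler: either RX condition triggers a ring scan. */
static void sim_intr(struct sim_nic *n)
{
    uint32_t status = n->isr;

    n->isr = 0;                           /* acknowledge */
    if (status & (ISR_RX_OK | ISR_RX_NOBUF))
        sim_rx_drain(n);
}
```

With rx_ok_works = 1 the counter moves on every frame. With
rx_ok_works = 0 it stays at zero until the 64th frame uses the last
free buffer, at which point the 'no RX buffer' path drains the whole
ring in one go.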

The datasheet for the PNIC II is at:

http://www.freebsd.org/~wpaul/Macronix/PNIC_II.PDF

This is the datasheet LinkSys gave me when they first came out with
the LNE100TX v2.0 board. It's very similar to the Macronix 98715A
datasheet.
 
>     I'm pretty sure that the box was getting receive interrupts because
>     every time I sent a packet to it from the outside systat -vm showed
>     a PCI interrupt for the network device.  However 'netstat -in 1' did
>     not show the statistics for the received packets until 64 had 
>     accumulated.  It could be that the statistics are not being accumulated
>     on a per-reception basis and that the receive packets are actually
>     getting through, and that it's the transmit side which is broken.  I don't
>     know the code well enough yet to make the determination.

The dc_rxeof() routine is what increments ifp->if_ipackets, so if
netstat -in doesn't show any change until after 64 packets have arrived,
then it isn't getting the 'RX OK' interrupts. But I promise you that I
have never seen a condition where 'RX OK' interrupts failed to arrive
even though 'no RX buffer available' interrupts did. The interrupt handler
re-enables interrupts just before it exits, so there should never be a
case where interrupts are turned off and never turned back on again.
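
The shape of that handler can be sketched as below. This is a
simplified user-space model under stated assumptions, not the real
handler: the sketch_* names, the register fields and the bit values are
hypothetical. The two points taken from the text are that the RX
routine is where the input packet counter is bumped, and that
interrupts are re-enabled on the way out of the handler.

```c
#include <stdint.h>

#define SK_RX_OK      0x01u  /* hypothetical status bit */
#define SK_RX_NOBUF   0x02u  /* hypothetical status bit */
#define SK_INTRS_ON   (SK_RX_OK | SK_RX_NOBUF)

struct sketch_softc {
    uint32_t imr;            /* simulated interrupt mask register */
    uint32_t isr;            /* simulated interrupt status register */
    unsigned pending;        /* frames waiting in the RX ring */
    unsigned long ipackets;  /* stands in for ifp->if_ipackets */
};

/* RX completion: the only place the input packet counter changes. */
static void sketch_rxeof(struct sketch_softc *sc)
{
    while (sc->pending > 0) {
        sc->pending--;
        sc->ipackets++;
    }
}

static void sketch_intr(struct sketch_softc *sc)
{
    uint32_t status;

    sc->imr = 0;                 /* mask interrupts while we work */
    status = sc->isr;
    sc->isr = 0;                 /* acknowledge what we saw */

    if (status & (SK_RX_OK | SK_RX_NOBUF))
        sketch_rxeof(sc);

    sc->imr = SK_INTRS_ON;       /* re-enable just before exiting */
}
```

Because the unmask is the last statement and there is no early return,
every pass through the handler leaves interrupts enabled, whether or
not there was any RX work to do.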

-Bill

>     I'll try that next time the problem occurs but I doubt it will have 
>     any effect.  Changing the duplex mode does not appear to reset the port 
>     whereas forcing the media to 'auto' does appear to reset the port.  This 
>     is actually another problem (switches don't appear to pick up the duplex
>     change if the port isn't reset), but not one I'm concerned with.

In general what you want to do is a) switch modes and b) reset the link
so that the guy on the other side re-senses the media. However, both sides
can only agree on the duplex setting as the result of an NWAY autoneg
session: if you manually select 100baseTX full duplex, the link partner
can only sense the link speed (100Mbps as opposed to 10) but not the
duplex mode. The rule is that if you don't have NWAY but can sense the
link speed, you default to half duplex and let the operator manually
fix things if necessary (that's what operators are for). Of course this
only works if the switch has a management interface that allows you
to configure things like that. Some don't, which can make your life tough.
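
The fallback rule above fits in a few lines of C. This is only a
sketch of the decision, with hypothetical names (resolve_link,
struct link_state); real PHY code reads these inputs from hardware
registers rather than taking them as arguments.

```c
enum duplex { DUPLEX_HALF, DUPLEX_FULL };

struct link_state {
    int speed_mbps;
    enum duplex duplex;
};

/*
 * nway_completed:    nonzero if an NWAY autoneg session finished
 * sensed_speed_mbps: link speed detected from the wire (10 or 100)
 * nway_duplex:       duplex agreed by NWAY, ignored without NWAY
 */
static struct link_state
resolve_link(int nway_completed, int sensed_speed_mbps,
    enum duplex nway_duplex)
{
    struct link_state ls;

    ls.speed_mbps = sensed_speed_mbps;
    /* Without NWAY the duplex cannot be sensed, so assume half. */
    ls.duplex = nway_completed ? nway_duplex : DUPLEX_HALF;
    return ls;
}
```

For example, resolve_link(0, 100, DUPLEX_FULL) yields 100Mbps half
duplex: that is what a partner sees when the other end is forced to
100baseTX full duplex, and it is the mismatch the operator then has to
fix by hand.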

I'm pretty sure the speed and duplex settings don't really have anything
to do with this particular problem though. I was just wondering why
renegotiating the media would have any effect. It's possible that
dc_init() may be called in there somewhere, which could be resetting
all of the driver state.

-Bill

-- 
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================

