Date: Wed, 20 Feb 2008 11:35:34 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
To: Unga <unga888@yahoo.com>
Cc: mux@FreeBSD.org, freebsd-current@freebsd.org
Subject: Re: Frequent network access freeze (in 7.0)
Message-ID: <20080220112911.W44565@fledge.watson.org>
In-Reply-To: <235549.36535.qm@web57008.mail.re3.yahoo.com>
References: <235549.36535.qm@web57008.mail.re3.yahoo.com>
On Wed, 20 Feb 2008, Unga wrote:

> I'm running 7.0-PRERELEASE (RC2, dated 15/02/2008), compiled from sources on
> i386 machine (512MB RAM, 3.0GHz, tx0: <SMC EtherPower II 10/100>).
>
> Network access freezes very frequently. Cannot ping to any ip address. The
> only way to get networking working again is reboot.
>
> I'm having this problem on 7.0 ever since I tried it from BETA4. I have
> reported also to this list before but sadly nobody was interested on it.
>
> If somebody is interested to look into this problem, I could furnish with
> more detail and participate in testing.

This sort of problem frequently turns out to be a bug in a device driver or a
problem with interrupt probing/configuration, so my first guess would be a
problem with the if_tx driver.

The usual starting diagnostic when ping fails is to use tcpdump to determine
whether it's receive or transmit that is failing (or both). Quiet the network
between two endpoints as much as you can, so that unrelated traffic doesn't
clutter the dumps, and dump arp and icmp at both endpoints. Now try to ping
from each endpoint to the other. One potential source of confusion is that
ping requires ARP to work, and ARP can be a slightly confusing protocol: it
usually resolves actively (query, response), but sometimes it receives passive
updates or extends existing entries.

What you want to look for is a packet sent by one side that isn't received by
the other. You might find, for example, that your host receives packets fine,
but the packets it transmits are never received. This would be indicative of a
driver bug in which it fails to properly handle (for example) transmit queues
filling, and might only trigger under very high load. Or, you might find that
your host never receives anything the other side transmits, but can send fine.
This might be indicative of a driver bug involving the receive code, or a
problem with how interrupts are being handled more generally.
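The comparison described above can be sketched in Python. This is only a
minimal sketch under stated assumptions: the dumps at each endpoint are assumed
to have been taken with something like "tcpdump -n -i tx0 arp or icmp" (tx0
being the interface named in the report), and the ICMP echo sequence numbers
visible in each dump are assumed to have been collected, by hand or by script,
into plain sets. The function name classify_failure and its parameters are
hypothetical and exist only for illustration.

```python
def classify_failure(sent_by_host, seen_at_peer, sent_by_peer, seen_at_host):
    """Classify a ping failure from the suspect host's point of view.

    Each argument is a collection of packet identifiers (for example ICMP
    echo sequence numbers) read off the tcpdump output at one endpoint:
      sent_by_host  - packets the suspect host's dump shows leaving it
      seen_at_peer  - the host's packets that appear in the peer's dump
      sent_by_peer  - packets the peer's dump shows leaving the peer
      seen_at_host  - the peer's packets that appear in the host's dump
    """
    # Packets one side put on the wire that the other side never saw.
    lost_on_transmit = set(sent_by_host) - set(seen_at_peer)
    lost_on_receive = set(sent_by_peer) - set(seen_at_host)

    if lost_on_transmit and lost_on_receive:
        return "both"
    if lost_on_transmit:
        return "transmit"
    if lost_on_receive:
        return "receive"
    return "none"


if __name__ == "__main__":
    # The host's dump shows three echo requests going out, the peer saw none
    # of them, while everything the peer sent did arrive at the host.
    print(classify_failure({0, 1, 2}, set(), {0, 1, 2}, {0, 1, 2}))  # transmit
```

A "transmit" result points towards the transmit-queue case above, and a
"receive" result towards the receive path or interrupt handling. ARP packets
can be compared the same way if they are given identifiers of their own.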
It looks like the last non-routine maintenance to the driver was done by
Maxime in about 2003; the more recent changes have all been updates to
newbus/busdma infrastructure, ifnet changes, locking changes, etc. I've CC'd
him as it sounds like he may have hardware...

My advice would be to do the above tests and see if you can narrow down
whether it's transmit, receive, or both failing.

Robert N M Watson
Computer Laboratory
University of Cambridge