Date: Tue, 26 Feb 2008 01:57:53 -0800 (PST)
From: Unga
To: freebsd-current@freebsd.org
Cc: mux@FreeBSD.org, rwatson@FreeBSD.org
Subject: Re: Frequent network access freeze (in 7.0)
In-Reply-To: <20080220112911.W44565@fledge.watson.org>
Message-ID: <840364.15353.qm@web57003.mail.re3.yahoo.com>

--- Robert Watson wrote:

> On Wed, 20 Feb 2008, Unga wrote:
>
> > I'm running 7.0-PRERELEASE (RC2, dated 15/02/2008), compiled from sources on
> > i386 machine (512MB
> > RAM, 3.0GHz, tx0: EtherPower II 10/100).
> >
> > Network access freezes very frequently. Cannot ping to any ip address. The
> > only way to get networking working again is reboot.
> >
> > I'm having this problem on 7.0 ever since I tried it from BETA4. I have
> > reported also to this list before but sadly nobody was interested on it.
> >
> > If somebody is interested to look into this problem, I could furnish with
> > more detail and participate in testing.
>
> This sort of problem frequently turns out to be a bug in a device driver or a
> problem with interrupt probing/configuration, so my first guess would be a
> problem with the if_tx driver. The usual starting diagnostics when ping fails
> are to try to use tcpdump to determine whether it's receive or transmit
> failing (or both). Quiet the network between two endpoints as much as you can
> so you can avoid noise from making the dumps more complex, and dump arp and
> icmp at both endpoints. Now try to ping from each end point to the other.
> One potential source of confusion is that ping requires ARP to work, and ARP
> can be a slightly confusing protocol as it usually resolves actively (query,
> response) but sometimes it receives passive updates or extends existing
> entries.
>
> What you want to look for is a packet sent by one side that isn't received by
> the other. You might find, for example, that your host receives packets fine,
> but the packets it transmits are never received. This would be indicative of a
> driver bug in which it fails to properly handle (for example) transmit queues
> filling, and might only trigger under very high load. Or, you might find that
> your host never receives anything the other side transmits, but can send fine.
> This might be indicative of a driver bug involving the receive code, or a
> problem with how interrupts are being handled more generally.
>
> It looks like the last non-routine maintenance to the driver was done by
> Maxime in about 2003; the more recent changes have all been updates to
> newbus/busdma infrastructure, ifnet changes, locking changes, etc. I've CC'd
> him as it sounds like he may have hardware... My advice would be to do the
> above tests and see if you can narrow down whether it's transmit, receive, or
> both failing.
>

Here are the details when net access is working and when it is not:

When net access working
-----------------------

$ ifconfig
tx0: flags=108843 metric 0 mtu 1500
        options=8
        ether 00:e1:20:34:bb:36
        inet 192.168.1.20 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (10baseT/UTP)
        status: active
plip0: flags=108810 metric 0 mtu 1500
lo0: flags=8049 metric 0 mtu 16384
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000

$ netstat -r
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.1.1        UGS         0     1090    tx0
localhost          localhost          UH          0      186    lo0
192.168.1.0        link#1             UC          0        0    tx0
192.168.1.1        00:91:d2:4c:54:f8  UHLW        2        0    tx0    892

Internet6:
Destination        Gateway            Flags      Netif Expire
localhost          localhost          UHL         lo0
fe80::%lo0         fe80::1%lo0        U           lo0
fe80::1%lo0        link#3             UHL         lo0
ff01:3::           fe80::1%lo0        UC          lo0
ff02::%lo0         fe80::1%lo0        UC          lo0

When net access NOT working
---------------------------

$ ifconfig
tx0: flags=108843 metric 0 mtu 1500
        options=8
        ether 00:e1:20:34:bb:36
        inet 192.168.1.20 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (10baseT/UTP)
        status: active
plip0: flags=108810 metric 0 mtu 1500
lo0: flags=8049 metric 0 mtu 16384
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000

$ netstat -r
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.1.1        UGS         0     3338    tx0
localhost          localhost          UH          0      204    lo0
192.168.1.0        link#1             UC          0        0    tx0
192.168.1.1        00:91:d2:4c:54:f8  UHLW        2       28    tx0    997
192.168.1.2        link#1             UHLW        1        1    tx0

Internet6:
Destination        Gateway            Flags      Netif Expire
localhost          localhost          UHL         lo0
fe80::%lo0         fe80::1%lo0        U           lo0
fe80::1%lo0        link#3             UHL         lo0
ff01:3::           fe80::1%lo0        UC          lo0
ff02::%lo0         fe80::1%lo0        UC          lo0

tcpdump -i tx0 -v

NOTE: While pinging 192.168.1.1, tcpdump shows no output.

ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
^C
--- 192.168.1.1 ping statistics ---
58 packets transmitted, 0 packets received, 100.0% packet loss

/var/log/messages:
Feb 26 15:26:14 blacktower kernel: tx0: ERROR! Can't stop Rx DMA
Feb 26 15:26:14 blacktower kernel: tx0: promiscuous mode enabled

Note: These two messages keep repeating in /var/log/messages.

/var/log/messages at the time of sending this email:
Feb 26 17:32:17 blacktower kernel: tx0: link state changed to DOWN
Feb 26 17:36:25 blacktower kernel: tx0: link state changed to UP
Feb 26 17:36:30 blacktower kernel: tx0: link state changed to DOWN
Feb 26 17:37:07 blacktower kernel: tx0: link state changed to UP
Feb 26 17:37:14 blacktower kernel: tx0: link state changed to DOWN
Feb 26 17:37:22 blacktower kernel: tx0: link state changed to UP

After a reboot, net access starts working again.

Please let me know what other information is required.

Kind regards
Unga
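
The transmit-versus-receive check described in the quoted advice (look for a packet sent by one side that the other side never sees) could be sketched roughly as below. This is only an illustration, not a tool from the thread: the function name `diagnose` and the idea of feeding it sets of ICMP sequence numbers read by hand from the tcpdump output at each endpoint are assumptions.

```python
def diagnose(host_sent, peer_seen, peer_sent, host_seen):
    """Guess which direction is failing, from the host's point of view.

    All four arguments are hypothetical inputs: collections of ICMP
    sequence numbers noted from the captures at each endpoint.
      host_sent: requests the host sent
      peer_seen: of those, the ones the peer's capture shows arriving
      peer_sent: requests the peer sent
      host_seen: of those, the ones the host's capture shows arriving
    """
    # Packets the host sent that never showed up at the peer.
    tx_lost = set(host_sent) - set(peer_seen)
    # Packets the peer sent that never showed up at the host.
    rx_lost = set(peer_sent) - set(host_seen)

    if tx_lost and rx_lost:
        return "both"
    if tx_lost:
        return "transmit"
    if rx_lost:
        return "receive"
    return "none"


if __name__ == "__main__":
    # The host sent three pings and the peer saw none of them, while
    # everything the peer sent did arrive at the host.
    print(diagnose({1, 2, 3}, set(), {1, 2}, {1, 2}))  # prints: transmit
```

A result of "transmit" or "receive" only narrows down the direction; as the advice above notes, ARP traffic has to be checked the same way, since ping cannot work until ARP resolves.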