Date: Thu, 8 Apr 2004 00:17:06 +1000 (EST)
From: Bruce Evans
To: Palle Girgensohn
In-Reply-To: <3810000.1081299464@palle.girgensohn.se>
Message-ID: <20040407235838.K11719@gamplex.bde.org>
References: <20240000.1079394807@palle.girgensohn.se> <3810000.1081299464@palle.girgensohn.se>
cc: current@FreeBSD.org
cc: net@FreeBSD.org
Subject: Re: sk ethernet driver: watchdog timeout
List-Id: Networking and TCP/IP with FreeBSD

On Wed, 7 Apr 2004, Palle Girgensohn wrote:

> --On Wednesday, March 17, 2004 00.21.44 +0100 "Arno J. Klaassen"
> wrote:
>
> > Hello,
> >
> >> I have an ASUS motherboard A7V8X-E Deluxe with onboard 10/100/1000
> >> Mbit/s NIC from Marvell Semiconductor.
> >>
> >> My problem is that it sometimes lock up with the error message
> >>
> >> sk0: watchdog timeout
> >
> > I have a similar problem with 3Com cards on an ASUS A7N266;
> > I just post in case this might be related (and in hope for
> > a hint for a solution )
>
> Hi again,
>
> I've since this thread started tried this on more different systems, with
> exactly the same results. Anyone else experiencing this? Anything I can do
> to help fixing it?

The following patch reduces the problem on A7V8X-E a little.  It limits
the tx queue to 1 packet and fixes handling of the timeout on txeof.
The first part probably makes the second part a no-op.

Without this, my A7V8X-E hangs on even light nfs activity (e.g., copying
a 1MB file to nfs).  With it, it takes heavier nfs activity to hang
(makeworld never completes, and a flood ping always hangs).

I first suspected an interrupt-related bug, but the bug seems to be more
hardware-specific.  Examination of the output queues shows that the tx
sometimes just stops before processing all packets.  Resetting in
sk_watchdog() doesn't always fix the problem, and the timeout usually
stops firing after a couple of unsuccessful resets, giving a completely
hung device.  But the problem may be related to interrupt timing, since
it is much smaller under RELENG_4.  RELENG_4 hangs about as often without
this hack as -current does with it.

nv0 hangs similarly.  fxp0 just works.
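
The timeout change described above can be illustrated with a toy model.
This is only a sketch, not driver code: `struct tx_model`, `txeof_old()`,
`txeof_new()` and `stalled_timer()` are hypothetical names, and the
starting state (3 packets outstanding, watchdog armed at 5 seconds) is an
assumption chosen for the example. It models only the timer rule, with
several packets outstanding, so it ignores the 1-packet queue limit.

```c
#include <assert.h>

#define TX_TIMEOUT 5	/* seconds; same value the patch re-arms with */

/* Toy stand-in for the driver state; not the real softc layout. */
struct tx_model {
	int tx_cnt;	/* packets handed to the hardware, not yet reclaimed */
	int if_timer;	/* watchdog countdown in seconds; 0 means disarmed */
};

/* Unpatched rule: every reclaimed packet disarms the watchdog. */
static void
txeof_old(struct tx_model *m, int completed)
{
	assert(completed >= 0);
	while (completed-- > 0 && m->tx_cnt > 0) {
		m->tx_cnt--;
		m->if_timer = 0;
	}
}

/* Patched rule: disarm only once nothing is left outstanding. */
static void
txeof_new(struct tx_model *m, int completed)
{
	assert(completed >= 0);
	while (completed-- > 0 && m->tx_cnt > 0)
		m->tx_cnt--;
	m->if_timer = (m->tx_cnt == 0) ? 0 : TX_TIMEOUT;
}

/*
 * Queue 3 packets (assumed), let the hardware finish 1, then stall.
 * Returns the watchdog value left behind.
 */
int
stalled_timer(int patched)
{
	struct tx_model m = { 3, TX_TIMEOUT };

	if (patched)
		txeof_new(&m, 1);
	else
		txeof_old(&m, 1);
	return (m.if_timer);
}
```

In this model, `stalled_timer(0)` leaves the watchdog disarmed while two
packets are still outstanding, so a stalled transmitter would never be
noticed. `stalled_timer(1)` leaves it armed, so the stall would eventually
reach the watchdog handler.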

%%%
Index: if_sk.c
===================================================================
RCS file: /home/ncvs/src/sys/pci/if_sk.c,v
retrieving revision 1.78
diff -u -2 -r1.78 if_sk.c
--- if_sk.c	31 Mar 2004 12:35:51 -0000	1.78
+++ if_sk.c	1 Apr 2004 07:33:58 -0000
@@ -1830,4 +1830,9 @@
 	SK_IF_LOCK(sc_if);
 
+	if (sc_if->sk_cdata.sk_tx_cnt > 0) {
+		SK_IF_UNLOCK(sc_if);
+		return;
+	}
+
 	idx = sc_if->sk_cdata.sk_tx_prod;
 
@@ -1853,4 +1858,5 @@
 		 */
 		BPF_MTAP(ifp, m_head);
+		break;
 	}
 
@@ -2000,5 +2031,4 @@
 		sc_if->sk_cdata.sk_tx_cnt--;
 		SK_INC(idx, SK_TX_RING_CNT);
-		ifp->if_timer = 0;
 	}
 
@@ -2007,4 +2037,6 @@
 	if (cur_tx != NULL)
 		ifp->if_flags &= ~IFF_OACTIVE;
+
+	ifp->if_timer = (sc_if->sk_cdata.sk_tx_cnt == 0) ? 0 : 5;
 
 	return;
%%%

Bruce