Date: Thu, 8 Apr 2004 00:17:06 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Palle Girgensohn <girgen@pingpong.net>
Cc: net@FreeBSD.org
Subject: Re: sk ethernet driver: watchdog timeout
Message-ID: <20040407235838.K11719@gamplex.bde.org>
In-Reply-To: <3810000.1081299464@palle.girgensohn.se>
References: <20240000.1079394807@palle.girgensohn.se>
    <wpy8q04buf.fsf@heho.snv.jussieu.fr>
    <3810000.1081299464@palle.girgensohn.se>
On Wed, 7 Apr 2004, Palle Girgensohn wrote:

> --On Wednesday, March 17, 2004 00.21.44 +0100 "Arno J. Klaassen"
> <arno@heho.snv.jussieu.fr> wrote:
>
> > Hello,
> >
> >> I have an ASUS motherboard A7V8X-E Deluxe with onboard 10/100/1000
> >> Mbit/s NIC from Marvell Semiconductor.
> >>
> >> My problem is that it sometimes lock up with the error message
> >>
> >> sk0: watchdog timeout
> >
> > I have a similar problem with 3Com cards on an ASUS A7N266;
> > I just post in case this might be related (and in hope for
> > a hint for a solution )
>
> Hi again,
>
> I've since this thread started tried this on more different systems, with
> exactly the same results. Anyone else experiencing this? Anything I can do
> to help fixing it?

The following patch reduces the problem on A7V8X-E a little.  It limits
the tx queue to 1 packet and fixes handling of the timeout on txeof.
The first part probably makes the second part a no-op.

Without this, my A7V8X-E hangs on even light nfs activity (e.g., copying
a 1MB file to nfs).  With it, it takes heavier nfs activity to hang
(makeworld never completes, and a flood ping always hangs).

I first suspected an interrupt-related bug, but the bug seems to be more
hardware-specific.  Examination of the output queues shows that the tx
sometimes just stops before processing all packets.  Resetting in
sk_watchdog() doesn't always fix the problem, and the timeout usually
stops firing after a couple of unsuccessful resets, giving a completely
hung device.  But the problem may be related to interrupt timing, since
it is much smaller under RELENG_4.  RELENG_4 hangs about as often without
this hack as -current does with it.

nv0 hangs similarly.  fxp0 just works.
%%%
Index: if_sk.c
===================================================================
RCS file: /home/ncvs/src/sys/pci/if_sk.c,v
retrieving revision 1.78
diff -u -2 -r1.78 if_sk.c
--- if_sk.c	31 Mar 2004 12:35:51 -0000	1.78
+++ if_sk.c	1 Apr 2004 07:33:58 -0000
@@ -1830,4 +1830,9 @@
 	SK_IF_LOCK(sc_if);
 
+	if (sc_if->sk_cdata.sk_tx_cnt > 0) {
+		SK_IF_UNLOCK(sc_if);
+		return;
+	}
+
 	idx = sc_if->sk_cdata.sk_tx_prod;
 
@@ -1853,4 +1858,5 @@
 		 */
 		BPF_MTAP(ifp, m_head);
+		break;
 	}
 
@@ -2000,5 +2031,4 @@
 		sc_if->sk_cdata.sk_tx_cnt--;
 		SK_INC(idx, SK_TX_RING_CNT);
-		ifp->if_timer = 0;
 	}
 
@@ -2007,4 +2037,6 @@
 	if (cur_tx != NULL)
 		ifp->if_flags &= ~IFF_OACTIVE;
+
+	ifp->if_timer = (sc_if->sk_cdata.sk_tx_cnt == 0) ? 0 : 5;
 
 	return;
%%%

Bruce
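
[Editorial sketch, not part of the original message.]  The two changes in
the patch above (allow at most one packet in flight, and have txeof disarm
the watchdog only once the ring is empty) can be modelled outside the
kernel roughly as follows.  This is a simplified illustration under stated
assumptions, not driver code: struct model_txq, model_start(),
model_txeof() and MODEL_TX_TIMEOUT are hypothetical names, and arming the
timer when a packet is queued is an assumption of this model, not something
shown in the patch.  Only the tx_cnt > 0 test and the
"(tx_cnt == 0) ? 0 : 5" expression are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* The value the patch writes to if_timer while packets are outstanding. */
#define MODEL_TX_TIMEOUT 5

/* Hypothetical stand-in for the tx-related driver state. */
struct model_txq {
	int tx_cnt;	/* packets handed to the hardware, not yet reclaimed */
	int pending;	/* packets still waiting in the software queue */
	int if_timer;	/* watchdog countdown; 0 means disarmed */
};

/*
 * Models the patched start routine: return early while anything is still
 * in flight, otherwise hand exactly one packet to the hardware.
 * Returns true if a packet was queued.
 */
static bool
model_start(struct model_txq *q)
{
	if (q->tx_cnt > 0)		/* the early return added by the patch */
		return (false);
	if (q->pending == 0)
		return (false);
	q->pending--;
	q->tx_cnt++;
	q->if_timer = MODEL_TX_TIMEOUT;	/* assumption: armed on transmit */
	return (true);
}

/*
 * Models the patched txeof: reclaim `done` completed packets, then disarm
 * the watchdog only if nothing is outstanding; otherwise re-arm it.
 */
static void
model_txeof(struct model_txq *q, int done)
{
	assert(done >= 0 && done <= q->tx_cnt);
	q->tx_cnt -= done;
	q->if_timer = (q->tx_cnt == 0) ? 0 : MODEL_TX_TIMEOUT;
}
```

In this model, a txeof call that reclaims nothing leaves the watchdog
armed, so a transmitter that stops before processing all packets would
still be noticed.  Because model_start() never lets tx_cnt exceed 1, any
txeof that reclaims a packet empties the ring, which is consistent with the
remark that the first part of the patch probably makes the second part a
no-op.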