Date:      Wed, 14 Apr 2010 12:29:27 +0300
From:      Mikolaj Golub <to.my.trociny@gmail.com>
To:        Mikolaj Golub <to.my.trociny@gmail.com>
Cc:        Brandon Gooch <jamesbrandongooch@gmail.com>, pyunyh@gmail.com, freebsd-stable@freebsd.org, Jack Vogel <jfvogel@gmail.com>, Brandon Gooch <bgooch@se.edu>
Subject:   Re: em driver regression
Message-ID:  <867hoajsaw.fsf@zhuzha.ua1>
In-Reply-To: <86d3y54tb0.fsf@kopusha.onet> (Mikolaj Golub's message of "Sun, 11 Apr 2010 23:40:03 +0300")
References:  <201004081313.o38DD4JM041821@lava.sentex.ca> <7.1.0.9.0.20100408091756.10652be0@sentex.net> <201004081446.o38EkU7h042296@lava.sentex.ca> <20100408181741.GI5734@michelle.cdnetworks.com> <g2q2a41acea1004081122ndec4e6a1mbac3f0f7fa11cc9c@mail.gmail.com> <s2h2a41acea1004081127qac1d542dufcefbf5ec054e0ce@mail.gmail.com> <20100408183900.GJ5734@michelle.cdnetworks.com> <q2k2a41acea1004081217t24b14f5fi18801af5ca0a962@mail.gmail.com> <r2w179b97fb1004081252k2c864293zb63bd44e200e7efe@mail.gmail.com> <86d3y54tb0.fsf@kopusha.onet>

--=-=-=


On Sun, 11 Apr 2010 23:40:03 +0300 Mikolaj Golub wrote:

 MG> Hi,

 MG> Today I have upgraded the kernel in my VirtualBox (3.1.51.r27187) to the
 MG> latest current and have "em0: Watchdog timeout -- resetting" issue. My
 MG> previous kernel was for Mar 12.

 MG> Tracking the revision where the problem appeared I see that the issue is not
 MG> observed for r203834 and starts to observe after r205869.

 MG> Interestingly, if I enter ddb and then exit (sometimes I needed to do this
 MG> twice) the errors stop and network starts working.

Adding some prints I observed the following:

Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 813, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: setting watchdog_check to TRUE in lem_mq_start_locked 1 (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 823, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: setting watchdog_check to TRUE in lem_mq_start_locked 1 (ticks 828, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 923, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 3 (ticks: 923, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 3 (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: Watchdog timeout -- resetting (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 1024, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 1028, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 1128, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 1 (ticks: 1128, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: Watchdog timeout -- resetting (ticks: 1128, watchdog_time: 0)
...

So although adapter->watchdog_check was set to TRUE, adapter->watchdog_time was
never set.

I see that before r205869 watchdog_time was set in em_xmit, but lem_xmit does
not contain this assignment. After adding the line back to lem_xmit (see the
first patch below) the problem went away on my box.

Also, seeing that the current em_mq_start_locked() sets both watchdog_check and
watchdog_time, I tried another patch that sets watchdog_time in
lem_mq_start_locked() too (see the second patch below). This also fixed the
issue for me, but I don't know whether it is the correct fix or whether this is
the only place where watchdog_time should be set: there are other places in
this function and elsewhere in the code where watchdog_check is set to TRUE but
watchdog_time is not.
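
To illustrate why a stale watchdog_time matters, here is a minimal standalone
model, not the driver code. It assumes the timer fires when watchdog_check is
TRUE and more than some threshold of ticks has passed since watchdog_time. The
threshold of 1000 ticks is a guess that merely fits the log above (no timeout
at ticks 923, timeout at ticks 1023). The names watchdog_model,
MODEL_WATCHDOG_TICKS, watchdog_expired and watchdog_arm are hypothetical and
do not exist in if_lem.c.

```c
#include <stdbool.h>

/* Assumed timeout in ticks; the log is consistent with roughly 1000. */
#define MODEL_WATCHDOG_TICKS 1000

/* Hypothetical stand-in for the two watchdog fields of struct adapter. */
struct watchdog_model {
	bool watchdog_check;	/* a transmit is pending, so watch for hangs */
	int  watchdog_time;	/* tick count recorded at the last transmit */
};

/* Assumed form of the timer check: expired when armed and stale. */
static bool
watchdog_expired(const struct watchdog_model *w, int ticks)
{
	return (w->watchdog_check &&
	    ticks - w->watchdog_time > MODEL_WATCHDOG_TICKS);
}

/* Set the flag and the timestamp together, as the second patch does. */
static void
watchdog_arm(struct watchdog_model *w, int ticks)
{
	w->watchdog_check = true;
	w->watchdog_time = ticks;
}
```

In this model, with watchdog_check TRUE and watchdog_time left at 0,
watchdog_expired() is false at ticks 923 and true at ticks 1023, matching the
log. After watchdog_arm(&w, 1023) it is false again at ticks 1128.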

-- 
Mikolaj Golub


--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=if_lem.c.watchdog_time.1.patch

Index: sys/dev/e1000/if_lem.c
===================================================================
--- sys/dev/e1000/if_lem.c	(revision 206595)
+++ sys/dev/e1000/if_lem.c	(working copy)
@@ -1880,6 +1880,7 @@ lem_xmit(struct adapter *adapter, struct mbuf **m_
 	 */
 	tx_buffer = &adapter->tx_buffer_area[first];
 	tx_buffer->next_eop = last;
+	adapter->watchdog_time = ticks;
 
 	/*
 	 * Advance the Transmit Descriptor Tail (TDT), this tells the E1000

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=if_lem.c.watchdog_time.2.patch

Index: sys/dev/e1000/if_lem.c
===================================================================
--- sys/dev/e1000/if_lem.c	(revision 206595)
+++ sys/dev/e1000/if_lem.c	(working copy)
@@ -873,6 +873,7 @@ lem_mq_start_locked(struct ifnet *ifp, struct mbuf
 			*/
 			ETHER_BPF_MTAP(ifp, m);
 			adapter->watchdog_check = TRUE;
+			adapter->watchdog_time = ticks;
 		}
 	} else if ((error = drbr_enqueue(ifp, adapter->br, m)) != 0)
 		return (error);

--=-=-=--


