From: Mikolaj Golub
To: Mikolaj Golub
Cc: Brandon Gooch, pyunyh@gmail.com, freebsd-stable@freebsd.org, Jack Vogel, Brandon Gooch
Organization: TOA Ukraine
Subject: Re: em driver regression
Date: Wed, 14 Apr 2010 12:29:27 +0300
Message-ID: <867hoajsaw.fsf@zhuzha.ua1>
In-Reply-To: <86d3y54tb0.fsf@kopusha.onet> (Mikolaj Golub's message of "Sun, 11 Apr 2010 23:40:03 +0300")
References: <201004081313.o38DD4JM041821@lava.sentex.ca>
 <7.1.0.9.0.20100408091756.10652be0@sentex.net>
 <201004081446.o38EkU7h042296@lava.sentex.ca>
 <20100408181741.GI5734@michelle.cdnetworks.com>
 <20100408183900.GJ5734@michelle.cdnetworks.com>
 <86d3y54tb0.fsf@kopusha.onet>
List-Id: Production branch of FreeBSD source code
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=

On Sun, 11 Apr 2010 23:40:03 +0300 Mikolaj Golub wrote:

 MG> Hi,

 MG> Today I have upgraded the kernel in my VirtualBox (3.1.51.r27187) to the
 MG> latest current and have "em0: Watchdog timeout -- resetting" issue. My
 MG> previous kernel was for Mar 12.

 MG> Tracking the revision where the problem appeared I see that the issue is not
 MG> observed for r203834 and starts to observe after r205869.

 MG> Interestingly, if I enter ddb and then exit (sometimes I needed to do this
 MG> twice) the errors stop and network starts working.

After adding some debug prints I observed the following:

Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 813, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: setting watchdog_check to TRUE in lem_mq_start_locked 1 (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 823, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: setting watchdog_check to TRUE in lem_mq_start_locked 1 (ticks 828, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 923, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 3 (ticks: 923, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 3 (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: Watchdog timeout -- resetting (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 1024, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 1028, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 1128, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 1 (ticks: 1128, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: Watchdog timeout -- resetting (ticks: 1128, watchdog_time: 0)
...

So although adapter->watchdog_check was set to TRUE, adapter->watchdog_time
was never set. I see that before r205869 watchdog_time was set in em_xmit,
but lem_xmit does not do this. After adding this line back to lem_xmit (see
the first patch below) the problem went away on my box.

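To make the failure mode above concrete, here is a small stand-alone model in C. It is only a sketch and not code from if_lem.c: the struct, the function names and the 1000-tick threshold are all hypothetical, and it assumes the timer declares a timeout when the watchdog is armed and `ticks - watchdog_time` exceeds some limit, which is consistent with the log but not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical threshold; the real driver constant is not shown here. */
#define MODEL_WATCHDOG_TICKS 1000

/* Only the two fields discussed above; NOT the driver's struct adapter. */
struct model_adapter {
	bool watchdog_check;	/* armed when a packet is queued */
	int  watchdog_time;	/* tick count of the last transmit */
};

/* Assumed shape of the timer check: armed and stale means "timeout". */
bool
model_watchdog_expired(const struct model_adapter *a, int ticks)
{
	return (a->watchdog_check &&
	    ticks - a->watchdog_time > MODEL_WATCHDOG_TICKS);
}

/*
 * Arm the watchdog. `stamp` selects whether watchdog_time is refreshed:
 * true models the patched behaviour, false the unpatched one.
 */
void
model_start(struct model_adapter *a, int ticks, bool stamp)
{
	a->watchdog_check = true;
	if (stamp)
		a->watchdog_time = ticks;
}

/* Replays the tick values from the log against both variants. */
void
model_selfcheck(void)
{
	struct model_adapter unpatched = { false, 0 };
	struct model_adapter patched = { false, 0 };

	model_start(&unpatched, 818, false);
	model_start(&patched, 818, true);

	/* watchdog_time stays 0, so tick 1023 already looks stale. */
	assert(model_watchdog_expired(&unpatched, 1023));
	/* With the timestamp set, only 205 ticks have passed. */
	assert(!model_watchdog_expired(&patched, 1023));
}
```

In this model the unpatched path trips the check as soon as `ticks` passes the threshold, no matter how recently a packet was queued, which matches the repeated resets in the log.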
Also, seeing that the current em_mq_start_locked() sets both watchdog_check
and watchdog_time, I tried another patch that sets watchdog_time in
lem_mq_start_locked() too (see the second patch below). This has also fixed
the issue for me, but I don't know whether it is a correct fix or whether
this is the only place where watchdog_time should be set (there are other
places in the function and in the code where watchdog_check is set to TRUE
but watchdog_time is not set).

-- 
Mikolaj Golub

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=if_lem.c.watchdog_time.1.patch

Index: sys/dev/e1000/if_lem.c
===================================================================
--- sys/dev/e1000/if_lem.c	(revision 206595)
+++ sys/dev/e1000/if_lem.c	(working copy)
@@ -1880,6 +1880,7 @@ lem_xmit(struct adapter *adapter, struct mbuf **m_
 	 */
 	tx_buffer = &adapter->tx_buffer_area[first];
 	tx_buffer->next_eop = last;
+	adapter->watchdog_time = ticks;
 
 	/*
 	 * Advance the Transmit Descriptor Tail (TDT), this tells the E1000

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=if_lem.c.watchdog_time.2.patch

Index: sys/dev/e1000/if_lem.c
===================================================================
--- sys/dev/e1000/if_lem.c	(revision 206595)
+++ sys/dev/e1000/if_lem.c	(working copy)
@@ -873,6 +873,7 @@ lem_mq_start_locked(struct ifnet *ifp, struct mbuf
 			 */
 			ETHER_BPF_MTAP(ifp, m);
 			adapter->watchdog_check = TRUE;
+			adapter->watchdog_time = ticks;
 		}
 	} else if ((error = drbr_enqueue(ifp, adapter->br, m)) != 0)
 		return (error);

--=-=-=--