From owner-freebsd-stable@FreeBSD.ORG Wed Sep 27 12:43:04 2006
Message-ID: <451A71B6.6040201@crc.u-strasbg.fr>
Date: Wed, 27 Sep 2006 14:42:30 +0200
From: Philippe Pegon
To: "Patrick M. Hausen"
Cc: freebsd-stable@freebsd.org, Oliver Brandmueller
Subject: Re: 6.2 SHOWSTOPPER - em completely unusable on 6.2
References: <451A1375.5080202@gneto.com> <20060927071538.GF22229@e-Gitt.NET> <451A4189.5020906@samsco.org> <20060927094509.GB75104@hugo10.ka.punkt.de>
In-Reply-To: <20060927094509.GB75104@hugo10.ka.punkt.de>
List-Id: Production branch of FreeBSD source code

Hi,

Just a "me too". On our FTP server (ftp8.fr.freebsd.org) we sometimes
see "watchdog timeout" messages in the log with a bge card, but maybe
it's not the same problem...

/var/log/messages:Sep 23 02:47:06 anubis kernel: bge1: watchdog timeout -- resetting
/var/log/messages:Sep 23 02:47:06 anubis kernel: bge1: link state changed to DOWN
/var/log/messages:Sep 23 02:47:11 anubis kernel: bge1: link state changed to UP
/var/log/messages.0.bz2:Sep 12 22:22:48 anubis kernel: bge1: watchdog timeout -- resetting
/var/log/messages.0.bz2:Sep 12 22:22:48 anubis kernel: bge1: link state changed to DOWN
/var/log/messages.0.bz2:Sep 12 22:22:51 anubis kernel: bge1: link state changed to UP
/var/log/messages.0.bz2:Sep 17 15:22:01 anubis kernel: bge1: watchdog timeout -- resetting
/var/log/messages.0.bz2:Sep 17 15:22:01 anubis kernel: bge1: link state changed to DOWN
/var/log/messages.0.bz2:Sep 17 15:22:06 anubis kernel: bge1: link state changed to UP
/var/log/messages.0.bz2:Sep 20 12:13:07 anubis kernel: bge1: watchdog timeout -- resetting
/var/log/messages.0.bz2:Sep 20 12:13:07 anubis kernel: bge1: link state changed to DOWN
/var/log/messages.0.bz2:Sep 20 12:13:11 anubis kernel: bge1: link state changed to UP
/var/log/messages.1.bz2:Sep 6 08:33:54 anubis kernel: bge1: watchdog timeout -- resetting
/var/log/messages.1.bz2:Sep 6 08:33:54 anubis kernel: bge1: link state changed to DOWN
/var/log/messages.1.bz2:Sep 6 08:33:59 anubis kernel: bge1: link state changed to UP
/var/log/messages.2.bz2:Sep 4 17:39:25 anubis kernel: bge1: link state changed to DOWN
/var/log/messages.2.bz2:Sep 4 17:39:28 anubis kernel: bge1: link state changed to UP
/var/log/messages.3.bz2:Aug 29 12:09:36 anubis kernel: bge0: watchdog timeout -- resetting
/var/log/messages.3.bz2:Aug 29 12:09:36 anubis kernel: bge0: link state changed to DOWN
/var/log/messages.3.bz2:Aug 29 12:09:41 anubis kernel: bge0: link state changed to UP
/var/log/messages.4.bz2:Aug 22 15:44:00 anubis kernel: bge0: watchdog timeout -- resetting
/var/log/messages.4.bz2:Aug 22 15:44:00 anubis kernel: bge0: link state changed to DOWN
/var/log/messages.4.bz2:Aug 22 15:44:03 anubis kernel: bge0: link state changed to UP

-- 
Philippe Pegon

Patrick M. Hausen wrote:
> Hello!
>
>> Well, the best I can say at the moment is, "Wow." =-( I guess the
>> thing to do here is to figure out if the problem lies with the em
>> interrupt handler not getting run, or the taskqueue not getting run.
>
> I helped Pyun with some debugging by providing ssh access to
> a machine showing the (seemingly) same problem.
>
> At first he thought the interrupt handler of the em driver was
> the culprit, but we applied quite a few patches and tested
> afterwards - seems like the driver is not the cause.
>
> On -stable occasionally other people complained about very similar
> looking problems with bge and other drivers. My guess is, though
> I'm not a kernel developer, just an experienced admin, that
> em stands out as problematic just by coincidence. Certain onboard
> network components tend to come with certain chipsets and certain
> architectures.
>
> So, Pyun suggested it was a problem with the taskqueue that was
> introduced some time between 6.0 and 6.1.
>
> With my system (Tyan GT20 B5161G20) the problem shows when there
> is heavy disk and cpu activity, like "make buildworld".
> I made sure that the em interface doesn't share an interrupt
> with the SATA controller. When the problem occurs, I get the
> well known "watchdog timeout" messages and then the system's
> network activity over that interface freezes completely for
> a couple of minutes.
> Usually the system recovers after a while without reboot or
> other measures.
>
> What I can do: give ssh access to a system showing this behaviour
> including a network connection to another box, so one can transfer
> large amounts of data over a private LAN. I used FTP of a sparse
> big file.
>
> Prerequisite: fixed IP address of the machine that the developer
> wishes to use to connect to my system.
>
> HTH,
> Patrick
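
For anyone who wants to compare their own logs with the excerpt above, here is a minimal sketch of a script that counts "watchdog timeout" resets per interface and measures how long each link stayed down. It is only an illustration, not a tool from this thread: the names LINE_RE, to_seconds and summarize are made up for the example, the line layout is assumed to match the grep output shown above, and DOWN/UP pairs that cross midnight are simply skipped.

```python
import re
from collections import Counter

# Assumed layout (taken from the grep output above), for example:
# /var/log/messages:Sep 23 02:47:06 anubis kernel: bge1: watchdog timeout -- resetting
LINE_RE = re.compile(
    r"(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"\S+\s+kernel:\s+(?P<ifname>[a-z]+\d+):\s+(?P<msg>.+)$"
)


def to_seconds(hms):
    """Convert an HH:MM:SS string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def summarize(lines):
    """Return (resets per interface, list of (ifname, month, day, seconds down))."""
    resets = Counter()
    outages = []
    down_at = {}  # ifname -> ((month, day), seconds) of the last DOWN event
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a kernel interface message in the assumed layout
        ifname = match["ifname"]
        msg = match["msg"].strip()
        stamp = (match["month"], int(match["day"]))
        if msg.startswith("watchdog timeout"):
            resets[ifname] += 1
        elif msg == "link state changed to DOWN":
            down_at[ifname] = (stamp, to_seconds(match["time"]))
        elif msg == "link state changed to UP" and ifname in down_at:
            down_stamp, down_secs = down_at.pop(ifname)
            if down_stamp == stamp:  # skip pairs that span midnight
                outages.append(
                    (ifname, stamp[0], stamp[1], to_seconds(match["time"]) - down_secs)
                )
    return resets, outages
```

Passing it any iterable of lines works, for instance summarize(open("/var/log/messages")). Note that it only measures the DOWN-to-UP gap reported by the driver, which may be much shorter than the time during which traffic is actually stalled.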