Date: Mon, 11 Sep 2006 02:17:22 +0200
From: Herve Boulouis
To: freebsd-stable@freebsd.org
Message-ID: <20060911001722.GR611@ra.aabs>
Subject: bge watchdog timeouts still happening
List-Id: Production branch of FreeBSD source code

Hi,

I recently put two web servers into production running 6.0-STABLE from mid-January and was bitten by the bge watchdog timeout problem. I cvsupped both boxes to the latest -stable (latest if_bge.c, rev 1.91.2.17), but the problem still persists :(

The server hardware is Dell PowerEdge 2550 with an SMP kernel.
Relevant portion of dmesg:

bge0: mem 0xfeb00000-0xfeb0ffff irq 17 at device 8.0 on pci1
miibus0: on bge0
brgphy0: on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:06:5b:1a:7f:4a

On the first box the load is quite light, so the problem has not yet reappeared since the upgrade. On the second box, which usually outputs 10-15 Mbit/s, the timeouts came back very shortly after the upgrade.

Extract of logs from the second box (uptime < 1h):

Sep 11 01:19:50 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:19:50 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:19:54 www1 kernel: bge0: link state changed to UP
Sep 11 01:26:10 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:26:10 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:26:13 www1 kernel: bge0: link state changed to UP
Sep 11 01:27:32 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:27:32 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:27:35 www1 kernel: bge0: link state changed to UP
Sep 11 01:28:52 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:28:52 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:28:55 www1 kernel: bge0: link state changed to UP
Sep 11 01:31:12 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:31:12 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:31:15 www1 kernel: bge0: link state changed to UP
Sep 11 01:33:57 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:33:57 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:34:00 www1 kernel: bge0: link state changed to UP
Sep 11 01:34:16 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:34:16 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:34:19 www1 kernel: bge0: link state changed to UP
Sep 11 01:34:41 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:34:41 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:34:44 www1 kernel: bge0: link state changed to UP
Sep 11 01:35:06 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:35:06 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:35:09 www1 kernel: bge0: link state changed to UP
Sep 11 01:36:17 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:36:17 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:36:20 www1 kernel: bge0: link state changed to UP
Sep 11 01:37:47 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:37:47 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:37:50 www1 kernel: bge0: link state changed to UP
Sep 11 01:38:53 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:38:53 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:38:56 www1 kernel: bge0: link state changed to UP
Sep 11 01:39:56 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:39:56 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:39:59 www1 kernel: bge0: link state changed to UP

I've removed 'options SMP' from the kernel config of the loaded box, but the timeouts continue to happen.

What can I do to help resolve this bug?

-- 
Herve Boulouis
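
A log extract like the one in this message is easier to compare between boxes or kernel revisions once it is reduced to a few numbers. The following is a minimal sketch, not part of the original report, that counts the watchdog timeouts, the gaps between them, and how long the link stayed down after each reset. The function names (parse_events, summarize) are hypothetical, the year is assumed to be 2006 because syslog lines carry none, and the month parsing assumes an English/C locale. The sample input is the first six lines of the extract.

```python
import re
from datetime import datetime

# First six lines of the log extract quoted in this message.
SAMPLE = """\
Sep 11 01:19:50 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:19:50 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:19:54 www1 kernel: bge0: link state changed to UP
Sep 11 01:26:10 www1 kernel: bge0: watchdog timeout -- resetting
Sep 11 01:26:10 www1 kernel: bge0: link state changed to DOWN
Sep 11 01:26:13 www1 kernel: bge0: link state changed to UP
"""

# "<Mon> <day> <hh:mm:ss> <host> kernel: <ifname>: <message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: (\w+): (.*)$")


def parse_events(text, year=2006):
    """Return (timestamp, interface, message) tuples for matching lines.

    The year is an assumption: syslog timestamps do not include one.
    """
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that is not a kernel interface message
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        events.append((stamp, m.group(2), m.group(3)))
    return events


def summarize(events):
    """Count watchdog timeouts and measure gaps and link-down durations."""
    timeouts = [t for t, _, msg in events if msg.startswith("watchdog timeout")]
    # Seconds between consecutive watchdog timeouts.
    gaps = [(b - a).total_seconds() for a, b in zip(timeouts, timeouts[1:])]

    outages = []
    down_at = None
    for t, _, msg in events:
        if msg.endswith("changed to DOWN"):
            down_at = t
        elif msg.endswith("changed to UP") and down_at is not None:
            outages.append((t - down_at).total_seconds())
            down_at = None
    return {"timeouts": len(timeouts), "gaps_s": gaps, "outages_s": outages}


if __name__ == "__main__":
    print(summarize(parse_events(SAMPLE)))
```

Feeding it the full extract from /var/log/messages (for example after grepping for bge0) gives the same summary for the whole period, which can be rerun after each driver or kernel change.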