From: Greg Eden
Date: Fri, 3 Nov 2006 16:34:08 +0000
To: freebsd-stable@freebsd.org
Subject: Re: Watchdog Timeout - bge device - 6.2-PRERELEASE
In-Reply-To: <9F7B653A50CF3D45A92C05401046239B0E0CBD@rwsrv06.rw2.riverwillow.net.au>

On 2 Nov 2006, at 23:50, John
Marshall wrote:

> bge0: watchdog timeout -- resetting
> bge0: link state changed to DOWN
> bge0: link state changed to UP

I'm seeing similar behaviour on an HP DL360g4 running 6.1-RELEASE and a
GENERIC kernel. It had similar problems with 5.4. I have 2 other similar
machines running 6.1 which don't record these errors; however, they never
see any sustained throughput, whereas this machine does.

bge0: mem 0xfde70000-0xfde7ffff irq 25 at device 2.0 on pci2

The IRQ is not shared:

vmstat -i
interrupt                          total       rate
irq1: atkbd0                        1668          0
irq6: fdc0                            87          0
irq14: ata0                           46          0
irq16: uhci0                    49865767          5
irq24: ciss0                     6645080          0
irq25: bge0                    336582162         34
irq26: bge1                    313542372         32
irq48: mpt0                     49865839          5
cpu0: timer                   2055753320        210
Total                         2812256341        287

> Scott Long said
> Is it causing stuck connections or other messy problems? Also, is it
> any worse than 6.1?

Fortunately, all it seems to be doing is bothering my log files.

I'm taking an interest as I have two important production machines using
the bge driver, both about to be upgraded from 5.3R to 6.1R. I've just
grepped the logs on those machines and they both have a sprinkling of
timeouts: 5 over 18 months on one quite heavily trafficked webserver,
while the other, which continuously serves HTTP downloads, is clean apart
from one 24-hour period when it was having 200GB of files uploaded to it.

My concern is that upgrading to 6.1 or 6.2 and then enabling device
polling is going to make the issue worse.

I'm happy to supply more info.

best.
greg.
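
For anyone wanting to run the same kind of log check, here is a minimal sketch of counting watchdog timeouts per interface. It is not the command used above; the function name, the regular expression, and the sample timestamps and hostname are all made up for illustration. Only the message text mirrors the lines quoted at the top of this mail.

```python
import re
from collections import Counter

# Matches kernel messages such as "bge0: watchdog timeout -- resetting".
# The pattern is an assumption based on the message format quoted above.
WATCHDOG_RE = re.compile(r"\b([a-z]+\d+): watchdog timeout")


def count_watchdog_timeouts(lines):
    """Return a Counter mapping interface name to its watchdog timeout count."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample log lines (timestamps and hostname are invented).
sample = [
    "Nov  2 23:41:07 host kernel: bge0: watchdog timeout -- resetting",
    "Nov  2 23:41:07 host kernel: bge0: link state changed to DOWN",
    "Nov  2 23:41:09 host kernel: bge0: link state changed to UP",
    "Nov  3 02:10:55 host kernel: bge0: watchdog timeout -- resetting",
    "Nov  3 04:22:13 host kernel: bge1: watchdog timeout -- resetting",
]

for iface, n in sorted(count_watchdog_timeouts(sample).items()):
    print(f"{iface}: {n} watchdog timeout(s)")
```

To run it against a real log, pass an open file instead of the sample list, for example `count_watchdog_timeouts(open("/var/log/messages"))`, assuming kernel messages are logged to that file on the machine in question.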