From: Sam Eaton
To: freebsd-stable@freebsd.org
Date: Tue, 29 Aug 2006 17:52:34 +0100
Subject: bce0 watchdog timeout errors
Message-ID: <20060829165234.GA15988@host.fqdn.net>

I'm still seeing an ongoing problem with the bce device on my Dell
1950. I'm running AMD64 6-STABLE, with the stock SMP kernel, and I'm
running the most recent version of the bce driver, which did cure the
other errors we were seeing (the mbuf related ones).

The card is currently connected at an auto-negotiated 100BaseTX full
duplex (rather than gigabit) as we don't currently have a gigabit
switch to test on (the machine is under test rather than deployed).

I can consistently cause the system to go into a 'Watchdog timeout
occurred, resetting!' loop, by trying to do any reasonable amount of
work over an nfs mounted filesystem. An easy way to reproduce this for
me is to try and build some reasonably large port on our nfs mounted
copy of the ports tree.
I can also cause this by running bonnie++ against an nfs mounted
filesystem. I've so far failed to find some simpler network only test
to trigger the problem (I've tried sshing large amounts of data back
and forth, iperf, ping floods, etc). NFS seems to do the trick every
time though.

Once it's reported the watchdog timeout, the networking on the box
never recovers.

Is anyone else seeing anything similar? And does anyone have any
suggestions as to what I can do to try and diagnose this further so we
can get to the bottom of it?

Thanks,

Sam.

-- 
"Fortified with Essential Bitterness and Sarcasm"
Matt Groening, "Binky's Guide to Love".