From owner-freebsd-current@FreeBSD.ORG Wed Aug 30 12:07:48 2006
Return-Path:
X-Original-To: freebsd-current@freebsd.org
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id A0ADD16A4E1;
    Wed, 30 Aug 2006 12:07:48 +0000 (UTC)
    (envelope-from sam@fqdn.net)
Received: from host.fqdn.net (host.fqdn.net [194.242.157.2])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 4612743D5D;
    Wed, 30 Aug 2006 12:07:45 +0000 (GMT)
    (envelope-from sam@fqdn.net)
Received: by host.fqdn.net (Postfix, from userid 1003)
    id 6C628241; Wed, 30 Aug 2006 13:07:43 +0100 (BST)
Date: Wed, 30 Aug 2006 13:07:43 +0100
From: Sam Eaton
To: John Baldwin
Message-ID: <20060830120743.GQ60234@host.fqdn.net>
Mail-Followup-To: John Baldwin, freebsd-current@freebsd.org,
    David Christensen
References: <09BFF2FA5EAB4A45B6655E151BBDD90301E2EF85@NT-IRVA-0750.brcm.ad.broadcom.com>
    <200608291804.03848.jhb@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200608291804.03848.jhb@freebsd.org>
User-Agent: Mutt/1.4.2.1i
Cc: David Christensen, freebsd-current@freebsd.org
Subject: Re: [sam@fqdn.net: bce0 watchdog timeout errors]
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
List-Subscribe:
X-List-Received-Date: Wed, 30 Aug 2006 12:07:48 -0000

On Tue, Aug 29, 2006 at 06:04:03PM -0400, John Baldwin wrote:
> On Tuesday 29 August 2006 17:33, David Christensen wrote:
> > > Thought it was worth offering another data point.  I'm
> > > running the most
> > > recent version of the bce driver with the changes to fix the 'mbuf'
> > > errors.
> > >
> >
> > A change was recently added to bge (r1.140) to address some issues
> > with locking in the driver when performing PHY accesses which was
> > also causing watchdog timeout errors.
> > I need to look at those changes and see if they are applicable to
> > the bce driver as well, though I've been having problems loading
> > both bge and bce as modules on -CURRENT (causes a panic).  If I can
> > get past the module problem I'll look at the bge change soon.
>
> bce_ifmedia_sts() has locking, but bce_ifmedia_upd() is missing locking.
> Something like this would do it:
>
> Index: if_bce.c
> ===================================================================
> RCS file: /host/cvs/usr/cvs/src/sys/dev/bce/if_bce.c,v
> retrieving revision 1.7
> diff -u -r1.7 if_bce.c
> --- if_bce.c 15 Aug 2006 04:56:29 -0000 1.7
> +++ if_bce.c 29 Aug 2006 22:03:17 -0000

I've patched my 6-STABLE box with your suggested change to if_bce.c
(modulo a change of the typo of physm to phys).

Doesn't seem to help my problem at all.  If I load the network up a bit
(some NFS activity seems to be the quickest way to trigger it), then I
start getting watchdog timeout errors again.

It remains rather like Julian's earlier problem, in that if I kill off
the load-causing processes, then the network card recovers *a bit*, but
it's still not happy, and still giving watchdog timeouts.

I'm not sure if we have local switch issues making things worse, I'm
investigating that, but either way, the card shouldn't lock up like
this.

Thanks for your help so far, happy to get any other debug info required,

Sam.

-- 
"Fortified with Essential Bitterness and Sarcasm"
                Matt Groening, "Binky's Guide to Love".