Date:      Wed, 30 Aug 2006 13:07:43 +0100
From:      Sam Eaton <sam@fqdn.net>
To:        John Baldwin <jhb@freebsd.org>
Cc:        David Christensen <davidch@broadcom.com>, freebsd-current@freebsd.org
Subject:   Re: [sam@fqdn.net: bce0 watchdog timeout errors]
Message-ID:  <20060830120743.GQ60234@host.fqdn.net>
In-Reply-To: <200608291804.03848.jhb@freebsd.org>
References:  <09BFF2FA5EAB4A45B6655E151BBDD90301E2EF85@NT-IRVA-0750.brcm.ad.broadcom.com> <200608291804.03848.jhb@freebsd.org>

On Tue, Aug 29, 2006 at 06:04:03PM -0400, John Baldwin wrote:
> On Tuesday 29 August 2006 17:33, David Christensen wrote:
> > > Thought it was worth offering another data point.  I'm running the
> > > most recent version of the bce driver with the changes to fix the
> > > 'mbuf' errors.
> > > 
> > 
> > A change was recently added to bge (r1.140) to address some issues
> > with locking in the driver when performing PHY accesses which was 
> > also causing watchdog timeout errors.  I need to look at those
> > changes and see if they are applicable to the bce driver as well, 
> > though I've been having problems loading both bge and bce as
> > modules on -CURRENT (causes a panic).  If I can get past the module
> > problem I'll look at the bge change soon.
> 
> bce_ifmedia_sts() has locking, but bce_ifmedia_upd() is missing locking.
> Something like this would do it:
> 
> Index: if_bce.c
> ===================================================================
> RCS file: /host/cvs/usr/cvs/src/sys/dev/bce/if_bce.c,v
> retrieving revision 1.7
> diff -u -r1.7 if_bce.c
> --- if_bce.c	15 Aug 2006 04:56:29 -0000	1.7
> +++ if_bce.c	29 Aug 2006 22:03:17 -0000

I've patched my 6-STABLE box with your suggested change to if_bce.c
(modulo fixing the typo of physm to phys).

Doesn't seem to help my problem at all.  If I load the network up a bit
(some NFS activity seems to be the quickest way to trigger it), then I
start getting watchdog timeout errors again.  

It remains rather like Julian's earlier problem, in that if I kill off
the load-causing processes, then the network card recovers *a bit*, but
it's still not happy, and still giving watchdog timeouts.

I'm not sure whether local switch issues are making things worse; I'm
investigating that, but either way, the card shouldn't lock up like
this.

Thanks for your help so far.  I'm happy to provide any other debug info
required.

Sam.
-- 
"Fortified with Essential Bitterness and Sarcasm"
    Matt Groening, "Binky's Guide to Love".


