Date: Mon, 24 Apr 2006 16:47:36 +0400
From: Oleg Bulyzhin
To: Lars Erik Gullerud
Cc: freebsd-net@freebsd.org
Subject: Re: Watchdog timeouts and dead network on bge - 6.1-RC1
Message-ID: <20060424124736.GA72623@lath.rinet.ru>
In-Reply-To: <20060423114810.P36951@electra.nolink.net>

On Sun, Apr 23, 2006 at 02:35:24PM +0200, Lars Erik Gullerud wrote:
> We recently upgraded one of our 4.11 servers to 6.1-RC1. The server is a
> Dell PE2650, dual Xeons, and has two onboard Broadcom BCM5701 cards, using
> the bge driver.
>
> Some older threads on -net and -current led me to believe that most issues
> with bge driver in FreeBSD >4 had been sorted. However, after our upgrade,
> we are seeing errors like this:
>
> Apr 22 18:44:01 nebula kernel: bge0: watchdog timeout -- resetting
> Apr 22 18:44:01 nebula kernel: bge0: link state changed to DOWN
> Apr 22 18:44:03 nebula kernel: bge0: link state changed to UP
>
> ...and more importantly - when this happens, the network connection does
> NOT in fact come back up. Logging into the box locally (or via a different
> network interface) and manually issuing "ifconfig bge0 down ; ifconfig
> bge0 up" DOES get the interface going again, however.
>
> We have only seen this on very high network loads - the particular message
> included above occurred while transferring some 120GB of data from a 4.11
> NFS-server to this 6.1-RC1 box.
>
> Is this a known issue in bge? If so, is anyone working on it? Can we
> provide some useful information to whoever this might be?
>
> We have never had any issues with bge in 4.x, but we really need to get
> this server up to 5.x/6.x at this point in time, any other suggestions on
> knobs or workarounds that can give us bge stability?
>
> Thanks in advance,
>
> /leg

Could you try the attached patch? It should fix the problem where the link
goes UP but the network stays down.

About the bge resets: you should try if_bge.c rev. 1.126; it may help.

P.S. Either way, please report how it goes.

-- 
Oleg.

Attachment: bge_init_intr.diff

Index: if_bge.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/bge/if_bge.c,v
retrieving revision 1.91.2.13
diff -u -r1.91.2.13 if_bge.c
--- if_bge.c	4 Mar 2006 09:34:48 -0000	1.91.2.13
+++ if_bge.c	17 Apr 2006 19:39:15 -0000
@@ -3308,6 +3308,14 @@
 	
 	bge_ifmedia_upd(ifp);
 
+	sc->bge_link_evt++;
+#ifdef DEVICE_POLLING
+	if (!(sc->bge_ifp->if_capenable & IFCAP_POLLING))
+#endif
+	{
+		BGE_SETBIT(sc, BGE_MISC_LOCAL_CTL, BGE_MLC_INTR_SET);
+	}
+
 	ifp->if_drv_flags |= IFF_DRV_RUNNING;
 	ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
 
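
A note on the patch above, offered as a reading aid rather than as the author's explanation. The added lines bump sc->bge_link_evt and, unless polling is enabled on the interface, set BGE_MLC_INTR_SET in BGE_MISC_LOCAL_CTL. One plausible interpretation (an assumption, not stated in the message) is that this requests an interrupt at the end of initialization so the driver gets a chance to re-evaluate link state. The sketch below models only that control flow in user space. Every name prefixed with mock_ or MOCK_ is hypothetical, the bit value is illustrative, and nothing here touches real hardware or the real struct bge_softc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for driver state; NOT the real struct bge_softc. */
struct mock_bge_softc {
    uint32_t misc_local_ctl; /* models the BGE_MISC_LOCAL_CTL register */
    unsigned link_evt;       /* models sc->bge_link_evt */
    bool     polling;        /* models IFCAP_POLLING being set in if_capenable */
};

/* Illustrative bit value only; the real definition lives in the driver headers. */
#define MOCK_MLC_INTR_SET 0x00000004u

/* Models the lines the patch adds near the end of the init path. */
void
mock_bge_init_tail(struct mock_bge_softc *sc)
{
    assert(sc != NULL);

    sc->link_evt++;          /* note a pending link event */
    if (!sc->polling) {
        /* Interrupt mode: set the "interrupt set" bit, as BGE_SETBIT does. */
        sc->misc_local_ctl |= MOCK_MLC_INTR_SET;
    }
    /* With polling enabled the register is left untouched. */
}

/* Returns true when the mock register has the interrupt-set bit raised. */
bool
mock_intr_requested(const struct mock_bge_softc *sc)
{
    assert(sc != NULL);
    return (sc->misc_local_ctl & MOCK_MLC_INTR_SET) != 0;
}
```

Calling mock_bge_init_tail() on a zeroed structure with polling disabled leaves link_evt at 1 and the interrupt-set bit raised; with polling enabled only link_evt changes, which mirrors the #ifdef DEVICE_POLLING guard in the diff.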