Date:      Fri, 25 Nov 2005 16:22:28 +0300
From:      Gleb Smirnoff <glebius@FreeBSD.org>
To:        Emil Mikulic <emil@cs.rmit.edu.au>
Cc:        freebsd-current@FreeBSD.org
Subject:   Re: bge driver autoneg failure and system-wide stalls
Message-ID:  <20051125132228.GN25711@cell.sick.ru>
In-Reply-To: <20051125022040.GA9150@cs.rmit.edu.au>
References:  <20051125022040.GA9150@cs.rmit.edu.au>


--MAH+hnPXVZWQ5cD/
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline

On Fri, Nov 25, 2005 at 01:20:41PM +1100, Emil Mikulic wrote:
E> I have a network port with bad wiring in the walls - a cable tester
E> shows only wires 1,2,3 and 6 are actually connected.
E> 
E> My solution is to patch directly into the switch, in which case the bge
E> driver works just fine.  However, the bad wiring exposes two problems
E> with the bge driver in 7-CURRENT.  From memory, I think these turned up
E> in the 5.x line, because I wasn't seeing either issue in 4.x.
E> 
E> The first problem is that, once ifconfig'd to an IP address, there will
E> be periodic system-wide stalls.  They generally last a little under a
E> second, are incredibly annoying, and can cause keypresses to be lost
E> at the console.
E> 
E> I instrumented the kernel and, as far as I can tell, once ifconfig'd,
E> the following will happen in brgphy (mii module):
E> 
E> Every second there is a call to brgphy_service() with cmd=MII_TICK.
E> Every five seconds, this function will call brgphy_mii_phy_auto().
E> This function calls brgphy_loop().
E> 
E> In brgphy_loop(), there is a #if 0'd bit of code that device_printf()'s
E> how many times it looped.  I enabled it.
E> 
E> Sometimes it reports zero loops - when this happens there is no stall.
E> On a very pronounced stall, there will be between 3000 and 7000 loops.
E> 
E> (i.e. the stalls appear a bit random because they only get a chance to
E> happen once every five seconds, and sometimes brgphy_loop() doesn't
E> result in a noticeable stall)
E> 
E> The other problem is that bge will never negotiate a working link speed.
E> ifconfig will always return "status: no carrier".
E> 
E> If I force the media to 10baseT/UTP or 100baseTX (either mediaopt
E> full-duplex or not), it will issue a couple more MII_TICKs then stop,
E> ifconfig will return "status: active", there will be no more stalls,
E> and, most importantly, the network connection will actually work.
E> 
E> Is this fixable and actually worth fixing?
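
For readers following the timing in the report above, here is a rough
user-space model of the cadence it describes: one tick per second, with
an autonegotiation attempt on every fifth tick, and no attempts once the
media is forced.  This is only a sketch.  The struct, the function names
and the media_forced flag are hypothetical and are not taken from the
brgphy or bge sources; the interval of 5 comes from the report itself.

```c
#include <stdio.h>

/*
 * User-space model of the tick cadence described in the quoted report.
 * All names here are hypothetical; this is not the brgphy(4) source.
 */
#define AUTONEG_INTERVAL 5	/* ticks between attempts (from the report) */

struct phy_model {
	int ticks_since_autoneg;	/* one tick models one MII_TICK (1 s) */
	int autoneg_attempts;		/* how often the expensive path ran */
	int media_forced;		/* nonzero: fixed media, no autoneg */
};

/* Model one MII_TICK.  Returns 1 if this tick started an autoneg attempt. */
static int
phy_model_tick(struct phy_model *p)
{
	if (p->media_forced)
		return (0);
	if (++p->ticks_since_autoneg < AUTONEG_INTERVAL)
		return (0);
	p->ticks_since_autoneg = 0;
	p->autoneg_attempts++;
	return (1);
}

/* Run the model for a number of seconds; return the attempts made so far. */
static int
phy_model_run(struct phy_model *p, int seconds)
{
	int s;

	for (s = 1; s <= seconds; s++) {
		if (phy_model_tick(p))
			printf("t=%2ds: autoneg attempt (stall window)\n", s);
	}
	return (p->autoneg_attempts);
}
```

Starting from a zeroed struct, phy_model_run(&p, 20) makes four attempts,
at seconds 5, 10, 15 and 20, which matches stalls that can only occur
once every five seconds.  With media_forced set it makes none.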

Please try out the attached patch.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE

--MAH+hnPXVZWQ5cD/
Content-Type: text/plain; charset=koi8-r
Content-Disposition: attachment; filename="bge_link.patch"

Index: sys/dev/bge/if_bge.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/bge/if_bge.c,v
retrieving revision 1.91.2.4
diff -u -r1.91.2.4 if_bge.c
--- sys/dev/bge/if_bge.c	9 Oct 2005 04:15:11 -0000	1.91.2.4
+++ sys/dev/bge/if_bge.c	22 Oct 2005 08:36:05 -0000
@@ -3026,24 +3094,20 @@
 	struct bge_softc *sc;
 {
 	struct mii_data *mii = NULL;
-	struct ifmedia *ifm = NULL;
 	struct ifnet *ifp;
 
-	ifp = sc->bge_ifp;
-
 	BGE_LOCK_ASSERT(sc);
 
+	ifp = sc->bge_ifp;
+
 	if (sc->bge_asicrev == BGE_ASICREV_BCM5705 ||
 	    sc->bge_asicrev == BGE_ASICREV_BCM5750)
 		bge_stats_update_regs(sc);
 	else
 		bge_stats_update(sc);
 	callout_reset(&sc->bge_stat_ch, hz, bge_tick, sc);
-	if (sc->bge_link)
-		return;
 
 	if (sc->bge_tbi) {
-		ifm = &sc->bge_ifmedia;
 		if (CSR_READ_4(sc, BGE_MAC_STS) &
 		    BGE_MACSTAT_TBI_PCS_SYNCHED) {
 			sc->bge_link++;
@@ -3073,8 +3137,6 @@
 		if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
 			bge_start_locked(ifp);
 	}
-
-	return;
 }
 
 static void

--MAH+hnPXVZWQ5cD/--


