Date: Thu, 14 May 2009 16:01:33 -0400
From: Alexander Sack <pisymbol@gmail.com>
To: freebsd-current@freebsd.org
Subject: Re: Broadcom bge(4) panics while shutting down
Message-ID: <3c0b01820905141301h1b08fc0ay1e6a1676b5a149d4@mail.gmail.com>
In-Reply-To: <4A0C7544.6010304@delphij.net>
References: <3c0b01820905141202w113966dp4bfbab73d84d585@mail.gmail.com> <4A0C7544.6010304@delphij.net>

On Thu, May 14, 2009 at 3:47 PM, Xin LI <delphij@delphij.net> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi, Alexander,
>
> Alexander Sack wrote:
>> Hello:
>>
>> Under heavy traffic (100% utilization GIGE on a 2 port BGE card)
>> running BGE CURRENT driver I see panics on shutdown.  The reason is
>> because bge_rxeof() while processing its RX ring of BD's drops the
>> softc lock when it hands it off to its input function.  If bge_stop()
>> is waiting for it, it will then proceed to acquire lock and then
>> quiesce the hardware (reseting the card, clearing out BDs etc.).  Once
>> bge_stop() releases the softc lock, then bge_rxeof() under an
>> interrupt context (no polling here) will reacquire and continue to
>> process the ring which is a bad idea.  It should check to see if the
>> card is still running before continuing processing BDs (i.e. once
>> IF_DRV_RUNNING has been reset by bge_stop(), bge_rxeof() is done, bail
>> out).
>>
>> Here is my first go around with this patch:
>>
>>
>> --- if_bge.c.CURRENT   2009-05-14 14:39:39.000000000 -0400
>> +++ if_bge.c  2009-05-14 14:39:24.000000000 -0400
>> @@ -3081,6 +3081,10 @@
>>               uint16_t                vlan_tag = 0;
>>               int                     have_tag = 0;
>>
>> +             if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) {
>> +                     return;
>> +             }
>> +
>>  #ifdef DEVICE_POLLING
>>               if (ifp->if_capenable & IFCAP_POLLING) {
>>                       if (sc->rxcycles <= 0)
>>
>>
>> This prevents any panics during shutdown under heavy load and AS IT
>> TURNS out (I feel stupid for not looking) that em(4) already had this
>> check in its em_rxeof() function (right at the top of the loop).
>> I'm more than happy changing it to the em style but above seems
>> reasonable to me though I have to verify there isn't anything missing
>> off the loop from a hardware standpoint (I don't think so because
>> bge_stop() did all the dirty work so I believe touching any registers
>> after that from bge_rxeof() is a bad idea).
>>
>> Preliminary testing shows no more panics start and stopping ports
>> under heavy load (panics were almost immediate otherwise).
>>
>> Thoughts?
>
> I think this would solve the problem but I'm not sure whether this would
> increase some overhead on the RX path.  It seems that there is a race
> between bge_release_resources() and bge_intr(), I mean, it might be a
> good idea to "drain" bge_intr() instead?

Are you talking about detach time?  bge_stop() gets called before
bge_release_resources() and stops host interrupts, so where is the race
again?  At that point no more interrupts should be delivered to
bge_intr() (I can confirm this from the spec, which has been released in
the wild).  So why would you "drain" it at this point?  The hardware is
down, including the firmware.

I agree it adds a little overhead to the standard bge_rxeof() path,
which is very sensitive to change.  However, I think the check at the
top is tolerable, since the alternative is a crash.  It is very easy to
reproduce: flood a Broadcom card with traffic, then stop the card and
let the race begin.  It will go down in bge_rxeof() after bge_stop()
releases the lock.

I have not looked at changing anything structurally to improve this
whole predicament, but at a minimum there should be a shield against
this, no?

-aps
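
P.S. For anyone who wants the race spelled out, here is a minimal userland sketch of the pattern. It is NOT driver code: fake_softc, fake_stop(), fake_input(), fake_rxeof() and simulate() are hypothetical names made up for illustration, the "lock" is just a flag, and the stop is forced to land deterministically in the window where the RX loop has dropped the lock. It only shows why re-checking the running flag at the top of the loop keeps the loop from walking descriptors that the stop path has already cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8

/* Hypothetical stand-in for the driver softc; not the real struct bge_softc. */
struct fake_softc {
	bool   running;            /* models IFF_DRV_RUNNING */
	bool   locked;             /* models the softc lock */
	int    ring[RING_SIZE];    /* models the RX buffer descriptors */
	size_t cons, prod;         /* consumer and producer indices */
	int    stop_after;         /* stop races in after this many frames */
	int    delivered;          /* frames handed to the input function */
	int    touched_after_stop; /* descriptors walked after the stop */
};

/* Models the stop path: take the lock, clear the flag, wipe the ring. */
static void fake_stop(struct fake_softc *sc)
{
	assert(!sc->locked);       /* only gets in while the RX loop dropped the lock */
	sc->locked = true;
	sc->running = false;
	for (size_t i = 0; i < RING_SIZE; i++)
		sc->ring[i] = 0;
	sc->locked = false;
}

/* Models the input hand-off; the stop is made to land inside this window. */
static void fake_input(struct fake_softc *sc)
{
	sc->delivered++;
	if (sc->delivered == sc->stop_after)
		fake_stop(sc);
}

/* Models the RX loop, with or without the proposed check at the top. */
static void fake_rxeof(struct fake_softc *sc, bool check_running)
{
	sc->locked = true;
	while (sc->cons != sc->prod) {
		if (check_running && !sc->running)
			break;                     /* the proposed bail-out */
		if (!sc->running)
			sc->touched_after_stop++;  /* walking a ring that was cleared */
		sc->cons = (sc->cons + 1) % RING_SIZE;
		sc->locked = false;            /* drop the lock for the hand-off */
		fake_input(sc);
		sc->locked = true;             /* reacquire and carry on */
	}
	sc->locked = false;
}

/* Runs one pass over six pending descriptors, stopping after the second. */
int simulate(bool check_running)
{
	struct fake_softc sc = { .running = true, .prod = 6, .stop_after = 2 };

	for (size_t i = 0; i < RING_SIZE; i++)
		sc.ring[i] = 1;
	fake_rxeof(&sc, check_running);
	return sc.touched_after_stop;
}
```

Without the check, simulate(false) reports that the remaining four descriptors get walked after the stop; with it, simulate(true) reports none. The real driver has hardware state and a real mutex behind all of this, so treat the sketch as an illustration of the ordering only.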