Date:      Fri, 25 Aug 2006 10:22:45 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-current@freebsd.org, pyunyh@gmail.com
Cc:        Oleg Bulyzhin <oleg@freebsd.org>
Subject:   Re: call for bge(4) testers
Message-ID:  <200608251022.46120.jhb@freebsd.org>
In-Reply-To: <20060824010746.GC22634@cdnetworks.co.kr>
References:  <20060822042023.GC12848@cdnetworks.co.kr> <20060824004354.GC25876@lath.rinet.ru> <20060824010746.GC22634@cdnetworks.co.kr>

On Wednesday 23 August 2006 21:07, Pyun YongHyeon wrote:
> On Thu, Aug 24, 2006 at 04:43:54AM +0400, Oleg Bulyzhin wrote:
>  > On Thu, Aug 24, 2006 at 09:30:35AM +0900, Pyun YongHyeon wrote:
>  > > On Wed, Aug 23, 2006 at 04:54:34PM +0400, Oleg Bulyzhin wrote:
>  > >  > On Wed, Aug 23, 2006 at 04:40:35PM +0400, Oleg Bulyzhin wrote:
>  > >  > > On Wed, Aug 23, 2006 at 09:55:54AM +0900, Pyun YongHyeon wrote:
>  > >  > > > On Wed, Aug 23, 2006 at 12:43:42AM +0400, Oleg Bulyzhin wrote:
>  > >  > > >  > On Tue, Aug 22, 2006 at 02:44:34PM +0200, Michael Reifenberger wrote:
>  > >  > > >  > > On Tue, 22 Aug 2006, Pyun YongHyeon wrote:
>  > >  > > >  > > ...
>  > >  > > >  > > >I'm not familiar with vge(4) and don't have hardwares supported by
>  > >  > > >  > > >vge(4). Because vge(4) supports a kind of interrupt moderation, there
>  > >  > > >  > > >is a possibility to have the same issue seen on em(4).
>  > >  > > >  > > >If you want my blind patch I can send a patch for you.
>  > >  > > >  > > >
>  > >  > > >  > > Yes, please!
>  > >  > > >  > > I can test it (on RELENG_6 though).
>  > >  > > >  > 
>  > >  > > >  > I have an idea why those timeouts can happen. Could you please test
>  > >  > > >  > attached patch? It may help (or may not). Anyway would be fine
>  > >  > > >  > to know results.
>  > >  > > >  > 
>  > >  > > > 
>  > >  > > > Since vge(4) uses MTX_RECURSE mutex and miibus(4) handler is
>  > >  > > > protected with the mutex I guess it wouldn't help much.
>  > >  > > > I guess it needs a separate mutex to protect miibus(4) handler
>  > >  > > > and should remove the use of MTX_RECURSE.
>  > >  > > 
>  > >  > > Hmm.
>  > >  > > 1) _ifmedia_upd() & _ifmedia_sts() functions are not called from mii layer.
>  > >  > > 2) As i can see MII layer is not protected by anything, unless you
>  > >  > > specially acquire driver lock prior to calling mii_ function.
>  > >  > > Locking ifmedia callbacks should be done (though, it may not help
>  > >  > > with watchdogs timeout), otherwise we have race on accessing PHY registers.
>  > >  > > (kern/98738).
>  > >  > > 
>  > >  > > As i can see, random watchdog timeouts was reported for em, bge, vge, sk
>  > >  > > (and maybe others, those ones which i remember) drivers.
>  > >  > > All of them has unlocked _ifmedia_ functions.
>  > >  > > 
>  > >  > > My idea was: perhaps, under certain condition, concurrent access to PHY could
>  > >  > > lead to hardware deadlock.
>  > >  > > 
>  > >  > > 
>  > >  > > > vge(4) also has a bug
>  > >  > > > if mbuf chain is too long(7 or higher) and defragmentation with
>  > >  > > > m_defrag(9) fails it would access an invalid mbuf chain.
>  > >  > > > All these requires lots of work and need a real hardware.
>  > >  > > > Oleg, if you have hardware, would you fix it?
>  > >  > > 
>  > >  > > Unfortunately i don't have vge hardware.
>  > >  > > > 
>  > >  > > > -- 
>  > >  > > > Regards,
>  > >  > > > Pyun YongHyeon
>  > >  > > 
>  > >  > > -- 
>  > >  > > Oleg.
>  > >  > > 
>  > >  > 
>  > >  > Forgot one thing: i think we need no dedicated mutex for mii layer if we lock
>  > >  > ifmedia callbacks.
>  > >  > 
>  > > 
>  > > If we use the driver mutex in MII access it would require MTX_RECURSE
>  > > mutex. I want simple MTX_DEF mutex.
>  > 
>  > Could you please explain why MTX_RECURSE is required?
>  > 
> 
> I can't remember what caused this. Need more coffee. :-(
> If my memory serve me right it's related with ioctls.
> I guess you can easily experiment with removing MTX_RECURSE flag
> in the driver.

The fix for that is to not hold the driver lock in the ioctl routine
when you call the mii function for media changes, but to hold the
lock only in the ifmedia callbacks.  This is what I did in several
device drivers I locked or fixed the locking in.
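
A rough userland sketch of that pattern follows. It is not real driver
code: a POSIX error-checking mutex stands in for a non-recursive MTX_DEF
driver mutex, and every name in it (xx_softc, xx_ifmedia_upd,
fake_media_ioctl, xx_ioctl_unlocked, xx_ioctl_locked) is hypothetical.
It only shows why holding the lock across the media call in the ioctl
path would need a recursive mutex, while locking only in the callback
does not.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical softc: one non-recursive lock plus fake PHY state. */
struct xx_softc {
	pthread_mutex_t	mtx;	/* stands in for a MTX_DEF driver mutex */
	int		media;	/* stands in for PHY register state */
};

static int
xx_init(struct xx_softc *sc)
{
	pthread_mutexattr_t attr;
	int error;

	error = pthread_mutexattr_init(&attr);
	if (error != 0)
		return (error);
	/* ERRORCHECK reports a relock by the owner as EDEADLK, not a hang. */
	error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (error == 0)
		error = pthread_mutex_init(&sc->mtx, &attr);
	pthread_mutexattr_destroy(&attr);
	sc->media = 0;
	return (error);
}

/* ifmedia-style callback: acquires the driver lock itself. */
static int
xx_ifmedia_upd(struct xx_softc *sc, int media)
{
	int error;

	error = pthread_mutex_lock(&sc->mtx);
	if (error != 0)
		return (error);	/* EDEADLK if the caller already holds it */
	sc->media = media;
	error = pthread_mutex_unlock(&sc->mtx);
	assert(error == 0);
	return (0);
}

/* Stand-in for the generic media layer calling back into the driver. */
static int
fake_media_ioctl(struct xx_softc *sc, int media)
{
	return (xx_ifmedia_upd(sc, media));
}

/* Suggested shape: the ioctl path makes the media call without the lock. */
static int
xx_ioctl_unlocked(struct xx_softc *sc, int media)
{
	return (fake_media_ioctl(sc, media));
}

/* Problematic shape: lock held across the call, so the callback relocks. */
static int
xx_ioctl_locked(struct xx_softc *sc, int media)
{
	int error;

	error = pthread_mutex_lock(&sc->mtx);
	if (error != 0)
		return (error);
	error = fake_media_ioctl(sc, media);
	pthread_mutex_unlock(&sc->mtx);
	return (error);
}
```

In the sketch, xx_ioctl_unlocked() succeeds with a plain mutex, while
xx_ioctl_locked() fails inside the callback because the same thread
tries to take the lock twice.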

-- 
John Baldwin


