Date:      Tue, 22 Apr 2003 17:47:32 -0400
From:      Don Bowman <don@sandvine.com>
To:        'John Polstra' <jdp@polstra.com>, net@freebsd.org
Subject:   RE: em net (optical GigE) driver hangs?
Message-ID:  <FE045D4D9F7AED4CBFF1B3B813C8533701B36384@mail.sandvine.com>

> From: John Polstra [mailto:jdp@polstra.com]
> Sent: April 22, 2003 16:12
> To: net@freebsd.org
> Subject: Re: em net (optical GigE) driver hangs?
> 
> 
> In article 
> <FE045D4D9F7AED4CBFF1B3B813C8533701918A83@mail.sandvine.com>,
> Dave Dolson  <ddolson@sandvine.com> wrote:
> > 
> > Has anyone experienced em interface hangs after approx several days
> > of heavy operation?
> > 
> > We are using a system which is mostly RELENG_4_7, using multiple
> > optical em GigE devices.
> > 
> > The symptom is that the interface stops transmitting or receiving,
> > reporting drops on output (no tx descriptors) and input errors (MPC
> > stat-->no receive descriptors).
> > 
> > It turns out that all but 64 transmit descriptors are in use.  The
> > driver is waiting for the "done" flag to be set so it can clean the
> > descriptors.
> > The device is also in the OACTIVE state at this time.
> > 
> > After the interface is brought down (or unplugged), the em watchdog
> > timer goes off 5s later.
> > 
> > We are trying to figure out two things:
> > 1. why did the driver lock up?
> > 2. why didn't the watchdog timer go off earlier?
> > 
> > I think we would be happy to solve #2 given the rarity of the event.
> > Is the RELENG_4 version likely to fix the problem?
> 
> I think the RELENG_4 version is likely to eliminate the problem.  See
> the comment near the define of EM_RDTR in if_em.h (in the RELENG_4
> version of that file, of course).

We saw that, but we are using DEVICE_POLLING, so we assumed it was not
the issue. We think it's instead another problem, which is also solved
in the RELENG_4 driver: em_poll() calls em_start() if the device is
running and there are packets on the queue, and em_start() re-arms the
timer, holding off the watchdog forever.
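
To make the suspected hold-off concrete, here is a small user-space
model of it. This is only a sketch, not the driver code: sim_start(),
sim_poll() and sim_slowtimo() are hypothetical stand-ins for em_start(),
em_poll() and the once-per-tick timer countdown, WATCHDOG_TICKS is an
assumed value taken from the 5s delay reported above, and the
rearm_every_call flag is an assumption about how the two behaviours
differ, not a description of what either driver version actually does.

```c
#include <assert.h>
#include <stdbool.h>

#define WATCHDOG_TICKS 5  /* assumed: the 5s delay reported in the thread */

struct sim_if {
    int  timer;      /* ticks until the watchdog fires; 0 means disarmed */
    int  queued;     /* packets waiting on the send queue */
    int  free_desc;  /* free transmit descriptors; 0 models the hang */
    bool running;    /* interface is up */
};

/* Stand-in for em_start(): hand packets to the hardware, arm the timer. */
static void sim_start(struct sim_if *ifp, bool rearm_every_call)
{
    bool sent = false;

    while (ifp->queued > 0 && ifp->free_desc > 0) {
        ifp->queued--;
        ifp->free_desc--;
        sent = true;
    }
    /* Suspected problem: the timer is re-armed even when nothing was sent. */
    if (rearm_every_call || sent)
        ifp->timer = WATCHDOG_TICKS;
}

/* Stand-in for em_poll(): kick the transmit path if work is queued. */
static void sim_poll(struct sim_if *ifp, bool rearm_every_call)
{
    if (ifp->running && ifp->queued > 0)
        sim_start(ifp, rearm_every_call);
}

/* Stand-in for the per-tick countdown; true when the watchdog fires. */
static bool sim_slowtimo(struct sim_if *ifp)
{
    if (ifp->timer == 0)
        return false;
    return --ifp->timer == 0;
}

/*
 * Run a hung interface (no free descriptors, packets queued) and return
 * the tick at which the watchdog fires, or -1 if it never does.
 * down_at is the tick at which the interface is brought down (0 = never).
 */
int watchdog_fire_tick(bool rearm_every_call, int down_at, int max_ticks)
{
    struct sim_if ifp = { WATCHDOG_TICKS, 10, 0, true };

    assert(max_ticks >= 0);
    for (int tick = 1; tick <= max_ticks; tick++) {
        if (tick == down_at)
            ifp.running = false;
        sim_poll(&ifp, rearm_every_call);
        if (sim_slowtimo(&ifp))
            return tick;
    }
    return -1;
}
```

With rearm_every_call set, the model's watchdog never fires while the
interface stays up, and fires within WATCHDOG_TICKS of the interface
being brought down, which matches the symptom Dave described. With the
flag clear, it fires WATCHDOG_TICKS after the transmit side stalls.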

--don


