Date:      Sun, 08 Oct 2006 23:11:12 +0200
From:      Michal Mertl <mime@traveller.cz>
To:        Kris Kennaway <kris@obsecurity.org>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: em, bge, network problems survey.
Message-ID:  <1160341872.93717.0.camel@genius.i.cz>
In-Reply-To: <20061006182620.GB16605@xor.obsecurity.org>
References:  <45244053.6030706@samsco.org> <20061005200552.GA80162@xor.obsecurity.org> <1160117675.10606.17.camel@genius.i.cz> <20061006182620.GB16605@xor.obsecurity.org>

Kris Kennaway wrote: 
> On Fri, Oct 06, 2006 at 08:54:35AM +0200, Michal Mertl wrote:
> > Kris Kennaway wrote:
> > > On Wed, Oct 04, 2006 at 05:14:27PM -0600, Scott Long wrote:
> > > > All,
> > > > 
> > > > I'm seeing some patterns here with all of the network driver problem 
> > > > reports, but I need more information to help narrow it down further.
> > > > I ask all of you who are having problems to take a minute to fill
> > > > out this survey and return it to Kris Kennaway (on cc:) and myself.
> > > > Thanks.
> > > > 
> > > > 1. Are you experiencing network hangs and/or "timeout" messages on the 
> > > > console?  If yes, please provide a _brief_ description of the problem.
> > > 
> > > OK, next question, to all em users:
> > > 
> > > If your em device is using a shared interrupt, and you are NOT
> > > experiencing timeout problems when using this device, please let me
> > > know.
> > 
> > I haven't seen any timeout messages in a long time, but I do experience
> > a frozen network (and also the already reported panic when doing
> > ifconfig down/up afterwards).
> 
> Are these details in a PR?

I don't know. Since I had seen somebody else report the same issue
(even with a backtrace) and the problem was believed to be understood, I
didn't pay much attention, sorry.

> 
> > I have also seen a strange problem which may be completely unrelated:
> > when doing 'find . -ls' on an SMB-mounted drive, find was listing the
> > contents of the drive but never finished. The network seemed dead, but
> > when I interrupted find with Ctrl-C I got the replies to the pings sent
> > while it was running (with round-trip times of thousands of ms). This
> > looks like something was preventing RX from working and the packets
> > were just queued somewhere. I believe I should be able to reproduce it
> > easily.
> > 
> > genius# vmstat -i
> > interrupt                          total       rate
> > irq0: clk                       43784465       1000
> > irq1: atkbd0                       66248          1
> > irq5: pcm0                          5877          0
> > irq8: rtc                        5603682        128
> > irq9: acpi0                         8820          0
> > irq11: fwohci0 em*                205749          4
> > irq12: psm0                       586848         13
> > irq14: ata0                       340844          7
> > irq15: ata1                           61          0
> > Total                           50602594       1155
> > 
> > I don't remember the debug.mpsafenet tunable being mentioned in the
> > threads about these problems. It prevents all the problems on my system
> > (a UP, non-APIC system), including the SMB issue mentioned above.
> 
> I suspect both of your problems are some unrelated issue.  I'd need
> root access & a test setup before I can say more though.

It is possible, but your patch to change INTR_FAST to INTR_MPSAFE seems
to help with both of these too.

I plan to try to reproduce the temporary SMB network lockup (with the
original CURRENT driver) and then probably force a panic to get a core.
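
For anyone answering the shared-interrupt question against 'vmstat -i'
output like the one quoted above, here is a rough sketch in Python. The
function name shared_irqs and the parsing rule (an "irqN:" label, one or
more device names, then the total and rate columns) are assumptions based
only on the output shown in this message, not on any documented format.

```python
import re

# Sample rows copied from the vmstat -i output quoted in this message.
SAMPLE = """\
interrupt                          total       rate
irq0: clk                       43784465       1000
irq1: atkbd0                       66248          1
irq11: fwohci0 em*                205749          4
irq14: ata0                       340844          7
Total                           50602594       1155
"""

# Assumed row shape: "irqN: <device names>  <total>  <rate>".
ROW = re.compile(r"\s*(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")


def shared_irqs(vmstat_output):
    """Return {irq: [device names]} for rows that list more than one device."""
    shared = {}
    for line in vmstat_output.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, "Total" row, or anything unrecognized
        devices = match.group(2).split()
        if len(devices) > 1:
            # Names are kept exactly as printed, including a trailing "*".
            shared[match.group(1)] = devices
    return shared
```

Calling shared_irqs(SAMPLE) reports only irq11, with fwohci0 and em* listed
together, which is the shared line in my output. Quoted ("> > ") lines would
need the prefix stripped first.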

Michal



