Date:      Tue, 25 Sep 2007 10:58:22 -0400
From:      "Benjie Chen" <benjie@addgene.org>
To:        "Kris Kennaway" <kris@freebsd.org>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Kernel panic on PowerEdge 1950 under certain stress load
Message-ID:  <c53be070709250758v4c8da54ft7a1d6bbc1400368f@mail.gmail.com>
In-Reply-To: <46F8D12E.7060202@FreeBSD.org>
References:  <c53be070709211526j2178ebb7ia6ea39e1a5df303c@mail.gmail.com> <fd84qf$ejl$1@sea.gmane.org> <c53be070709240842h6875d45ct761d0fa5790f70e2@mail.gmail.com> <46F8D12E.7060202@FreeBSD.org>

You are right, they may not be the same. At first glance they seemed
similar based on the descriptions of the problems -- system stable, then
under network-related load, a panic after varying time intervals. I
just assumed that the kernel is typically stable enough that this kind of
panic is rare (I have been using FBSD for 7 or 8 years now, under heavy load
as well, and never had kernel panics to deal with).

Upon closer look at the trace and the problem, they may not be the same,
since the one on those web pages was about the route code and mine breaks in
only one place - waiting for a lock. Again, I will see if I can get a dump
when I return to the office.

I did reboot the system and set mpsafenet to 0 and I have not had a crash
since then (almost a day) running the same load, so that's positive: at
least that may be the workaround, and I don't need Dell to send me
new memory modules to try...

Kris or Ivan: I was wondering if you could briefly explain what you guess
the problem might be. It seems like a race condition, but I am curious to
know more of the details...


Thanks,
Benjie



On 9/25/07, Kris Kennaway <kris@freebsd.org> wrote:
>
> Benjie Chen wrote:
> > Ivan and Kris,
> >
> > I will try to get a kernel trace -- it may not happen for awhile since I am
> > not in the office and working remotely for awhile so it may not be easy to
> > get a trace... but I will check.
> >
> > It looks like the problem reported by that link, and some of the links from
> > there though...
>
> Does it really? i.e. did you compare the function names in detail and
> find that they match precisely, or do you just mean "they are both
> panics of some description and I dunno what it all means"? :)  I ask
> because the linked trace does not involve a spinlock, which means it
> cannot be precisely the same trace.
>
> Kris
>
>


-- 
Benjie Chen, Ph.D.
Addgene, a better way to share plasmids
www.addgene.org


