Date:      Wed, 29 Jul 2009 18:34:15 -0400
From:      "Alexandre \"Sunny\" Kovalenko" <gaijin.k@gmail.com>
To:        Anton Shterenlikht <mexas@bristol.ac.uk>
Cc:        "O. Hartmann" <ohartman@mail.zedat.fu-berlin.de>, freebsd-current@freebsd.org, "O. Hartmann" <ohartman@zedat.fu-berlin.de>, freebsd-ia64@freebsd.org
Subject:   Re: FreeBSD 8.0-BETA2/amd64 crashes on SMP under load
Message-ID:  <1248906855.1459.8.camel@RabbitsDen>
In-Reply-To: <20090728144555.GD75439@mech-cluster241.men.bris.ac.uk>
References:  <4A6DB30B.20705@zedat.fu-berlin.de> <4A6DB9F1.7050404@haruhiism.net> <4A6E0620.6070200@mail.zedat.fu-berlin.de> <20090727210428.GA30253@mech-cluster241.men.bris.ac.uk> <20090728103545.GA22380@mech-cluster241.men.bris.ac.uk> <4A6F09BA.2020703@zedat.fu-berlin.de> <20090728144555.GD75439@mech-cluster241.men.bris.ac.uk>

On Tue, 2009-07-28 at 15:45 +0100, Anton Shterenlikht wrote:
> On Tue, Jul 28, 2009 at 02:22:50PM +0000, O. Hartmann wrote:
> > Anton Shterenlikht wrote:
> > > On Mon, Jul 27, 2009 at 10:04:28PM +0100, Anton Shterenlikht wrote:
> > >> On Mon, Jul 27, 2009 at 09:55:12PM +0200, O. Hartmann wrote:
> > >>> Kamigishi Rei wrote:
> > >>>> O. Hartmann wrote:
> > >>>>> I have the problem of crashing FreeBSD 8.0-BETA2/amd64 under load on
> > >>>>> all of our SMP boxes. Is there an issue known at the moment? If not, I
> > >>>>> will prepare the kernel for witnessing and provide more information,
> > >>>>> if you wish.
> > >>>> A quick question: what is in the crash message, i.e. the backtrace?
> > >>>> And what kind of crash is it - a panic() or a fatal trap?
> > >>> On the 8-core server box, I sometimes see :
> > >>>
> > >>> Fatal trap 12: page fault while in kernel mode
> > >>> fault code              = supervisor read, page not present
> > >> Not sure if it's related, but on ia64 SMP (2 cpus) with 8.0-current and
> > >> later with 8.0-beta1 (I haven't built beta2 yet) I'm getting crashes
> > >> under load every so often. E.g. buildworld -j8 is likely to crash the
> > >> box. No messages, just a sudden freeze, no backtrace or panic, and then reboot.
> > >>
> > >> If load is less heavy, e.g. fewer processes and some idle time, the
> > >> problem doesn't seem to appear.
> > >>
> > >> I'm happy to do any further testing, if suggested.
> > > 
> > > my ia64 8.0-beta1 SMP box died again on
> > > make -j8 buildworld
> > > with no panic or log entries.
> > > 
> > > Is it possible that some kernel variable needs to
> > > be increased? E.g. kern.maxproc, kern.maxfiles, etc.
> > > Or perhaps I'm talking complete rubbish..
> > > 
> > 
> > I suggest you try again with a UP kernel - a suggestion from a
> > kernel noob, sorry. My SMP boxes now work with a UP kernel, but they are
> > rather slow although they have modern Intel C2D/Penryn cores.
> 
> I need SMP for OpenMP codes. It's a shame if SMP is buggy, but
> I guess it all comes down to the small user base.
> 
Before you go down that path, which, IMHO, is as counterproductive as it
is incorrect, could you please show the output of

sysctl debug | grep panic

and check whether the output of

savecore -vC

makes sense to you.

-- 
Alexandre Kovalenko (Олександр Коваленко)