Date:      Wed, 5 Sep 2001 10:25:22 -0500 (CDT)
From:      presence <kkanno@churchofinformationwarfare.org>
To:        <freebsd-stable@FreeBSD.ORG>
Subject:   Re: rebooting under load [solved]
Message-ID:  <Pine.BSF.4.33L2.0109051024460.372-100000@centipede.symmetric.net>
In-Reply-To: <20010904141119.A34517@mikea.ath.cx>

After swapping out all memory and going back to the single OEM CPU, the
problems persisted. I then updated to 4.4-RC from Sep 4, 2001 and the
machine stopped crashing. Back in SMP with all original hardware,
everything seems OK now.

The old kernel was 4.3-RELEASE cvsuped from Aug 2, 2001.
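
For anyone who wants to repeat the load test on their own hardware, here is
a minimal sh sketch of the same idea as the Perl script quoted below: start
N background copies of a command and wait for them. The function name
spawn_load and its arguments are made up for illustration and are not part
of any FreeBSD tool. Unlike the original script it waits for the jobs to
finish, which is an assumption about what is convenient, not a requirement.

```shell
#!/bin/sh
# Hypothetical helper: start COUNT background copies of a command,
# discard their output, and block until all of them have exited.
# usage: spawn_load COUNT COMMAND [ARGS...]
spawn_load() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        # Run the remaining arguments as one command in the background.
        "$@" > /dev/null 2>&1 &
        i=$((i + 1))
    done
    # Wait for every background job started above.
    wait
    echo "finished $count instances"
}

# Example resembling the original test (assumes primes(6) is installed;
# this runs for a long time and is meant to drive the load up):
# spawn_load 100 nice -n 20 primes 10000
```

Starting with a small count and a short-lived command first is a cheap way
to confirm the loop behaves before pointing it at a box you care about.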

KEN


On Tue, 4 Sep 2001, mikea wrote:

> On Tue, Sep 04, 2001 at 01:57:06PM -0500, presence wrote:
> > I've got a machine that has been suddenly rebooting. I can make it crash
> > at will by bringing the load to about 200 with the script below. My other
> > single CPU boxes can handle this script with 3000 primes instances just
> > fine with 512MB of RAM.
> >
> > Here is what I get when it goes down. What does it mean?
> >
> > bash-2.05# panic: vm_fault: fault on nofault entry, addr: cbbd3000
> > mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
> > boot() called on cpu#1
> >
> > syncing disks... 15
> > done
> > Uptime: 5m13s
> > Automatic reboot in 15 seconds - press a key on the console to abort
>
> [snip dmesg output]
>
> > Program that when run twice crashes my machine, even before it seems to
> > swap out.
> >
> > #!/usr/bin/perl
> > #
> > # Ken Kanno 08-31-2001
> > # I want a load average of 1000
> > # for fun
> >
> > for($a=0; $a<100; $a++)
> > {
> >     print "starting primes instance # :$a \n";
> >     system "nice -20 primes 10000 > /dev/null &";
> >     #sleep 1;
> >
> > }
>
> Sounds like you might have something (CPU? memory?) just on the
> edge of failing, and this load pushes it over the edge by heating
> (or driving too fast) the near-failing component.
>
> What happens when you do a "make -j 16 buildworld"?
>
> Are all your fans working? Does removing a stick of memory cause
> it to _not_ fail? Is your power supply overloaded or marginal?
> Are you overclocking the CPUs? If you are, then does it fail at
> normal clock rates? Does opening the case and pointing a _big_ fan
> at the motherboard change things?
>
> Just some things to try; no guarantee that these tests will find
> the problem, but they might, and they're easy. Others may have
> more or better ideas.
>
> --
> Mike Andrews
> mikea@mikea.ath.cx
> Tired old sysadmin since 1964
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-stable" in the body of the message
>

