Date: Wed, 5 Sep 2001 10:25:22 -0500 (CDT)
From: presence <kkanno@churchofinformationwarfare.org>
To: <freebsd-stable@FreeBSD.ORG>
Subject: Re: rebooting under load [solved]
Message-ID: <Pine.BSF.4.33L2.0109051024460.372-100000@centipede.symmetric.net>
In-Reply-To: <20010904141119.A34517@mikea.ath.cx>

After swapping out all memory and going back to the single OEM CPU, the
problems persisted. I then updated to 4.4-RC from Sep 4, 2001 and the
machine stopped crashing. Back in SMP with all original hardware,
everything seems OK now.

The old kernel was 4.3-RELEASE cvsuped from Aug 2, 2001.

KEN

On Tue, 4 Sep 2001, mikea wrote:

> On Tue, Sep 04, 2001 at 01:57:06PM -0500, presence wrote:
> > I've got a machine that has been suddenly rebooting. I can make it crash
> > at will by bringing the load to about 200 with the script below. My other
> > single CPU boxes can handle this script with 3000 primes instances just
> > fine with 512MB of RAM.
> >
> > Here is what I get when it goes down. What does it mean?
> >
> > bash-2.05# panic: vm_fault: fault on nofault entry, addr: cbbd3000
> > mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
> > boot() called on cpu#1
> >
> > syncing disks... 15
> > done
> > Uptime: 5m13s
> > Automatic reboot in 15 seconds - press a key on the console to abort
>
> [snip dmesg output]
>
> > Program that when run twice crashes my machine, even before it seems to
> > swap out.
> >
> > #!/usr/bin/perl
> > #
> > # Ken Kanno 08-31-2001
> > # I want a load average of 1000
> > # for fun
> >
> > for($a=0; $a<100; $a++)
> > {
> > print "starting primes instance # :$a \n";
> > system "nice -20 primes 10000 > /dev/null &";
> > #sleep 1;
> >
> > }
>
> Sounds like you might have something (CPU? memory?) just on the
> edge of failing, and this load pushes it over the edge by heating
> (or driving too fast) the near-failing component.
>
> What happens when you do a "make -j 16 buildworld"?
>
> Are all your fans working? Does removing a stick of memory cause
> it to _not_ fail? Is your power supply overloaded or marginal?
> Are you overclocking the CPUs? If you are, then does it fail at
> normal clock rates? Does opening the case and pointing a _big_ fan
> at the motherboard change things?
>
> Just some things to try; no guarantee that these tests will find
> the problem, but they might, and they're easy. Others may have
> more or better ideas.
>
> --
> Mike Andrews
> mikea@mikea.ath.cx
> Tired old sysadmin since 1964
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-stable" in the body of the message