Date: Mon, 23 May 2005 12:18:02 +0200
From: Palle Girgensohn <girgen@FreeBSD.org>
To: Palle Girgensohn <girgen@FreeBSD.org>, stable@freebsd.org, amd64@freebsd.org
Subject: system does not reboot after panic (Was: savecore: first and last dump headers disagree)
Message-ID: <6950A47790357A4F74A4572E@palle.girgensohn.se>
In-Reply-To: <1266D69ABEC58B97FBFE536F@palle.girgensohn.se>
References: <1266D69ABEC58B97FBFE536F@palle.girgensohn.se>

--On Monday, May 23, 2005 10.30.47 +0200 Palle Girgensohn <girgen@FreeBSD.org> wrote:

> Hi!
>
> We have an amd64 system that still experiences crashes after installing
> 5.4, mostly during high loads. (It's been unstable all the time, really;
> see previous posts.)
>
> I've added dumpdev="/dev/amrd0s2b", and some time ago I did get
> coredumps, but with latest versions of the kernel, savecore does not give
> me a dump, instead it says:
>
> savecore: first and last dump headers disagree on /dev/amrd0s2b
> savecore: unsaved dumps found but not saved
>
> What can I do to fix this? I guess I need a core dump to proceed in
> finding the problem?

Peter Holm tipped me off about using savecore -f. Hopefully this will give me a
core next time. This one was already destroyed by swapping. :(

> Also, the machine does not reboot after a panic, that's an even bigger
> problem, really, it needs console hands-on to revive every time.

This is really *the* main issue. It won't reboot automatically, it just
sits there waiting for keyboard action... :( There is no debugger in the
kernel; would adding KDB and KDB_UNATTENDED help? I doubt it. Is there anything
else that can be done?

/Palle

> Last time it crashed (last week, before updating to 5.4-RELEASE, that
> system was a few weeks older on the RELENG_5_4 branch), it seems to have
> gotten stuck on dumping the core, can this be the problem:
>
> -------
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0: apic id = 00
> fault virtual address = 0x00
> ...
> trap number = 12
> panic: page fault
> cpuid = 0
> boot() called on cpu#0
> Uptime: 1d23h50m36s
> Dumping 2047 MB
> 16 32
> --------
> The cursor sits at the position after "32".
>
> Seems to me it fails to dump the core, can this be it? On previous
> crashes, before dumpdev was set, it would hang before that.
>
> The machine is Dell 2850 w/ Perc raid, Dual CPUs, SMP with hyperthreading
> OFF in BIOS. Enclosing the KERNEL config, almost a GENERIC kernel.
> I can provide more info if required.
>
> So, in short, three questions, really.
>
> - How can I get rid of the crashes? (heh)
> - How can I get the system to do unattended reboot when crashed?
> - How do I get a coredump?
> Any help appreciated.
>
> /Palle
>
>
> Diffing GENERIC vs KERNEL:
>
> --- GENERIC Tue Apr 12 15:57:01 2005
> +++ KERNEL Fri Apr 29 22:27:41 2005
> @@ -20,7 +20,9 @@
>
>  machine amd64
>  cpu HAMMER
> -ident GENERIC
> +ident KERNEL
> +
> +makeoptions DEBUG=-g
>
>  # To statically compile in device wiring instead of /boot/device.hints
>  #hints "GENERIC.hints" # Default places to look for devices.
> @@ -45,7 +47,7 @@
>  options COMPAT_43 # Needed by COMPAT_LINUX32
>  options COMPAT_IA32 # Compatible with i386 binaries
>  options COMPAT_FREEBSD4 # Compatible with FreeBSD4
> -options COMPAT_LINUX32 # Compatible with i386 linux binaries
> +#options COMPAT_LINUX32 # Compatible with i386 linux binaries
>  options SCSI_DELAY=15000 # Delay (in ms) before probing SCSI
>  options KTRACE # ktrace(1) support
>  options SYSVSHM # SYSV-style shared memory
> @@ -64,10 +66,10 @@
>
>  # Enabling NO_MIXED_MODE gives a performance improvement on some motherboards
>  # but does not work with some boards (mostly nVidia chipset based).
> -#options NO_MIXED_MODE # Don't penalize working chipsets
> +options NO_MIXED_MODE # Don't penalize working chipsets
>
>  # Linux 32-bit ABI support
> -options LINPROCFS # Cannot be a module yet.
> +#options LINPROCFS # Cannot be a module yet.
>
>  # Bus support. Do not remove isa, even if you have no isa slots
>  device acpi
> @@ -260,3 +262,19 @@
>  device firewire # FireWire bus code
>  device sbp # SCSI over FireWire (Requires scbus and da)
>  device fwe # Ethernet over FireWire (non-standard!)
> +
> +# SMP
> +options SMP
> +
> +# SysV stuff
> +# This provides support for System V shared memory.
> +#
> +options SYSVSHM
> +options SYSVSEM
> +options SYSVMSG
> +options SHMMAXPGS=65536
> +options SEMMNI=40
> +options SEMMNS=240
> +options SEMUME=40
> +options SEMMNU=120