Date:      Mon, 23 May 2005 12:18:02 +0200
From:      Palle Girgensohn <girgen@FreeBSD.org>
To:        Palle Girgensohn <girgen@FreeBSD.org>, stable@freebsd.org, amd64@freebsd.org
Subject:   system does not reboot after panic (Was: savecore: first and last dump headers disagree)
Message-ID:  <6950A47790357A4F74A4572E@palle.girgensohn.se>
In-Reply-To: <1266D69ABEC58B97FBFE536F@palle.girgensohn.se>
References:  <1266D69ABEC58B97FBFE536F@palle.girgensohn.se>

--On Monday, May 23, 2005 10.30.47 +0200 Palle Girgensohn
<girgen@FreeBSD.org> wrote:

> Hi!
>
> We have an amd64 system that still experiences crashes after installing
> 5.4, mostly during high loads. (It's been unstable all the time, really;
> see previous posts.)
>
> I've added dumpdev="/dev/amrd0s2b", and some time ago I did get
> coredumps, but with the latest versions of the kernel, savecore does not
> give me a dump; instead it says:
>
> savecore: first and last dump headers disagree on /dev/amrd0s2b
> savecore: unsaved dumps found but not saved
>
> What can I do to fix this? I guess I need a core dump to proceed in
> finding the problem?


Peter Holm tipped me off about using savecore -f. Hopefully this will give
me a core next time. This one was already destroyed by swapping. :(
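
For reference, a minimal sketch of what a forced extraction might look like. The device name is the one from this thread. The /var/crash target directory and the dry-run guard are assumptions for illustration, not something from Peter's tip:

```shell
#!/bin/sh
# Sketch only: force savecore to extract a dump whose headers disagree.
DUMPDEV="/dev/amrd0s2b"   # dump device named in this thread
DUMPDIR="/var/crash"      # assumed target directory; adjust as needed

# -f asks savecore to save the dump even if the header information
# is inconsistent or the dump was already marked as cleared.
CMD="savecore -f ${DUMPDIR} ${DUMPDEV}"

if command -v savecore >/dev/null 2>&1 && [ -c "${DUMPDEV}" ]; then
    ${CMD}
else
    # Not on the affected machine: just show what would be run.
    echo "would run: ${CMD}"
fi
```

This only helps if it runs before swap is used again, since the dump lives on the swap partition.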


> Also, the machine does not reboot after a panic. That's an even bigger
> problem, really: it needs hands-on console access to revive every time.

This is really *the* main issue. It won't reboot automatically; it just
sits there waiting for keyboard action... :(  There is no debugger in the
kernel. Would adding KDB and KDB_UNATTENDED help? I doubt it. Anything else
that can be done?
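
For concreteness, the options in question are presumably KDB and KDB_UNATTENDED. A sketch of how they might be added to the KERNEL config follows. Including DDB as the debugger backend is an assumption, and whether any of this changes the hang is untested:

```
# Kernel debugger framework plus the interactive DDB backend (assumed)
options         KDB
options         DDB
# Do not drop into the debugger on panic; intended for unattended machines
options         KDB_UNATTENDED
```

Since the kernel currently has no debugger compiled in, the wait at the console may well come from somewhere else, such as the dump itself stalling, in which case these options would not help.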

/Palle


> Last time it crashed (last week, before updating to 5.4-RELEASE; that
> system was a few weeks older on the RELENG_5_4 branch), it seems to have
> gotten stuck while dumping the core. Can this be the problem?
>
> -------
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0: apic id = 00
> fault virtual address    = 0x00
> ...
> trap number              = 12
> panic: page fault
> cpuid = 0
> boot() called on cpu#0
> Uptime: 1d23h50m36s
> Dumping 2047 MB
>  16 32
> --------
> The cursor sits at the position after "32".
>
> Seems to me it fails to dump the core; can this be it? On previous
> crashes, before dumpdev was set, it would hang before that.
>
> The machine is Dell 2850 w/ Perc raid, Dual CPUs, SMP with hyperthreading
> OFF in BIOS. Enclosing the KERNEL config, almost a GENERIC kernel. I can
> provide more info if required.
>
> So, in short, three questions, really.
>
> - How can I get rid of the crashes? (heh)
> - How can I get the system to do unattended reboot when crashed?
> - How do I get a coredump?
> Any help appreciated.
>
> /Palle
>
>
> Diffing GENERIC vs KERNEL:
>
> --- GENERIC      Tue Apr 12 15:57:01 2005
> +++ KERNEL       Fri Apr 29 22:27:41 2005
> @@ -20,7 +20,9 @@
>
>  machine         amd64
>  cpu             HAMMER
> -ident           GENERIC
> +ident           KERNEL
> +
> +makeoptions     DEBUG=-g
>
>  # To statically compile in device wiring instead of /boot/device.hints
>  #hints          "GENERIC.hints"         # Default places to look for devices.
> @@ -45,7 +47,7 @@
>  options         COMPAT_43               # Needed by COMPAT_LINUX32
>  options         COMPAT_IA32             # Compatible with i386 binaries
>  options         COMPAT_FREEBSD4         # Compatible with FreeBSD4
> -options         COMPAT_LINUX32          # Compatible with i386 linux binaries
> +#options        COMPAT_LINUX32          # Compatible with i386 linux binaries
>  options         SCSI_DELAY=15000        # Delay (in ms) before probing SCSI
>  options         KTRACE                  # ktrace(1) support
>  options         SYSVSHM                 # SYSV-style shared memory
> @@ -64,10 +66,10 @@
>
>  # Enabling NO_MIXED_MODE gives a performance improvement on some motherboards
>  # but does not work with some boards (mostly nVidia chipset based).
> -#options        NO_MIXED_MODE   # Don't penalize working chipsets
> +options         NO_MIXED_MODE   # Don't penalize working chipsets
>
>  # Linux 32-bit ABI support
> -options         LINPROCFS               # Cannot be a module yet.
> +#options        LINPROCFS               # Cannot be a module yet.
>
>  # Bus support.  Do not remove isa, even if you have no isa slots
>  device          acpi
> @@ -260,3 +262,19 @@
>  device          firewire        # FireWire bus code
>  device          sbp             # SCSI over FireWire (Requires scbus and da)
>  device          fwe             # Ethernet over FireWire (non-standard!)
> +
> +# SMP
> +options         SMP
> +
> +# SysV stuff
> +# This provides support for System V shared memory.
> +#
> +options         SYSVSHM
> +options         SYSVSEM
> +options         SYSVMSG
> +options         SHMMAXPGS=65536
> +options         SEMMNI=40
> +options         SEMMNS=240
> +options         SEMUME=40
> +options         SEMMNU=120






