From: Palle Girgensohn
Date: Mon, 23 May 2005 12:18:02 +0200
To: Palle Girgensohn, stable@freebsd.org, amd64@freebsd.org
Subject: system does not reboot after panic (Was: savecore: first and last dump headers disagree)
Message-ID: <6950A47790357A4F74A4572E@palle.girgensohn.se>
In-Reply-To: <1266D69ABEC58B97FBFE536F@palle.girgensohn.se>
References: <1266D69ABEC58B97FBFE536F@palle.girgensohn.se>
List-Id: Porting FreeBSD to the AMD64 platform

--On Monday, May 23, 2005
10.30.47 +0200 Palle Girgensohn wrote:

> Hi!
>
> We have an amd64 system that still experiences crashes after installing
> 5.4, mostly during high loads. (It's been unstable all the time, really;
> see previous posts.)
>
> I've added dumpdev="/dev/amrd0s2b", and some time ago I did get
> coredumps, but with the latest versions of the kernel, savecore does not
> give me a dump. Instead it says:
>
> savecore: first and last dump headers disagree on /dev/amrd0s2b
> savecore: unsaved dumps found but not saved
>
> What can I do to fix this? I guess I need a core dump to proceed in
> finding the problem?

Peter Holm tipped me off about using savecore -f. Hopefully this will give me a
core next time. This one was already destroyed by swapping. :(

> Also, the machine does not reboot after a panic, that's an even bigger
> problem, really, it needs console hands-on to revive every time.

This is really *the* main issue. It won't reboot automatically, it just
sits there waiting for keyboard action... :( There is no debugger in the
kernel; would adding KDB and KDB_UNATTENDED help? I doubt it. Anything else
that can be done?

/Palle

> Last time it crashed (last week, before updating to 5.4-RELEASE, that
> system was a few weeks older on the RELENG_5_4 branch), it seems to have
> gotten stuck while dumping the core. Can this be the problem?
>
> -------
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0: apic id = 00
> fault virtual address = 0x00
> ...
> trap number = 12
> panic: page fault
> cpuid = 0
> boot() called on cpu#0
> Uptime: 1d23h50m36s
> Dumping 2047 MB
> 16 32
> --------
> The cursor sits at the position after "32".
>
> Seems to me it fails to dump the core, can this be it? On previous
> crashes, before dumpdev was set, it would hang before that.
>
> The machine is a Dell 2850 w/ Perc raid, dual CPUs, SMP with hyperthreading
> OFF in BIOS. Enclosing the KERNEL config, almost a GENERIC kernel. I can
> provide more info if required.
>
> So, in short, three questions, really.
>
> - How can I get rid of the crashes? (heh)
> - How can I get the system to do unattended reboot when crashed?
> - How do I get a coredump?
>
> Any help appreciated.
>
> /Palle
>
>
> Diffing GENERIC vs KERNEL:
>
> --- GENERIC Tue Apr 12 15:57:01 2005
> +++ KERNEL Fri Apr 29 22:27:41 2005
> @@ -20,7 +20,9 @@
>
>  machine amd64
>  cpu HAMMER
> -ident GENERIC
> +ident KERNEL
> +
> +makeoptions DEBUG=-g
>
>  # To statically compile in device wiring instead of /boot/device.hints
>  #hints "GENERIC.hints" # Default places to look for devices.
> @@ -45,7 +47,7 @@
>  options COMPAT_43 # Needed by COMPAT_LINUX32
>  options COMPAT_IA32 # Compatible with i386 binaries
>  options COMPAT_FREEBSD4 # Compatible with FreeBSD4
> -options COMPAT_LINUX32 # Compatible with i386 linux binaries
> +#options COMPAT_LINUX32 # Compatible with i386 linux binaries
>  options SCSI_DELAY=15000 # Delay (in ms) before probing SCSI
>  options KTRACE # ktrace(1) support
>  options SYSVSHM # SYSV-style shared memory
> @@ -64,10 +66,10 @@
>
>  # Enabling NO_MIXED_MODE gives a performance improvement on some motherboards
>  # but does not work with some boards (mostly nVidia chipset based).
> -#options NO_MIXED_MODE # Don't penalize working chipsets
> +options NO_MIXED_MODE # Don't penalize working chipsets
>
>  # Linux 32-bit ABI support
> -options LINPROCFS # Cannot be a module yet.
> +#options LINPROCFS # Cannot be a module yet.
>
>  # Bus support. Do not remove isa, even if you have no isa slots
>  device acpi
> @@ -260,3 +262,19 @@
>  device firewire # FireWire bus code
>  device sbp # SCSI over FireWire (Requires scbus and da)
>  device fwe # Ethernet over FireWire (non-standard!)
> +
> +# SMP
> +options SMP
> +
> +# SysV stuff
> +# This provides support for System V shared memory.
> +#
> +options SYSVSHM
> +options SYSVSEM
> +options SYSVMSG
> +options SHMMAXPGS=65536
> +options SEMMNI=40
> +options SEMMNS=240
> +options SEMUME=40
> +options SEMMNU=120
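
For reference, the crash-dump setup discussed in this message could be sketched roughly as follows. This is only a sketch under stated assumptions: the dump device is the one named in the message, while the /var/crash directory is an assumption (it is not mentioned in the thread). It does not address the hang during "Dumping 2047 MB" or the missing automatic reboot.

```shell
# /etc/rc.conf fragment: dump device as given in the message
dumpdev="/dev/amrd0s2b"
# Assumed directory for saved cores (not stated in the thread)
dumpdir="/var/crash"

# After the next crash and reboot, savecore could be run by hand with -f,
# which forces the dump to be saved even when the headers are inconsistent:
#   savecore -f /var/crash /dev/amrd0s2b
```

Whether -f actually recovers a usable core depends on the dump not having been overwritten by swap activity first, as happened with the dump mentioned above.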