Date: Mon, 10 Jan 2011 18:13:16 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Mark Saad <nonesuch@longcount.org>
Cc: stable@freebsd.org
Subject: Re: Enabling DDB prevent kernel from panicing
Message-ID: <20110111021316.GA84376@icarus.home.lan>
In-Reply-To: <AANLkTimBf1BxuJK4h5WJmreqAo4BQz_TwAbrPvNwYqm=@mail.gmail.com>
References: <AANLkTinp76kxbRu6y0=Qfe9PiuDUPiUuU7zbQ24nkp8B@mail.gmail.com>
 <AANLkTimaaM6Vb-V4-yyocJKax9mFSQxtAJw5mEom=AC-@mail.gmail.com>
 <AANLkTimBf1BxuJK4h5WJmreqAo4BQz_TwAbrPvNwYqm=@mail.gmail.com>

On Mon, Jan 10, 2011 at 07:42:21PM -0500, Mark Saad wrote:
> On Mon, Jan 10, 2011 at 6:59 PM, <nickolasbug@gmail.com> wrote:
> > Hello, Mark
> >
> > 2011/1/11 Mark Saad <nonesuch@longcount.org>:
> >> All
> >> This was originally posted to hackers@
> >>
> >> I have a good question that I cant find an answer for. I believe
> >> found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
> >> kernels on HP's DL360 G4p . The kernel dies with "Fatal trap 12: page
> >> fault while in kernel mode " . The hardware works fine in 7.2-RELEASE
> >> amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64 .
> >>
> >> In 7.3-RELEASE amd64 I can not boot from cd or pxe correctly using the
> >> stock 7.3-RELEASE amd64 kernel however i386 works fine. To see if this
> >> issue was some how fixed in 7.3-RELEASE-p4 amd64 I rebuilt a GENERIC
> >> kernel using patches sources and tried to boot and I got the same
> >> crash.
> >>
> >> Next I rebuilt the kernel with KDB and DDB to see if I could get a
> >> core-dump of the system. I also set loader.conf to
> >>
> >> kernel="kernel.DEBUG"
> >> kern.dumpdev="/dev/da0s1b"
> >>
> >> Next I pxebooted the box and the system does not crash on boot up, it
> >> will easily load a nfs root and work fine. So I copied my debug
> >> kernel, and loader.conf to the local disk and rebooted and it boots
> >> fine from the local disk .
> >
> > Looks like a race condition.
> > Well, you don't need to compile KDB and DDB, just add
> >
> > makeoptions DEBUG=-g
> >
> > into your kernel config file and rebuild kernel.
> >
> > Then after you got a crash dump you can easy debug it (see FreeBSD
> > Developers Handbok):
> > http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html
> >
> >
> > wbr,
> > Nickolas
> >
>
> Sorry let me clarify the issue, When you install a generic
> 7.3-RELEASE amd64 on some of the HP servers I use, the kernel panics
> in boot up
> when it probes the sio driver .
> Here is a part of my dmesg.boot file
>
> atkbd0: [ITHREAD]
> psm0: <PS/2 Mouse> irq 12 on atkbdc0
> psm0: [GIANT-LOCKED]
> psm0: [ITHREAD]
> psm0: model Generic PS/2 mouse, device ID 0
> sio0: configured irq 4 not in bitmap of probed irqs 0
> sio0: port may not be enabled
> sio0: configured irq 4 not in bitmap of probed irqs 0
> sio0: port may not be enabled
> sio0: <Standard PC COM port> port 0x3f8-0x3ff irq 4 on acpi0
> sio0: type 16550A
> sio0: [FILTER]
> Say about here in the boot up , is where the box crashes with the
> above noted error.
>
> If I then boot the same box off a 7.1-RELEASE amd64 netboot server ,
> mount the local disks of the 7.3-RELEASE install and edit the
> /boot/device.hints and comment out the sio hints like this
>
> hint.vga.0.at="isa"
> hint.sc.0.at="isa"
> hint.sc.0.flags="0x100"
> #hint.sio.0.at="isa"
> #hint.sio.0.port="0x3F8"
> #hint.sio.0.flags="0x10"
> #hint.sio.0.irq="4"
> #hint.sio.1.at="isa"
> #hint.sio.1.port="0x2F8"
> #hint.sio.1.irq="3"
> #hint.sio.2.at="isa"
> #hint.sio.2.disabled="1"
> #hint.sio.2.port="0x3E8"
> #hint.sio.2.irq="5"
> #hint.sio.3.at="isa"
> #hint.sio.3.disabled="1"
> #hint.sio.3.port="0x2E8"
> #hint.sio.3.irq="9"
> hint.ppc.0.at="isa"
> hint.ppc.0.irq="7"
>
> then boot the server off the local disks , the server boots correctly.
>
> The odd thing was, I rebuilt a debug 7.3-RELEASE amd64 kernel on
> another working server, and installed it on the broken server and
> booted it off the local disks, with out any changes to the hints file
> and the server booted correctly and I was able to manually break out
> into the debugger , but nothing looked wrong .

The sio(4) driver has been deprecated in RELENG_8, which uses uart(4).
uart(4) is better in a lot of regards, and should also be available for
use on RELENG_7 but you'll need to adjust /etc/ttys to refer to the new
device names (ttyuX vs. ttydX), plus add the uart entries to
/boot/device.hints. I'm mentioning this as a workaround.
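
A rough sketch of what that workaround might look like, assuming the
uart(4) entries simply mirror the first two stock sio(4) hints quoted
above (same ports, flags and IRQs) and that the kernel in use includes
"device uart". None of these lines come from this thread, and the getty
arguments are only the usual defaults, so compare them against the
GENERIC.hints and /etc/ttys shipped with your release before relying on
them:

```
# /boot/device.hints (assumed uart equivalents of the sio.0/sio.1 hints)
hint.uart.0.at="isa"
hint.uart.0.port="0x3F8"
hint.uart.0.flags="0x10"
hint.uart.0.irq="4"
hint.uart.1.at="isa"
hint.uart.1.port="0x2F8"
hint.uart.1.irq="3"

# /etc/ttys (assumed: same fields as the existing ttyd0 line, renamed to ttyu0)
ttyu0   "/usr/libexec/getty std.9600"   dialup  off secure
```

The sio hints would stay commented out (or disabled) alongside these, so
that only one driver tries to claim the port.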

Also worth considering is that the sio(4) ISA probe may be touching
something Bad(tm) as a result, so you might try adding the following
lines to your loader.conf (not a typo) to disable sio(4) entries
entirely:

hint.sio.0.disabled="1"
hint.sio.1.disabled="1"

And see if that improves things. If it does, remove the sio.1.disabled
entry and see if that suffices.

> So to sum this up there is something broken in 7.3-RELEASE but I cant
> figure out what. This server works with a generic install of
> 7.1-RELEASE 7.2-RELEASE , 6.1-RELEASE, 6.2-RELEASE and 6.4-RELEASE in
> both amd64 and i386 , but not 7.3-RELEASE in amd64 . It also worked in
> 7.4-RC1 .
>
> avg recommended I see what changed from r212964 to r212994 I am
> currently looking into this . Has anyone seen this before ?

If the server works fine with 7.4-PRERELEASE/RC1, why are you caring
about 7.3? Upgrade. :-)

--
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |