Date:      Tue, 11 Jan 2011 11:21:12 -0500
From:      Mark Saad <nonesuch@longcount.org>
To:        Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc:        stable@freebsd.org
Subject:   Re: Enabling DDB prevent kernel from panicing
Message-ID:  <AANLkTi=K9Fg1yHKsN5bAB3AptunU8LmRtnyKdd35aWBn@mail.gmail.com>
In-Reply-To: <AANLkTinHufV=EfUbizbs0HaotqC7tQ6=iUi1a%2BgASmr9@mail.gmail.com>
References:  <AANLkTinp76kxbRu6y0=Qfe9PiuDUPiUuU7zbQ24nkp8B@mail.gmail.com> <AANLkTimaaM6Vb-V4-yyocJKax9mFSQxtAJw5mEom=AC-@mail.gmail.com> <AANLkTimBf1BxuJK4h5WJmreqAo4BQz_TwAbrPvNwYqm=@mail.gmail.com> <20110111021316.GA84376@icarus.home.lan> <AANLkTinHufV=EfUbizbs0HaotqC7tQ6=iUi1a%2BgASmr9@mail.gmail.com>

On Mon, Jan 10, 2011 at 10:29 PM, Mark Saad <nonesuch@longcount.org> wrote:
> On Mon, Jan 10, 2011 at 9:13 PM, Jeremy Chadwick
> <freebsd@jdc.parodius.com> wrote:
>> On Mon, Jan 10, 2011 at 07:42:21PM -0500, Mark Saad wrote:
>>> On Mon, Jan 10, 2011 at 6:59 PM, <nickolasbug@gmail.com> wrote:
>>> > Hello, Mark
>>> >
>>> > 2011/1/11 Mark Saad <nonesuch@longcount.org>:
>>> >> All
>>> >> This was originally posted to hackers@
>>> >>
>>> >> I have a good question that I can't find an answer for. I believe I
>>> >> found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
>>> >> kernels on HP's DL360 G4p. The kernel dies with "Fatal trap 12: page
>>> >> fault while in kernel mode". The hardware works fine in 7.2-RELEASE
>>> >> amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64.
>>> >>
>>> >> In 7.3-RELEASE amd64 I cannot boot from CD or PXE correctly using the
>>> >> stock 7.3-RELEASE amd64 kernel; however, i386 works fine. To see if this
>>> >> issue was somehow fixed in 7.3-RELEASE-p4 amd64, I rebuilt a GENERIC
>>> >> kernel using patched sources and tried to boot, and I got the same
>>> >> crash.
>>> >>
>>> >> Next I rebuilt the kernel with KDB and DDB to see if I could get a
>>> >> core dump of the system. I also set loader.conf to
>>> >>
>>> >> kernel="kernel.DEBUG"
>>> >> kern.dumpdev="/dev/da0s1b"
>>> >>
>>> >> Next I PXE-booted the box, and the system does not crash on boot up; it
>>> >> will easily load an NFS root and work fine. So I copied my debug
>>> >> kernel and loader.conf to the local disk and rebooted, and it boots
>>> >> fine from the local disk.
>>> >
>>> > Looks like a race condition.
>>> > Well, you don't need to compile KDB and DDB, just add
>>> >
>>> > makeoptions DEBUG=-g
>>> >
>>> > into your kernel config file and rebuild kernel.
>>> >
>>> > Then after you get a crash dump you can easily debug it (see the FreeBSD
>>> > Developers' Handbook):
>>> > http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html
>>> >
>>> >
>>> > wbr,
>>> > Nickolas
>>> >
>>>
>>> Sorry, let me clarify the issue. When you install a generic
>>> 7.3-RELEASE amd64 on some of the HP servers I use, the kernel panics
>>> during boot up when it probes the sio driver. Here is part of my
>>> dmesg.boot file:
>>>
>>> atkbd0: [ITHREAD]
>>> psm0: <PS/2 Mouse> irq 12 on atkbdc0
>>> psm0: [GIANT-LOCKED]
>>> psm0: [ITHREAD]
>>> psm0: model Generic PS/2 mouse, device ID 0
>>> sio0: configured irq 4 not in bitmap of probed irqs 0
>>> sio0: port may not be enabled
>>> sio0: configured irq 4 not in bitmap of probed irqs 0
>>> sio0: port may not be enabled
>>> sio0: <Standard PC COM port> port 0x3f8-0x3ff irq 4 on acpi0
>>> sio0: type 16550A
>>> sio0: [FILTER]
>>> About here in the boot up is where the box crashes with the
>>> above noted error.
>>>
>>> If I then boot the same box off a 7.1-RELEASE amd64 netboot server,
>>> mount the local disks of the 7.3-RELEASE install, and edit
>>> /boot/device.hints to comment out the sio hints like this
>>>
>>> hint.vga.0.at="isa"
>>> hint.sc.0.at="isa"
>>> hint.sc.0.flags="0x100"
>>> #hint.sio.0.at="isa"
>>> #hint.sio.0.port="0x3F8"
>>> #hint.sio.0.flags="0x10"
>>> #hint.sio.0.irq="4"
>>> #hint.sio.1.at="isa"
>>> #hint.sio.1.port="0x2F8"
>>> #hint.sio.1.irq="3"
>>> #hint.sio.2.at="isa"
>>> #hint.sio.2.disabled="1"
>>> #hint.sio.2.port="0x3E8"
>>> #hint.sio.2.irq="5"
>>> #hint.sio.3.at="isa"
>>> #hint.sio.3.disabled="1"
>>> #hint.sio.3.port="0x2E8"
>>> #hint.sio.3.irq="9"
>>> hint.ppc.0.at="isa"
>>> hint.ppc.0.irq="7"
>>>
>>> and then boot the server off the local disks, the server boots correctly.
>>>
>>> The odd thing was, I rebuilt a debug 7.3-RELEASE amd64 kernel on
>>> another working server, installed it on the broken server, and
>>> booted it off the local disks without any changes to the hints file.
>>> The server booted correctly and I was able to manually break out
>>> into the debugger, but nothing looked wrong.
>>
>> The sio(4) driver has been deprecated in RELENG_8, which uses uart(4).
>> uart(4) is better in a lot of regards, and should also be available for
>> use on RELENG_7 but you'll need to adjust /etc/ttys to refer to the new
>> device names (ttyuX vs. ttydX), plus add the uart entries to
>> /boot/device.hints.
>>
> I found that too, and I was thinking about the change, but it's going to
> require a source build of the kernel to fix that, along with a bunch of
> manual work on my side that I would rather not do.
>
>> I'm mentioning this as a workaround.
>>
>> Also worth considering is that the sio(4) ISA probe may be touching
>> something Bad(tm) as a result, so you might try adding the following
>> lines to your loader.conf (not a typo) to disable sio(4) entries
>> entirely:
>>
>> hint.sio.0.disabled="1"
>> hint.sio.1.disabled="1"
>>
>> And see if that improves things.  If it does, remove the sio.1.disabled
>> entry and see if that suffices.
>
> I'll try disabling the hint, but how is that different from removing
> the hint outright?
>
So adding the hint to loader.conf worked. My understanding of how the
loader's Forth bits work makes me believe we can use either file for this
hint. But I am still unsure why the stock hint breaks the box, whereas
no hint works and disabling the port via a hint works. The other thing is
that the port works in its intended way with no hint or with a disabled hint.
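
For reference, a minimal sketch of the entries in question, assuming they
go in /boot/loader.conf and that only the first two sio units need to be
turned off (the values are the ones suggested earlier in this thread):

```
# /boot/loader.conf (path assumed; a sketch, not a verified fix)
# Disable the first two sio(4) units before the kernel probes them.
hint.sio.0.disabled="1"
hint.sio.1.disabled="1"
```

If the box boots with both lines, removing the sio.1 line shows whether
disabling sio0 alone is enough.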

>>
>>> So to sum this up, there is something broken in 7.3-RELEASE but I can't
>>> figure out what. This server works with a generic install of
>>> 7.1-RELEASE, 7.2-RELEASE, 6.1-RELEASE, 6.2-RELEASE and 6.4-RELEASE in
>>> both amd64 and i386, but not 7.3-RELEASE in amd64. It also worked in
>>> 7.4-RC1.
>>>
>>> avg recommended I see what changed from r212964 to r212994. I am
>>> currently looking into this. Has anyone seen this before?
>>
>> If the server works fine with 7.4-PRERELEASE/RC1, why are you caring
>> about 7.3?  Upgrade.  :-)
>>
>
> Can't just upgrade; we did a bunch of work on 7.3-RELEASE and we are
> going to stay on 7.3-RELEASE until 2012 for various reasons.
>
>> --
>> | Jeremy Chadwick                                   jdc@parodius.com |
>> | Parodius Networking                       http://www.parodius.com/ |
>> | UNIX Systems Administrator                  Mountain View, CA, USA |
>> | Making life hard for others since 1977.               PGP 4BD6C0CB |
>>
>>
>
> So, anyone want to take a stab at flying without a device.hints?
>
>
> --
>
> mark saad | nonesuch@longcount.org
>


 I am still looking at what changed in 7.4 that could fix this.
-- 

mark saad | nonesuch@longcount.org


