Date:      Thu, 13 Jun 2013 20:31:35 -0500
From:      Bryce Edwards <bryce@bryce.net>
To:        Jeremy Chadwick <jdc@koitsu.org>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: ACPI Warning, then hang
Message-ID:  <CAO_ZHU8yyOPHUQ5Ha66Bb=vVcB%2BJ9m11uN0w13KU=RK=%2B3PqBw@mail.gmail.com>
In-Reply-To: <20130613225001.GA52157@icarus.home.lan>
References:  <CAO_ZHU-_J3s8qGKrbwrbckGQN2Dz-J3OUW8thiz6YWxt=2VLMg@mail.gmail.com> <20130610143507.GA66619@icarus.home.lan> <201306101219.01193.jhb@freebsd.org> <CAO_ZHU9_hdo1YaA3dpF4_82Y_hDUiSXpG_z1p0fOdBmiaz1QzQ@mail.gmail.com> <CAO_ZHU9VEC9bdYnyk9vdGB-hT9iUzOHH1w=zh72M3YHnR4voBQ@mail.gmail.com> <20130611023229.GA78926@icarus.home.lan> <CAO_ZHU9_GNpgySBNqLg30rESR%2B=SGu0Vm9f9oL5LRUpdxZ0GeQ@mail.gmail.com> <20130613225001.GA52157@icarus.home.lan>

On Thu, Jun 13, 2013 at 5:50 PM, Jeremy Chadwick <jdc@koitsu.org> wrote:
> On Thu, Jun 13, 2013 at 05:32:21PM -0500, Bryce Edwards wrote:
>> On Mon, Jun 10, 2013 at 9:32 PM, Jeremy Chadwick <jdc@koitsu.org> wrote:
>> > On Mon, Jun 10, 2013 at 09:18:47PM -0500, Bryce Edwards wrote:
>> >> Verbose boot:
>> >>
>> >> https://www.dropbox.com/s/obm8rtavro68ea8/acpi-verbose.jpg
>> >>
>> >>
>> >> On Mon, Jun 10, 2013 at 11:27 AM, Bryce Edwards <bryce@bryce.net> wrote:
>> >> > On Mon, Jun 10, 2013 at 11:19 AM, John Baldwin <jhb@freebsd.org> wrote:
>> >> >> On Monday, June 10, 2013 10:35:07 am Jeremy Chadwick wrote:
>> >> >>> On Mon, Jun 10, 2013 at 09:18:14AM -0500, Bryce Edwards wrote:
>> >> >>> > I'm getting the following warning, and then the system locks:
>> >> >>> >
>> >> >>> > ACPI Warning: Incorrect checksum in table [(bunch of spaces)] - 0x29,
>> >> >>> > should be 0x48
>> >> >>> >
>> >> >>> > Here's a pic: http://db.tt/O6dxONzI
>> >> >>> >
>> >> >>> > System is on a SuperMicro C7X58 motherboard that I just upgraded to
>> >> >>> > BIOS 2.0a, which I would like to stay on if possible.  I tried
>> >> >>> > adjusting all the ACPI related BIOS settings without success.
>> >> >>>
>> >> >>> The message in question refers to hard-coded data in one of the many
>> >> >>> ACPI tables (see acpidump(8) for the list -- there are many).  ACPI
>> >> >>> tables are stored within the BIOS -- the motherboard/BIOS vendor has
>> >> >>> full control over all of them and is fully 100% responsible for their
>> >> >>> content.
>> >> >>>
>> >> >>> It looks to me like they severely botched their BIOS, or somehow it got
>> >> >>> flashed wrong.
>> >> >>>
>> >> >>> You need to contact Supermicro Technical Support and tell them of the
>> >> >>> problem.  They need to either fix their BIOS, or help figure out what's
>> >> >>> become corrupted.  You can point them to this thread if you'd like.
>> >> >>>
>> >> >>> I should note that the corruption/issue is major enough that you are
>> >> >>> missing very key/important lines from your dmesg (after "avail memory"
>> >> >>> but before "kbdX at kbdmuxX"), which come from pure reliance upon ACPI.
>> >> >>> Lines such as:
>> >> >>>
>> >> >>> Event timer "LAPIC" quality 400
>> >> >>> ACPI APIC Table: <PTLTD        APIC  >
>> >> >>> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>> >> >>> FreeBSD/SMP: 1 package(s) x 4 core(s)
>> >> >>>  cpu0 (BSP): APIC ID:  0
>> >> >>>  cpu1 (AP): APIC ID:  1
>> >> >>>  cpu2 (AP): APIC ID:  2
>> >> >>>  cpu3 (AP): APIC ID:  3
>> >> >>> ioapic0 <Version 2.0> irqs 0-23 on motherboard
>> >> >>> ioapic1 <Version 2.0> irqs 24-47 on motherboard
>> >> >>>
>> >> >>> In the meantime, you can try booting without ACPI support (there should
>> >> >>> be a boot-up menu option for that) and pray that works.  If it doesn't,
>> >> >>> then your workaround is to roll back to an older BIOS version and/or put
>> >> >>> pressure on Supermicro.  You will find their Technical Support folks are
>> >> >>> quite helpful/responsive to technical issues.
>> >> >>>
>> >> >>> Good luck and keep us posted on what transpires.
>> >> >>
>> >> >> Actually, that message is mostly harmless.  All sorts of vendors ship
>> >> >> tables with busted checksums that are in fact fine. :(  However, the table
>> >> >> name looks very odd which is more worrying.  Booting without ACPI enabled
>> >> >> would be a good first step.  Trying a verbose boot to capture the last
>> >> >> message before the hang would also be useful.
>> >> >>
>> >> >> --
>> >> >> John Baldwin
>> >> >
>> >> > Booting without ACPI did not work for me, although I might be able to
>> >> > hack away at lots of BIOS settings to make it work.  It didn't assign
>> >> > IRQs to things like the storage controller, etc., so I thought it was
>> >> > probably not worth the effort.
>> >> >
>> >> > I did contact SuperMicro support as well, so we'll see what they have to say.
>> >> >
>> >> > I'll get a verbose boot posted up in a bit.
>> >
>> > A screenshot of a verbose boot is insufficient; as I'm sure you noticed
>> > there are pages upon pages of information before the lock-up/crash.
>> > Those pages are what folks are interested in.
>> >
>> > Because the system is hung, I doubt hitting Scroll Lock + using
>> > PageUp/PageDown to go through the kernel message scrollback will work.
>> >
>> > You're going to need a serial-based console, i.e. hook something up to
>> > COM1 on the motherboard, and get a null modem cable to connect to
>> > another system where you use a serial port/terminal emulator (ex. PuTTY
>> > for Windows, etc.) that has a scrollback buffer which you can copy-paste
>> > or save.  Set your serial port for 9600 baud, 8 bits, no parity, and 1
>> > stop bit (9600bps, 8N1).  You'll need to have physical access to both
>> > systems simultaneously.
>> >
>> > At the VGA console, boot FreeBSD then escape to the loader prompt
>> > ("ok") and issue the following commands:
>> >
>> > set boot_multicons="YES"
>> > set boot_serial="YES"
>> > set console="comconsole,vidconsole"
>> > boot
>> >
>> > You should begin seeing output on the serial port, and the system will
>> > eventually hang/etc.  Then provide the captured output from the serial
>> > port here.  :-)
>> >
>> > --
>> > | Jeremy Chadwick                                   jdc@koitsu.org |
>> > | UNIX Systems Administrator                http://jdc.koitsu.org/ |
>> > | Making life hard for others since 1977.             PGP 4BD6C0CB |
>> >
>>
>> I'm having a heck of a time getting the serial console working...
>
> Come to think of it, depending on "how" they implement the interrupt
> tie-ins for that (even with classic LPC/ISA, re: the whole IRQ 3/4
> thing), that might not even work given the BIOS behaviour seen here.
> I'm grasping at straws on this one though, as there are literally
> hundreds of possibilities why "serial console doesn't work".
>
>> FWIW, I'm getting the following when trying to boot into the most
>> recent snapshot (memstick) from -current:
>>
>> https://dl.dropboxusercontent.com/u/141097/acpi-10-boot.jpg
>
> I believe what's shown there is just an effect of said malformed ACPI
> tables or busted BIOS.  Details about the APIC setup (used for mapping a
> device to an interrupt) come from ACPI tables.  Reference:
>
> http://en.wikipedia.org/wiki/Intel_APIC_Architecture#Problems
>
> If this is really key/mission critical, rolling back to a previous BIOS
> is really your best choice.
>
> Has Supermicro Technical Support gotten back to you?  If so, what have
> they said?
>
> --
> | Jeremy Chadwick                                   jdc@koitsu.org |
> | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> | Making life hard for others since 1977.             PGP 4BD6C0CB |
>

OK, I'm back up & running...  I got the previous BIOS back in place for now.

I'll work with SuperMicro support to see if there's anything to be
done, or at least to feed them details for a future BIOS rev.

If anyone else on this list is running a SuperMicro C7X58 motherboard,
let me know.
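
For anyone else who hits the same "Incorrect checksum in table" warning,
here is a rough sketch of what the check amounts to.  An ACPI table is
considered valid when all of its bytes, including the one-byte Checksum
field at offset 9 of the standard 36-byte header, add up to zero modulo
256.  The Python below is only an illustration under that assumption: the
function names and the synthetic "DEMO" table are made up for the example,
and it is not how the kernel or ACPICA actually implements the check.

```python
import struct

# Byte offset of the Checksum field in the standard ACPI table header
# (4-byte Signature, 4-byte Length, 1-byte Revision, then Checksum).
CHECKSUM_OFFSET = 9


def byte_sum(table: bytes) -> int:
    """Sum of every byte in the table, modulo 256.  A valid table gives 0."""
    return sum(table) % 256


def expected_checksum(table: bytes) -> int:
    """Value the Checksum byte would need so the whole table sums to 0."""
    others = (sum(table) - table[CHECKSUM_OFFSET]) % 256
    return (-others) % 256


def make_demo_table(stored_checksum: int) -> bytes:
    """Build a synthetic table: 36-byte header plus 4 payload bytes.

    Every field value here is hypothetical and exists only for the demo.
    """
    payload = b"\x01\x02\x03\x04"
    header = struct.pack(
        "<4sIBB6s8sI4sI",
        b"DEMO",            # Signature
        36 + len(payload),  # Length of the whole table
        1,                  # Revision
        stored_checksum,    # Checksum
        b"OEMID ",          # OEM ID
        b"OEMTABLE",        # OEM Table ID
        1,                  # OEM Revision
        b"CRTR",            # Creator ID
        1,                  # Creator Revision
    )
    return header + payload


if __name__ == "__main__":
    bad = make_demo_table(0x00)
    want = expected_checksum(bad)
    print(f"stored 0x{bad[CHECKSUM_OFFSET]:02X}, should be 0x{want:02X}")
    good = make_demo_table(want)
    print("byte sum of corrected table:", byte_sum(good))
```

The same two functions could be pointed at a raw table image read from a
file instead of the synthetic one, which might help when comparing the
tables shipped in two BIOS revisions.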

Bryce


