Date: Thu, 13 Jun 2013 20:31:35 -0500
From: Bryce Edwards <bryce@bryce.net>
To: Jeremy Chadwick <jdc@koitsu.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: ACPI Warning, then hang
Message-ID: <CAO_ZHU8yyOPHUQ5Ha66Bb=vVcB%2BJ9m11uN0w13KU=RK=%2B3PqBw@mail.gmail.com>
In-Reply-To: <20130613225001.GA52157@icarus.home.lan>
References: <CAO_ZHU-_J3s8qGKrbwrbckGQN2Dz-J3OUW8thiz6YWxt=2VLMg@mail.gmail.com>
 <20130610143507.GA66619@icarus.home.lan>
 <201306101219.01193.jhb@freebsd.org>
 <CAO_ZHU9_hdo1YaA3dpF4_82Y_hDUiSXpG_z1p0fOdBmiaz1QzQ@mail.gmail.com>
 <CAO_ZHU9VEC9bdYnyk9vdGB-hT9iUzOHH1w=zh72M3YHnR4voBQ@mail.gmail.com>
 <20130611023229.GA78926@icarus.home.lan>
 <CAO_ZHU9_GNpgySBNqLg30rESR%2B=SGu0Vm9f9oL5LRUpdxZ0GeQ@mail.gmail.com>
 <20130613225001.GA52157@icarus.home.lan>

On Thu, Jun 13, 2013 at 5:50 PM, Jeremy Chadwick <jdc@koitsu.org> wrote:
> On Thu, Jun 13, 2013 at 05:32:21PM -0500, Bryce Edwards wrote:
>> On Mon, Jun 10, 2013 at 9:32 PM, Jeremy Chadwick <jdc@koitsu.org> wrote:
>> > On Mon, Jun 10, 2013 at 09:18:47PM -0500, Bryce Edwards wrote:
>> >> Verbose boot:
>> >>
>> >> https://www.dropbox.com/s/obm8rtavro68ea8/acpi-verbose.jpg
>> >>
>> >>
>> >> On Mon, Jun 10, 2013 at 11:27 AM, Bryce Edwards <bryce@bryce.net> wrote:
>> >> > On Mon, Jun 10, 2013 at 11:19 AM, John Baldwin <jhb@freebsd.org> wrote:
>> >> >> On Monday, June 10, 2013 10:35:07 am Jeremy Chadwick wrote:
>> >> >>> On Mon, Jun 10, 2013 at 09:18:14AM -0500, Bryce Edwards wrote:
>> >> >>> > I'm getting the following warning, and then the system locks:
>> >> >>> >
>> >> >>> > ACPI Warning: Incorrect checksum in table [(bunch of spaces)] - 0x29,
>> >> >>> > should be 0x48
>> >> >>> >
>> >> >>> > Here's a pic: http://db.tt/O6dxONzI
>> >> >>> >
>> >> >>> > System is on a SuperMicro C7X58 motherboard that I just upgraded to
>> >> >>> > BIOS 2.0a, which I would like to stay on if possible. I tried
>> >> >>> > adjusting all the ACPI related BIOS settings without success.
>> >> >>>
>> >> >>> The message in question refers to hard-coded data in one of the many
>> >> >>> ACPI tables (see acpidump(8) for the list -- there are many). ACPI
>> >> >>> tables are stored within the BIOS -- the motherboard/BIOS vendor has
>> >> >>> full control over all of them and is fully 100% responsible for their
>> >> >>> content.
>> >> >>>
>> >> >>> It looks to me like they severely botched their BIOS, or somehow it got
>> >> >>> flashed wrong.
>> >> >>>
>> >> >>> You need to contact Supermicro Technical Support and tell them of the
>> >> >>> problem. They need to either fix their BIOS, or help figure out what's
>> >> >>> become corrupted. You can point them to this thread if you'd like.
>> >> >>>
>> >> >>> I should note that the corruption/issue is major enough that you are
>> >> >>> missing very key/important lines from your dmesg (after "avail memory"
>> >> >>> but before "kbdX at kbdmuxX"), which come from pure reliance upon ACPI.
>> >> >>> Lines such as:
>> >> >>>
>> >> >>> Event timer "LAPIC" quality 400
>> >> >>> ACPI APIC Table: <PTLTD APIC >
>> >> >>> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>> >> >>> FreeBSD/SMP: 1 package(s) x 4 core(s)
>> >> >>> cpu0 (BSP): APIC ID: 0
>> >> >>> cpu1 (AP): APIC ID: 1
>> >> >>> cpu2 (AP): APIC ID: 2
>> >> >>> cpu3 (AP): APIC ID: 3
>> >> >>> ioapic0 <Version 2.0> irqs 0-23 on motherboard
>> >> >>> ioapic1 <Version 2.0> irqs 24-47 on motherboard
>> >> >>>
>> >> >>> In the meantime, you can try booting without ACPI support (there should
>> >> >>> be a boot-up menu option for that) and pray that works. If it doesn't,
>> >> >>> then your workaround is to roll back to an older BIOS version and/or put
>> >> >>> pressure on Supermicro. You will find their Technical Support folks are
>> >> >>> quite helpful/responsive to technical issues.
>> >> >>>
>> >> >>> Good luck and keep us posted on what transpires.
>> >> >>
>> >> >> Actually, that message is mostly harmless. All sorts of vendors ship
>> >> >> tables with busted checksums that are in fact fine. :( However, the table
>> >> >> name looks very odd which is more worrying. Booting without ACPI enabled
>> >> >> would be a good first step. Trying a verbose boot to capture the last
>> >> >> message before the hang would also be useful.
>> >> >>
>> >> >> --
>> >> >> John Baldwin
>> >> >
>> >> > Booting without ACPI did not work for me, although I might be able to
>> >> > hack away at lots of BIOS settings to make it work. It didn't assign
>> >> > IRQs to things like the storage controller, etc. so I thought it was
>> >> > probably not worth the effort.
>> >> >
>> >> > I did contact SuperMicro support as well, so we'll see what they have to say.
>> >> >
>> >> > I'll get a verbose boot posted up in a bit.
>> >
>> > A screenshot of a verbose boot is insufficient; as I'm sure you noticed
>> > there are pages upon pages of information before the lock-up/crash.
>> > Those pages are what folks are interested in.
>> >
>> > Because the system is hung, I doubt hitting Scroll Lock + using
>> > PageUp/PageDown to go through the kernel message scrollback will work.
>> >
>> > You're going to need a serial-based console (i.e. hook something up to
>> > COM1 on the motherboard, and get a null modem cable to connect to
>> > another system where you use a serial port/terminal emulator (ex. PuTTY
>> > for Windows, etc.) that has a scrollback buffer which you can copy-paste
>> > or save). Set your serial port for 9600 baud, 8 bits, no parity, and 1
>> > stop bit (9600bps, 8N1). You'll need to have physical access to both
>> > systems simultaneously.
>> >
>> > At the VGA console, boot FreeBSD then escape to the loader prompt
>> > ("ok") and issue the following commands:
>> >
>> > set boot_multicons="YES"
>> > set boot_serial="YES"
>> > set console="comconsole,vidconsole"
>> > boot
>> >
>> > You should begin seeing output on the serial port, and the system will
>> > eventually hang, etc. Then provide the captured output from the serial
>> > port here. :-)
>> >
>> > --
>> > | Jeremy Chadwick jdc@koitsu.org |
>> > | UNIX Systems Administrator http://jdc.koitsu.org/ |
>> > | Making life hard for others since 1977. PGP 4BD6C0CB |
>> >
>> I'm having a heck of a time getting the serial console working...
>
> Come to think of it, depending on "how" they implement the interrupt
> tie-ins for that (even with classic LPC/ISA, re: the whole IRQ 3/4
> thing), that might not even work given the BIOS behaviour seen here.
> I'm grasping at straws at this one though, as there are literally
> hundreds of possibilities why "serial console doesn't work".
>
>> FWIW, I'm getting the following when trying to boot into the most
>> recent snapshot (memstick) from -current:
>>
>> https://dl.dropboxusercontent.com/u/141097/acpi-10-boot.jpg
>
> I believe what's shown there is just an effect of said malformed ACPI
> tables or busted BIOS. Details about the APIC setup (used for mapping a
> device to an interrupt) come from ACPI tables. Reference:
>
> http://en.wikipedia.org/wiki/Intel_APIC_Architecture#Problems
>
> If this is really key/mission critical, rolling back to a previous BIOS
> is really your best choice.
>
> Has Supermicro Technical Support gotten back to you? If so, what have
> they said?
>
> --
> | Jeremy Chadwick jdc@koitsu.org |
> | UNIX Systems Administrator http://jdc.koitsu.org/ |
> | Making life hard for others since 1977. PGP 4BD6C0CB |
>

OK, I'm back up & running... I got the previous BIOS back in place for
now. I'll work with SuperMicro support to see if there's anything to be
done, or at least to feed them details for a future BIOS rev.

If anyone else on this list is running a SuperMicro C7X58 motherboard,
let me know.

Bryce
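
For anyone reading the archive who wants to sanity-check a dumped table
against the "Incorrect checksum in table" warning quoted earlier in the
thread: as far as I understand it, an ACPI table carries a one-byte
checksum chosen so that all bytes of the table sum to zero modulo 256.
The following is only a minimal Python sketch under that assumption. It
assumes the standard 36-byte table header (32-bit little-endian length
at offset 4, checksum byte at offset 9). The function names and the demo
table are hypothetical, made up for illustration; they are not part of
acpidump(8) or any FreeBSD tool. My reading that the two values in the
warning are the stored byte and the byte that would make the sum come
out to zero is also an assumption.

```python
import struct
from typing import Tuple

ACPI_HEADER_LEN = 36   # size of the common ACPI table header (assumed layout)
LENGTH_OFFSET = 4      # 32-bit little-endian table length
CHECKSUM_OFFSET = 9    # single checksum byte


def acpi_checksum(table: bytes) -> Tuple[int, int]:
    """Return (stored, expected) checksum bytes for one raw ACPI table.

    'expected' is the value the checksum byte would need so that every
    byte of the table sums to zero modulo 256.
    """
    if len(table) < ACPI_HEADER_LEN:
        raise ValueError("buffer is shorter than an ACPI table header")
    (length,) = struct.unpack_from("<I", table, LENGTH_OFFSET)
    if length < ACPI_HEADER_LEN or length > len(table):
        raise ValueError("length field does not fit the buffer")
    body = table[:length]
    stored = body[CHECKSUM_OFFSET]
    total = sum(body) & 0xFF            # 0 when the table is consistent
    expected = (stored - total) & 0xFF  # byte that would zero the sum
    return stored, expected


def build_demo_table(signature: bytes = b"DEMO",
                     payload: bytes = b"\x01\x02\x03") -> bytes:
    """Build a made-up table with a valid checksum (illustration only)."""
    length = ACPI_HEADER_LEN + len(payload)
    # signature(4) + length(4) + revision(1) + checksum(1) + 26 other bytes
    header = signature + struct.pack("<I", length) + bytes([1, 0]) + bytes(26)
    table = bytearray(header + payload)
    table[CHECKSUM_OFFSET] = (-sum(table)) & 0xFF
    return bytes(table)


if __name__ == "__main__":
    good = build_demo_table()
    stored, expected = acpi_checksum(good)
    print("stored 0x%02X, expected 0x%02X" % (stored, expected))

    # Flip the checksum byte to imitate a table with a bad checksum.
    bad = bytearray(good)
    bad[CHECKSUM_OFFSET] ^= 0xFF
    stored, expected = acpi_checksum(bytes(bad))
    print("stored 0x%02X, expected 0x%02X" % (stored, expected))
```

A mismatch between the two values only says the table bytes and the
checksum byte disagree; as John noted above, that alone is often
harmless, and the odd table name is the more worrying part.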