From: Bryce Edwards
To: Jeremy Chadwick
Cc: freebsd-stable@freebsd.org
Date: Thu, 13 Jun 2013 20:31:35 -0500
Subject: Re: ACPI Warning, then hang
In-Reply-To: <20130613225001.GA52157@icarus.home.lan>
References: <20130610143507.GA66619@icarus.home.lan> <201306101219.01193.jhb@freebsd.org> <20130611023229.GA78926@icarus.home.lan> <20130613225001.GA52157@icarus.home.lan>
List-Id: Production branch of FreeBSD source code

On Thu, Jun 13, 2013 at 5:50 PM, Jeremy Chadwick wrote:
> On Thu, Jun 13, 2013 at 05:32:21PM -0500, Bryce Edwards wrote:
>> On Mon, Jun 10, 2013 at 9:32 PM, Jeremy Chadwick wrote:
>> > On Mon, Jun 10, 2013 at 09:18:47PM -0500, Bryce Edwards wrote:
>> >> Verbose boot:
>> >>
>> >> https://www.dropbox.com/s/obm8rtavro68ea8/acpi-verbose.jpg
>> >>
>> >>
>> >> On Mon, Jun 10, 2013 at 11:27 AM, Bryce Edwards wrote:
>> >> > On Mon, Jun 10, 2013 at 11:19 AM, John Baldwin wrote:
>> >> >> On Monday, June 10, 2013 10:35:07 am Jeremy Chadwick wrote:
>> >> >>> On Mon, Jun 10, 2013 at 09:18:14AM -0500, Bryce Edwards wrote:
>> >> >>> > I'm getting the following warning, and then the system locks:
>> >> >>> >
>> >> >>> > ACPI Warning: Incorrect checksum in table [(bunch of spaces)] - 0x29,
>> >> >>> > should be 0x48
>> >> >>> >
>> >> >>> > Here's a pic: http://db.tt/O6dxONzI
>> >> >>> >
>> >> >>> > System is on a SuperMicro C7X58 motherboard that I just upgraded to
>> >> >>> > BIOS 2.0a, which I would like to stay on if possible. I tried
>> >> >>> > adjusting all the ACPI related BIOS settings without success.
>> >> >>>
>> >> >>> The message in question refers to hard-coded data in one of the many
>> >> >>> ACPI tables (see acpidump(8) for the list -- there are many). ACPI
>> >> >>> tables are stored within the BIOS -- the motherboard/BIOS vendor has
>> >> >>> full control over all of them and is fully 100% responsible for their
>> >> >>> content.
>> >> >>>
>> >> >>> It looks to me like they severely botched their BIOS, or somehow it got
>> >> >>> flashed wrong.
>> >> >>>
>> >> >>> You need to contact Supermicro Technical Support and tell them of the
>> >> >>> problem. They need to either fix their BIOS, or help figure out what's
>> >> >>> become corrupted.
>> >> >>> You can point them to this thread if you'd like.
>> >> >>>
>> >> >>> I should note that the corruption/issue is major enough that you are
>> >> >>> missing very key/important lines from your dmesg (after "avail memory"
>> >> >>> but before "kbdX at kbdmuxX"), which come from pure reliance upon ACPI.
>> >> >>> Lines such as:
>> >> >>>
>> >> >>> Event timer "LAPIC" quality 400
>> >> >>> ACPI APIC Table:
>> >> >>> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>> >> >>> FreeBSD/SMP: 1 package(s) x 4 core(s)
>> >> >>> cpu0 (BSP): APIC ID: 0
>> >> >>> cpu1 (AP): APIC ID: 1
>> >> >>> cpu2 (AP): APIC ID: 2
>> >> >>> cpu3 (AP): APIC ID: 3
>> >> >>> ioapic0 irqs 0-23 on motherboard
>> >> >>> ioapic1 irqs 24-47 on motherboard
>> >> >>>
>> >> >>> In the meantime, you can try booting without ACPI support (there should
>> >> >>> be a boot-up menu option for that) and pray that works. If it doesn't,
>> >> >>> then your workaround is to roll back to an older BIOS version and/or put
>> >> >>> pressure on Supermicro. You will find their Technical Support folks are
>> >> >>> quite helpful/responsive to technical issues.
>> >> >>>
>> >> >>> Good luck and keep us posted on what transpires.
>> >> >>
>> >> >> Actually, that message is mostly harmless. All sorts of vendors ship
>> >> >> tables with busted checksums that are in fact fine. :( However, the table
>> >> >> name looks very odd which is more worrying. Booting without ACPI enabled
>> >> >> would be a good first step. Trying a verbose boot to capture the last
>> >> >> message before the hang would also be useful.
>> >> >>
>> >> >> --
>> >> >> John Baldwin
>> >> >
>> >> > Booting without ACPI did not work for me, although I might be able to
>> >> > hack away at lots of BIOS settings to make it work. It didn't assign
>> >> > IRQs to things like the storage controller, etc., so I thought it was
>> >> > probably not worth the effort.
>> >> >
>> >> > I did contact SuperMicro support as well, so we'll see what they have to say.
>> >> >
>> >> > I'll get a verbose boot posted up in a bit.
>> >
>> > A screenshot of a verbose boot is insufficient; as I'm sure you noticed
>> > there are pages upon pages of information before the lock-up/crash.
>> > Those pages are what folks are interested in.
>> >
>> > Because the system is hung, I doubt hitting Scroll Lock + using
>> > PageUp/PageDown to go through the kernel message scrollback will work.
>> >
>> > You're going to need a serial-based console (i.e. hook something up to
>> > COM1 on the motherboard, and get a null modem cable to connect to
>> > another system where you use a serial port/terminal emulator (ex. PuTTY
>> > for Windows, etc.) that has a scrollback buffer which you can copy-paste
>> > or save). Set your serial port for 9600 baud, 8 bits, no parity, and 1
>> > stop bit (9600bps, 8N1). You'll need to have physical access to both
>> > systems simultaneously.
>> >
>> > At the VGA console, boot FreeBSD then escape to the loader prompt
>> > ("ok") and issue the following commands:
>> >
>> > set boot_multicons="YES"
>> > set boot_serial="YES"
>> > set console="comconsole,vidconsole"
>> > boot
>> >
>> > You should begin seeing output on the serial port, and the system will
>> > eventually hang/etc. Then provide the captured output from the serial
>> > port here. :-)
>> >
>> > --
>> > | Jeremy Chadwick jdc@koitsu.org |
>> > | UNIX Systems Administrator http://jdc.koitsu.org/ |
>> > | Making life hard for others since 1977. PGP 4BD6C0CB |
>> >
>>
>> I'm having a heck of a time getting the serial console working...
>
> Come to think of it, depending on "how" they implement the interrupt
> tie-ins for that (even with classic LPC/ISA, re: the whole IRQ 3/4
> thing), that might not even work given the BIOS behaviour seen here.
> I'm grasping at straws on this one though, as there are literally
> hundreds of possibilities why "serial console doesn't work".
>
>> FWIW, I'm getting the following when trying to boot into the most
>> recent snapshot (memstick) from -current:
>>
>> https://dl.dropboxusercontent.com/u/141097/acpi-10-boot.jpg
>
> I believe what's shown there is just an effect of said malformed ACPI
> tables or busted BIOS. Details about the APIC setup (used for mapping a
> device to an interrupt) come from ACPI tables. Reference:
>
> http://en.wikipedia.org/wiki/Intel_APIC_Architecture#Problems
>
> If this is really key/mission critical, rolling back to a previous BIOS
> is really your best choice.
>
> Has Supermicro Technical Support gotten back to you? If so, what have
> they said?
>
> --
> | Jeremy Chadwick jdc@koitsu.org |
> | UNIX Systems Administrator http://jdc.koitsu.org/ |
> | Making life hard for others since 1977. PGP 4BD6C0CB |
>

OK, I'm back up & running... I got the previous BIOS back in place for now.
I'll work with SuperMicro support to see if there's anything to be done, or
at least to feed them details for a future BIOS rev.

If anyone else on this list is running a SuperMicro C7X58 motherboard, let me know.

Bryce
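
P.S. For anyone reading this thread in the archive, here is a rough sketch of
what the "Incorrect checksum in table" warning is checking. Per the ACPI
specification, every byte of a table, including the checksum byte at offset 9
of the standard 36-byte header, must sum to zero modulo 256. As far as I can
tell from the message format, 0x29 is the value stored in the table and 0x48
is the value that would make the sum come out to zero, but treat that reading
as an assumption. The function names and the sample table below are made up
for illustration; on a real system the input would be a raw table image.

```python
import struct

# Offset of the checksum byte within the standard 36-byte ACPI table header.
CHECKSUM_OFFSET = 9


def acpi_checksum(table: bytes) -> int:
    """Sum of all table bytes modulo 256; 0 means the checksum is valid."""
    return sum(table) & 0xFF


def expected_checksum_byte(table: bytes) -> int:
    """Value the checksum byte must hold for the whole table to sum to zero."""
    sum_without = (sum(table) - table[CHECKSUM_OFFSET]) & 0xFF
    return (-sum_without) & 0xFF


def build_sample_table() -> bytearray:
    """Hypothetical header-only table (36 bytes), checksum left as 0."""
    return bytearray(struct.pack(
        "<4sIBB6s8sI4sI",
        b"TEST",       # signature
        36,            # total table length in bytes
        1,             # revision
        0,             # checksum placeholder
        b"OEMID ",     # OEM ID (6 bytes)
        b"TABLEID ",   # OEM table ID (8 bytes)
        1,             # OEM revision
        b"CRTR",       # creator ID
        1,             # creator revision
    ))


if __name__ == "__main__":
    table = build_sample_table()
    table[CHECKSUM_OFFSET] = expected_checksum_byte(table)
    print("valid table sums to zero:", acpi_checksum(table) == 0)

    # Flip one bit to imitate a corrupted or mis-generated table.
    table[CHECKSUM_OFFSET] ^= 0x01
    print("stored checksum: 0x%02X, should be 0x%02X"
          % (table[CHECKSUM_OFFSET], expected_checksum_byte(table)))
```

This only says whether the bytes add up. As John noted, plenty of tables ship
with a bad checksum and otherwise work, so a mismatch alone does not explain
the hang.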