Date: Thu, 13 Jun 2013 15:50:02 -0700
From: Jeremy Chadwick
To: Bryce Edwards
Cc: freebsd-stable@freebsd.org
Subject: Re: ACPI Warning, then hang
Message-ID: <20130613225001.GA52157@icarus.home.lan>
References: <20130610143507.GA66619@icarus.home.lan> <201306101219.01193.jhb@freebsd.org> <20130611023229.GA78926@icarus.home.lan>
List-Id: Production branch of FreeBSD source code

On Thu, Jun 13, 2013 at 05:32:21PM -0500, Bryce Edwards wrote:
> On Mon, Jun 10, 2013 at 9:32 PM, Jeremy Chadwick wrote:
> > On Mon, Jun 10, 2013 at 09:18:47PM -0500, Bryce Edwards wrote:
> >> Verbose boot:
> >>
> >> https://www.dropbox.com/s/obm8rtavro68ea8/acpi-verbose.jpg
> >>
> >>
> >> On Mon, Jun 10, 2013 at 11:27 AM, Bryce Edwards wrote:
> >> > On Mon, Jun 10, 2013 at 11:19 AM, John Baldwin wrote:
> >> >> On Monday, June 10, 2013 10:35:07 am Jeremy Chadwick wrote:
> >> >>> On Mon, Jun 10, 2013 at 09:18:14AM -0500, Bryce Edwards wrote:
> >> >>> > I'm getting the following warning, and then the system locks:
> >> >>> >
> >> >>> > ACPI Warning: Incorrect checksum in table [(bunch of spaces)] - 0x29,
> >> >>> > should be 0x48
> >> >>> >
> >> >>> > Here's a pic: http://db.tt/O6dxONzI
> >> >>> >
> >> >>> > System is on a SuperMicro C7X58 motherboard that I just upgraded to
> >> >>> > BIOS 2.0a, which I would like to stay on if possible. I tried
> >> >>> > adjusting all the ACPI related BIOS settings without success.
> >> >>>
> >> >>> The message in question refers to hard-coded data in one of the many
> >> >>> ACPI tables (see acpidump(8) for the list -- there are many). ACPI
> >> >>> tables are stored within the BIOS -- the motherboard/BIOS vendor has
> >> >>> full control over all of them and is fully 100% responsible for their
> >> >>> content.
> >> >>>
> >> >>> It looks to me like they severely botched their BIOS, or somehow it got
> >> >>> flashed wrong.
> >> >>>
> >> >>> You need to contact Supermicro Technical Support and tell them of the
> >> >>> problem. They need to either fix their BIOS, or help figure out what's
> >> >>> become corrupted. You can point them to this thread if you'd like.
> >> >>>
> >> >>> I should note that the corruption/issue is major enough that you are
> >> >>> missing very key/important lines from your dmesg (after "avail memory"
> >> >>> but before "kbdX at kbdmuxX"), which come from pure reliance upon ACPI.
> >> >>> Lines such as:
> >> >>>
> >> >>> Event timer "LAPIC" quality 400
> >> >>> ACPI APIC Table:
> >> >>> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> >> >>> FreeBSD/SMP: 1 package(s) x 4 core(s)
> >> >>> cpu0 (BSP): APIC ID: 0
> >> >>> cpu1 (AP): APIC ID: 1
> >> >>> cpu2 (AP): APIC ID: 2
> >> >>> cpu3 (AP): APIC ID: 3
> >> >>> ioapic0 irqs 0-23 on motherboard
> >> >>> ioapic1 irqs 24-47 on motherboard
> >> >>>
> >> >>> In the meantime, you can try booting without ACPI support (there should
> >> >>> be a boot-up menu option for that) and pray that works. If it doesn't,
> >> >>> then your workaround is to roll back to an older BIOS version and/or put
> >> >>> pressure on Supermicro. You will find their Technical Support folks are
> >> >>> quite helpful/responsive to technical issues.
> >> >>>
> >> >>> Good luck and keep us posted on what transpires.
> >> >>
> >> >> Actually, that message is mostly harmless. All sorts of vendors ship
> >> >> tables with busted checksums that are in fact fine. :( However, the table
> >> >> name looks very odd which is more worrying. Booting without ACPI enabled
> >> >> would be a good first step. Trying a verbose boot to capture the last
> >> >> message before the hang would also be useful.
> >> >>
> >> >> --
> >> >> John Baldwin
> >> >
> >> > Booting without ACPI did not work for me, although I might be able to
> >> > hack away at lots of BIOS settings to make it work. It didn't assign
> >> > IRQ's to things like the storage controller, etc. so I thought it was
> >> > probably not worth the effort.
> >> >
> >> > I did contact SuperMicro support as well, so we'll see what they have to say.
> >> >
> >> > I'll get a verbose boot posted up in a bit.
> >
> > A screenshot of a verbose boot is insufficient; as I'm sure you noticed
> > there are pages upon pages of information before the lock-up/crash.
> > Those pages are what folks are interested in.
> >
> > Because the system is hung, I doubt hitting Scroll Lock + using
> > PageUp/PageDown to go through the kernel message scrollback will work.
> >
> > You're going to need a serial-based console (i.e. hook something up to
> > COM1 on the motherboard, and get a null modem cable to connect to
> > another system where you use a serial port/terminal emulator (ex. PuTTY
> > for Windows, etc.) that has a scrollback buffer which you can copy-paste
> > or save). Set your serial port for 9600 baud, 8 bits, no parity, and 1
> > stop bit (9600bps, 8N1). You'll need to have physical access to both
> > systems simultaneously.
> >
> > At the VGA console, boot FreeBSD then escape to the loader prompt
> > ("ok") and issue the following commands:
> >
> > set boot_multicons="YES"
> > set boot_serial="YES"
> > set console="comconsole,vidconsole"
> > boot
> >
> > You should begin seeing output on the serial port, and the system will
> > eventually hang/etc. Then provide the captured output from the serial
> > port here. :-)
> >
> > --
> > | Jeremy Chadwick                                   jdc@koitsu.org |
> > | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> > | Making life hard for others since 1977.             PGP 4BD6C0CB |
> >
>
> I'm having a heck of a time getting the serial console working...

Come to think of it, depending on how the board vendor implements the
interrupt tie-ins for the serial port (even with classic LPC/ISA, re:
the whole IRQ 3/4 thing), the serial console might not work at all
given the BIOS behaviour seen here. I'm grasping at straws on this one
though, as there are literally hundreds of possible reasons why a
"serial console doesn't work".

> FWIW, I'm getting the following when trying to boot into the most
> recent snapshot (memstick) from -current:
>
> https://dl.dropboxusercontent.com/u/141097/acpi-10-boot.jpg

I believe what's shown there is just an effect of said malformed ACPI
tables or busted BIOS. Details about the APIC setup (used for mapping a
device to an interrupt) come from ACPI tables. Reference:

http://en.wikipedia.org/wiki/Intel_APIC_Architecture#Problems

If this system is mission critical, rolling back to a previous BIOS is
really your best choice.

Has Supermicro Technical Support gotten back to you? If so, what have
they said?

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
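
P.S. For anyone wondering what the "Incorrect checksum in table" warning
quoted earlier in the thread is actually checking: an ACPI table is
treated as valid when all of its bytes, including the checksum byte in
the header, sum to zero modulo 256. Below is a minimal Python sketch of
that rule, not the kernel's actual code. The function names
(acpi_checksum_report, make_table) are made up for illustration, and
the reading of the "should be" value as "the checksum byte that would
make the sum zero" is my assumption about how the warning is worded.

```python
import struct

ACPI_HEADER_LEN = 36   # size of the standard ACPI table header
CHECKSUM_OFFSET = 9    # signature(4) + length(4) + revision(1)


def acpi_checksum_report(table: bytes):
    """Return (signature, stored, expected, ok) for one raw ACPI table.

    'stored' is the checksum byte found in the header, 'expected' is the
    value that byte would need for the table to sum to zero mod 256.
    """
    if len(table) < ACPI_HEADER_LEN:
        raise ValueError("buffer is shorter than an ACPI table header")
    signature = table[0:4]
    (length,) = struct.unpack_from("<I", table, 4)  # little-endian length
    if length < ACPI_HEADER_LEN or length > len(table):
        raise ValueError("table length field is out of range")
    total = sum(table[:length]) & 0xFF
    stored = table[CHECKSUM_OFFSET]
    expected = (stored - total) & 0xFF
    return signature, stored, expected, total == 0


def make_table(signature: bytes, payload: bytes) -> bytes:
    """Build a synthetic table with a correct checksum (test data only)."""
    header = (
        signature
        + struct.pack("<I", ACPI_HEADER_LEN + len(payload))
        + bytes([1, 0])                  # revision, checksum placeholder
        + b"OEMID "                      # 6-byte OEM ID (made up)
        + b"OEMTABLE"                    # 8-byte OEM table ID (made up)
        + struct.pack("<III", 1, 0, 1)   # OEM rev, creator ID, creator rev
    )
    raw = bytearray(header + payload)
    raw[CHECKSUM_OFFSET] = (-sum(raw)) & 0xFF
    return bytes(raw)


if __name__ == "__main__":
    good = make_table(b"APIC", b"\x01\x02\x03")
    print(acpi_checksum_report(good))
    bad = bytearray(good)
    bad[CHECKSUM_OFFSET] ^= 0x55         # simulate a botched checksum byte
    print(acpi_checksum_report(bytes(bad)))
```

Feeding it a real table would need the raw table bytes from the machine
in question, which is outside what this sketch covers. As John noted, a
bad checksum on its own is often harmless; the odd table name is the
more worrying part.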