From: Eric van Gyzen
Date: Mon, 2 May 2016 17:17:49 -0500
Subject: Re: Kernel panic from recent build
To: Bill O'Hanlon, John Baldwin
Cc: freebsd-current@freebsd.org
Message-ID: <5727D20D.9090502@FreeBSD.org>
References: <1616736.1pUkklcWcu@ralph.baldwin.cx>

On 05/02/2016 16:14, Bill O'Hanlon wrote:
> On Mon, May 2, 2016 at 3:55 PM, John Baldwin wrote:
>
>> On Monday, May 02, 2016 01:35:54 PM Bill O'Hanlon wrote:
>>> IMG_20160502_130335.jpg
>>> <https://drive.google.com/file/d/1dtJxTwWXfhXVUUtn1Vvpzh3laJt7AILyCg/view?usp=drive_web>
>>>
>>> I'm getting the following panic from a recent (May 2, 2016) build.
>>> panic: Duplicate local APIC ID 0
>>>
>>> The system is a Dell Precision T5500 with generic factory BIOS settings.
>>> It has run previous builds without event for several years.
>>>
>>> I'm attaching a link to a photo of the screen for added details.
>> Try setting 'hint.srat.0.disabled=1' at the loader prompt and then grab
>> the output of 'acpidump -t' on your next boot.  The SRAT table used by
>> the NUMA code appears to be corrupted by your BIOS.
>>
>> --
>> John Baldwin
>>
>
> That allowed me to boot.  I'm attaching the output of 'acpidump -t'.
> Thanks!

Bill,

Do you have the time and interest to test this patch?  If so, remove the
line that you added to /boot/loader.conf so the patch actually gets
exercised.

Eric

diff --git a/sys/x86/acpica/srat.c b/sys/x86/acpica/srat.c
index 85f1922..1d0f73d 100644
--- a/sys/x86/acpica/srat.c
+++ b/sys/x86/acpica/srat.c
@@ -201,8 +201,12 @@ srat_parse_entry(ACPI_SUBTABLE_HEADER *entry, void *arg)
 			    "enabled" : "disabled");
 		if (!(cpu->Flags & ACPI_SRAT_CPU_ENABLED))
 			break;
-		KASSERT(!cpus[cpu->ApicId].enabled,
-		    ("Duplicate local APIC ID %u", cpu->ApicId));
+		if (cpus[cpu->ApicId].enabled) {
+			printf("SRAT: Duplicate local APIC ID %u\n",
+			    cpu->ApicId);
+			*(int *)arg = ENXIO;
+			break;
+		}
 		cpus[cpu->ApicId].domain = domain;
 		cpus[cpu->ApicId].enabled = 1;
 		break;
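
For anyone who wants to see the idea of the patch outside the kernel, here
is a rough userland sketch.  It is not the code in srat.c: the struct
srat_cpu_entry type, the parse_entry() and parse_table() helpers, and the
MAX_APIC_ID bound are all made up for illustration.  It only shows the
pattern the patch switches to, which is to log the duplicate APIC ID and
hand ENXIO back through the argument pointer instead of asserting.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_APIC_ID 255		/* hypothetical bound for this sketch */

struct cpu_info {
	int domain;
	int enabled;
};

static struct cpu_info cpus[MAX_APIC_ID + 1];

/* Hypothetical stand-in for one SRAT CPU affinity entry. */
struct srat_cpu_entry {
	unsigned char apic_id;	/* 0..255, so always a valid index */
	int domain;
	int enabled;
};

/*
 * Record one entry.  On a duplicate APIC ID, print a message and
 * report ENXIO through *error rather than asserting.
 */
static void
parse_entry(const struct srat_cpu_entry *e, int *error)
{
	if (!e->enabled)
		return;
	if (cpus[e->apic_id].enabled) {
		printf("SRAT: Duplicate local APIC ID %u\n",
		    (unsigned)e->apic_id);
		*error = ENXIO;
		return;
	}
	cpus[e->apic_id].domain = e->domain;
	cpus[e->apic_id].enabled = 1;
}

/* Walk the entries; stop at the first error and return it (0 if none). */
static int
parse_table(const struct srat_cpu_entry *entries, size_t n)
{
	int error = 0;

	for (size_t i = 0; i < n && error == 0; i++)
		parse_entry(&entries[i], &error);
	return (error);
}
```

A caller in this sketch would treat a nonzero return from parse_table() as
"ignore this table" and carry on without the affinity information, rather
than stopping.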