From owner-freebsd-bugs@FreeBSD.ORG Sun Nov 14 20:20:09 2010
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 33647106566C
    for ; Sun, 14 Nov 2010 20:20:09 +0000 (UTC)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
    by mx1.freebsd.org (Postfix) with ESMTP id E9A778FC19
    for ; Sun, 14 Nov 2010 20:20:08 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oAEKK8DK027505
    for ; Sun, 14 Nov 2010 20:20:08 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oAEKK83C027504;
    Sun, 14 Nov 2010 20:20:08 GMT
    (envelope-from gnats)
Resent-Date: Sun, 14 Nov 2010 20:20:08 GMT
Resent-Message-Id: <201011142020.oAEKK83C027504@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Loic Pefferkorn
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 5EA92106564A
    for ; Sun, 14 Nov 2010 20:11:19 +0000 (UTC)
    (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
    by mx1.freebsd.org (Postfix) with ESMTP id 4CFCB8FC18
    for ; Sun, 14 Nov 2010 20:11:19 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
    by www.freebsd.org (8.14.3/8.14.3) with ESMTP id oAEKBIIh018833
    for ; Sun, 14 Nov 2010 20:11:18 GMT
    (envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
    by www.freebsd.org (8.14.3/8.14.3/Submit) id oAEKBIAH018826;
    Sun, 14 Nov 2010 20:11:18 GMT
    (envelope-from nobody)
Message-Id: <201011142011.oAEKBIAH018826@www.freebsd.org>
Date: Sun, 14 Nov 2010 20:11:18 GMT
From: Loic Pefferkorn
To:
    freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Cc:
Subject: kern/152250: [patch] Kernel panic when hw.ciss.expose_hidden_physical is set
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 14 Nov 2010 20:20:09 -0000

>Number:         152250
>Category:       kern
>Synopsis:       [patch] Kernel panic when hw.ciss.expose_hidden_physical is set
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Nov 14 20:20:08 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Loic Pefferkorn
>Release:        7.2-RELEASE
>Organization:
>Environment:
FreeBSD squeak.estat 7.2-STABLE FreeBSD 7.2-STABLE #5: Sun Nov 14 20:35:21 CET 2010 root@squeak.estat:/usr/obj/usr/src/sys/GENERIC amd64
>Description:
HP ProLiant DL360 G6 server with an HP StorageWorks MSL4048 Tape Library

# grep ciss /boot/loader.conf
hw.ciss.expose_hidden_physical=1

When the tunable hw.ciss.expose_hidden_physical is set at boot time, I have a kernel panic:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x8
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0xffffffff80201686
stack pointer           = 0x10:0xffffff807c6ab930
frame pointer           = 0x10:0x400
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 77 (sysctl)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 6s
Physical memory: 4073 MB
Dumping 1230 MB:

Backtrace from the core dump:

(kgdb) bt
#0  doadump () at pcpu.h:195
#1  0x0000000000000004 in ?? ()
#2  0xffffffff8054cff9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#3  0xffffffff8054d402 in panic (fmt=0x104
) at /usr/src/sys/kern/kern_shutdown.c:574
#4  0xffffffff80812563 in trap_fatal (frame=0xffffff0003eb4390, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:756
#5  0xffffffff80812935 in trap_pfault (frame=0xffffff807c6ab880, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:672
#6  0xffffffff80813274 in trap (frame=0xffffff807c6ab880) at /usr/src/sys/amd64/amd64/trap.c:443
#7  0xffffffff807fd2ce in calltrap () at /usr/src/sys/amd64/amd64/exception.S:218
#8  0xffffffff80201686 in acpi_child_pnpinfo_str_method (cbdev=Variable "cbdev" is not available.
) at /usr/src/sys/dev/acpica/acpi.c:850
#9  0xffffffff805753c9 in device_sysctl_handler (oidp=Variable "oidp" is not available.
) at /usr/src/sys/kern/subr_bus.c:260
#10 0xffffffff8055654f in sysctl_root (oidp=Variable "oidp" is not available.
) at /usr/src/sys/kern/kern_sysctl.c:1419
#11 0xffffffff805578c5 in userland_sysctl (td=0x0, name=0xffffff807c6abac0, namelen=4, old=0x0, oldlenp=Variable "oldlenp" is not available.
) at /usr/src/sys/kern/kern_sysctl.c:1522
#12 0xffffffff80557ad2 in __sysctl (td=0xffffff0003eb4390, uap=0xffffff807c6abbf0) at /usr/src/sys/kern/kern_sysctl.c:1449
#13 0xffffffff80812bb7 in syscall (frame=0xffffff807c6abc80) at /usr/src/sys/amd64/amd64/trap.c:899
#14 0xffffffff807fd4db in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:339
#15 0x0000000800719cac in ?? ()
Previous frame inner to this frame (corrupt stack?)

Faulty instruction:

(kgdb) x/i 0xffffffff80201686
0xffffffff80201686 :    mov    0x8(%rbx),%edx
>How-To-Repeat:
With the same hardware, put hw.ciss.expose_hidden_physical=1 in loader.conf and reboot.
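
The trap output and the faulty instruction fit together: the load is `mov 0x8(%rbx),%edx` and the fault virtual address is 0x8, so %rbx must have held 0 when the kernel read a 32-bit field 8 bytes into a structure. The sketch below only illustrates that arithmetic. `struct demo_device_info` and `demo_valid_address` are hypothetical names made up for this example; the layout is an assumption, not the real ACPI_DEVICE_INFO definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a device-info structure. The field names and
 * layout are assumptions for illustration only, not the ACPICA definition.
 */
struct demo_device_info {
    uint32_t type;   /* offset 0 */
    uint32_t name;   /* offset 4 */
    uint32_t valid;  /* offset 8: a 32-bit flags word */
};

/* Compile-time check that the flags word sits 8 bytes into the structure. */
static_assert(offsetof(struct demo_device_info, valid) == 8,
    "valid is expected at offset 8 in this sketch");

/*
 * Address the CPU touches when reading ->valid through a structure located
 * at `base`. Nothing is dereferenced here; only the address is computed.
 */
uintptr_t
demo_valid_address(uintptr_t base)
{
    return base + offsetof(struct demo_device_info, valid);
}
```

With a base of 0, the computed address is 0x8, which is the pattern a NULL structure pointer leaves in a page-fault report.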
>Fix:
The last function called is acpi_child_pnpinfo_str_method, in sys/dev/acpica/acpi.c:

    static int
    acpi_child_pnpinfo_str_method(device_t cbdev, device_t child, char *buf,
        size_t buflen)
    {
        ACPI_BUFFER adbuf = {ACPI_ALLOCATE_BUFFER, NULL};
        ACPI_DEVICE_INFO *adinfo;
        struct acpi_device *dinfo = device_get_ivars(child);
        char *end;
        int error;

        error = AcpiGetObjectInfo(dinfo->ad_handle, &adbuf);
        adinfo = (ACPI_DEVICE_INFO *) adbuf.Pointer;
        if (error)
            snprintf(buf, buflen, "unknown");
        else
            snprintf(buf, buflen, "_HID=%s _UID=%lu",
                (adinfo->Valid & ACPI_VALID_HID) ?
                adinfo->HardwareId.Value : "none",
                (adinfo->Valid & ACPI_VALID_UID) ?
                strtoul(adinfo->UniqueId.Value, &end, 10) : 0);

        if (adinfo)
            AcpiOsFree(adinfo);

        return (0);
    }

buf is filled in according to the value of "error". I found adbuf.Pointer set to 0x0 while "error" was zero, so the references to the adinfo struct in the second snprintf use 0x0 as their base address.

The "error" value is therefore not set correctly. Let's see why in AcpiGetObjectInfo, in sys/contrib/dev/acpica/nsxfname.c:

        Node = AcpiNsMapHandleToNode (Handle);
        if (!Node)
        {
            (void) AcpiUtReleaseMutex (ACPI_MTX_NAMESPACE);
            goto Cleanup;
        }
        (...)
    Cleanup:
        ACPI_FREE (Info);
        if (CidList)
        {
            ACPI_FREE (CidList);
        }
        return (Status);

If AcpiNsMapHandleToNode fails, we release the mutex and jump to Cleanup:, which returns without updating Status. Status therefore still holds the value returned by the earlier call to AcpiUtAcquireMutex, which is wrong for this failure path.

Setting Status to AE_BAD_PARAMETER before jumping to Cleanup fixes the issue (I found that AE_BAD_PARAMETER is used elsewhere in the kernel in similar flows when AcpiNsMapHandleToNode is called).

7.0 to 7.3 are affected; a patch is attached.
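
To make the control flow above easier to follow, here is a minimal, self-contained model of it. This is a sketch under assumptions, not the ACPICA code: every `demo_` name, the status codes, and the `apply_fix` switch are hypothetical, and a NULL handle stands in for a failed AcpiNsMapHandleToNode lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status codes; the names only echo the ACPICA ones. */
enum demo_status { DEMO_OK = 0, DEMO_BAD_PARAMETER = 1, DEMO_NO_MEMORY = 2 };

/* Hypothetical stand-in for ACPI_BUFFER: the callee fills in `pointer`. */
struct demo_buffer {
    void *pointer;
};

/* Stand-in for taking the namespace mutex; it always succeeds here. */
int
demo_acquire_lock(void)
{
    return DEMO_OK;
}

/*
 * Model of the flow described above. A NULL handle plays the role of a
 * failed handle-to-node lookup. With apply_fix == 0 the failure path jumps
 * to cleanup without touching `status`, as in the unpatched code.
 */
int
demo_get_object_info(const int *handle, struct demo_buffer *out, int apply_fix)
{
    int *info = NULL;
    int status;

    assert(out != NULL);

    status = demo_acquire_lock();
    if (status != DEMO_OK)
        return status;

    if (handle == NULL) {
        if (apply_fix)
            status = DEMO_BAD_PARAMETER;
        /* Without the fix, status still holds DEMO_OK from the lock call. */
        goto cleanup;
    }

    info = malloc(sizeof(*info));
    if (info == NULL) {
        status = DEMO_NO_MEMORY;
        goto cleanup;
    }
    *info = *handle;
    out->pointer = info;    /* ownership passes to the caller */
    return DEMO_OK;

cleanup:
    free(info);             /* free(NULL) is a no-op */
    return status;
}

/*
 * Caller-side view: returns 1 when the callee reported success but left
 * the buffer pointer NULL, which is the state that led to the panic.
 */
int
demo_success_without_data(int apply_fix)
{
    struct demo_buffer buf = { NULL };
    int status = demo_get_object_info(NULL, &buf, apply_fix);

    return status == DEMO_OK && buf.pointer == NULL;
}
```

Calling `demo_success_without_data(0)` models the unpatched path, where the caller sees a success status together with a NULL pointer; `demo_success_without_data(1)` models the patched path, where the caller gets an error status and never touches the pointer.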
Hope I'm right :)

Patch attached with submission follows:

--- src/sys/contrib/dev/acpica/nsxfname.c.orig	2010-11-14 20:51:57.000000000 +0100
+++ src/sys/contrib/dev/acpica/nsxfname.c	2010-11-14 20:50:46.000000000 +0100
@@ -361,6 +361,7 @@
     if (!Node)
     {
         (void) AcpiUtReleaseMutex (ACPI_MTX_NAMESPACE);
+        Status = AE_BAD_PARAMETER;
         goto Cleanup;
     }
 
>Release-Note:
>Audit-Trail:
>Unformatted: