From owner-freebsd-stable@FreeBSD.ORG Mon Apr 2 23:27:14 2012 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6F20810656A5; Mon, 2 Apr 2012 23:27:14 +0000 (UTC) (envelope-from ambrisko@ambrisko.com) Received: from mail.ambrisko.com (mail.ambrisko.com [70.91.206.90]) by mx1.freebsd.org (Postfix) with ESMTP id 284458FC16; Mon, 2 Apr 2012 23:27:13 +0000 (UTC) X-Ambrisko-Me: Yes Received: from server2.ambrisko.com (HELO internal.ambrisko.com) ([192.168.1.2]) by ironport.ambrisko.com with ESMTP; 02 Apr 2012 16:27:20 -0700 Received: from ambrisko.com (localhost [127.0.0.1]) by internal.ambrisko.com (8.14.4/8.14.4) with ESMTP id q32NRDLg070100; Mon, 2 Apr 2012 16:27:13 -0700 (PDT) (envelope-from ambrisko@ambrisko.com) Received: (from ambrisko@localhost) by ambrisko.com (8.14.4/8.14.4/Submit) id q32NRDG7070099; Mon, 2 Apr 2012 16:27:13 -0700 (PDT) (envelope-from ambrisko) From: Doug Ambrisko Message-Id: <201204022327.q32NRDG7070099@ambrisko.com> In-Reply-To: To: jhb@freebsd.org Date: Mon, 2 Apr 2012 16:27:13 -0700 (PDT) X-Mailer: ELM [version 2.4ME+ PL124d (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="US-ASCII" Cc: freebsd-stable@freebsd.org, John Baldwin Subject: Re: [stable-ish 9] Dell R815 ipmi(4) attach failure X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 23:27:14 -0000 Doug Ambrisko writes: | John Baldwin writes: | | On Saturday, March 31, 2012 3:25:48 pm Doug Ambrisko wrote: | | > Sean Bruno writes: | | > | Noting a failure to attach to the onboard IPMI controller with this dell | | > | R815. Not sure what to start poking at and thought I'd though this over | | > | here for comment. | | > | | | > | -bash-4.2$ dmesg |grep ipmi | | > | ipmi0: KCS mode found at io 0xca8 on acpi | | > | ipmi1: on isa0 | | > | device_attach: ipmi1 attach returned 16 | | > | ipmi1: on isa0 | | > | device_attach: ipmi1 attach returned 16 | | > | ipmi0: Timed out waiting for GET_DEVICE_ID | | > | | > I've run into this recently. A quick hack to fix it is: | | > | | > Index: ipmi.c | | > =================================================================== | | > RCS file: /cvs/src/sys/dev/ipmi/ipmi.c,v | | > retrieving revision 1.14 | | > diff -u -p -r1.14 ipmi.c | | > --- ipmi.c 14 Apr 2011 07:14:22 -0000 1.14 | | > +++ ipmi.c 31 Mar 2012 19:18:35 -0000 | | > @@ -695,7 +695,6 @@ ipmi_startup(void *arg) | | > if (error == EWOULDBLOCK) { | | > device_printf(dev, "Timed out waiting for GET_DEVICE_ID\n"); | | > ipmi_free_request(req); | | > - return; | | > } else if (error) { | | > device_printf(dev, "Failed GET_DEVICE_ID: %d\n", error); | | > ipmi_free_request(req); | | > | | > The issue is that the wakeup doesn't actually wake up the msleep | | > in ipmi_submit_driver_request. The error being reported is that | | > the msleep timed out. This doesn't seem to be critical problem | | > since after this things seemed to work work. I saw this on 9.X. | | > Haven't seen it on 8.2. Not sure about -current. | | > | | > It doesn't happen on all machines. | | | | Hmm, are you seeing the KCS thread manage the request but the wakeup() is | | lost? | | It was a couple of weeks ago that I played with it. I put printf's | around the msleep and wakeup. I saw the wakeup called but the sleep | not get it. I can try the test again later today. Right now my main | work machine is recovering from a power outage. This was with 9.0 | when I first saw it. This issue seems to only happen at boot time. | If I kldload the module after the system is booted then it seems to work | okay. The KCS part was working fine and got the data okay from the | request. I haven't seen or heard any issues with 8.2. With -current I patched ipmi.c with: Index: ipmi.c =================================================================== --- ipmi.c (revision 233806) +++ ipmi.c (working copy) @@ -523,7 +523,11 @@ * waiter that we awaken. */ if (req->ir_owner == NULL) +{ +device_printf(sc->ipmi_dev, "DEBUG %s %d before wakeup %d\n",__FUNCTION__,__LINE__,ticks); wakeup(req); +device_printf(sc->ipmi_dev, "DEBUG %s %d after wakeup %d\n",__FUNCTION__,__LINE__,ticks); +} else { dev = req->ir_owner; TAILQ_INSERT_TAIL(&dev->ipmi_completed_requests, req, ir_link); @@ -543,7 +547,11 @@ IPMI_LOCK(sc); error = sc->ipmi_enqueue_request(sc, req); if (error == 0) +{ +device_printf(sc->ipmi_dev, "DEBUG %s %d before msleep %d\n",__FUNCTION__,__LINE__,ticks); error = msleep(req, &sc->ipmi_lock, 0, "ipmireq", timo); +device_printf(sc->ipmi_dev, "DEBUG %s %d after msleep %d\n",__FUNCTION__,__LINE__,ticks); +} if (error == 0) error = req->ir_error; IPMI_UNLOCK(sc); @@ -695,8 +703,11 @@ error = ipmi_submit_driver_request(sc, req, MAX_TIMEOUT); if (error == EWOULDBLOCK) { device_printf(dev, "Timed out waiting for GET_DEVICE_ID\n"); + printf("DJA\n"); +/* ipmi_free_request(req); return; +*/ } else if (error) { device_printf(dev, "Failed GET_DEVICE_ID: %d\n", error); ipmi_free_request(req); and get # dmesg | grep ipmi ipmi0: KCS mode found at io 0xca8 on acpi ipmi1: on isa0 device_attach: ipmi1 attach returned 16 ipmi1: on isa0 device_attach: ipmi1 attach returned 16 ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 2 ipmi0: DEBUG ipmi_complete_request 527 before wakeup 6201 ipmi0: DEBUG ipmi_complete_request 529 after wakeup 6263 ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 6323 ipmi0: Timed out waiting for GET_DEVICE_ID ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0 ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 6503 ipmi0: DEBUG ipmi_complete_request 527 before wakeup 6620 ipmi0: DEBUG ipmi_complete_request 529 after wakeup 6682 ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 6742 ... # sysctl kern.clockrate kern.clockrate: { hz = 1000, tick = 1000, profhz = 8126, stathz = 127 } # The 6000+ ticks before calling wakeup looks to be a problem. Maybe a few things are getting delayed for a while. On a working machine I see: # dmesg | grep ipmi ipmi0: port 0xca2,0xca3 on acpi0 ipmi0: KCS mode found at io 0xca2 on acpi ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 1 ipmi0: DEBUG ipmi_complete_request 527 before wakeup 397 ipmi0: DEBUG ipmi_complete_request 529 after wakeup 464 ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 539 ipmi0: IPMI device rev. 1, firmware rev. 2.03, version 2.0 ... if I kldunload/kldload on the problem machine I see: ipmi0: KCS mode found at io 0xca8 on acpi ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 675117 ipmi0: DEBUG ipmi_complete_request 527 before wakeup 675189 ipmi0: DEBUG ipmi_complete_request 529 after wakeup 675253 ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 675315 ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0 ... Another machine that fails: ipmi0: port 0xca8,0xcac on acpi0 ipmi0: KCS mode found at io 0xca8 on acpi ipmi0: KCS error: ff ipmi1: on isa0 device_attach: ipmi1 attach returned 16 ipmi1: on isa0 device_attach: ipmi1 attach returned 16 ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 0 ipmi0: DEBUG ipmi_complete_request 527 before wakeup 3046 ipmi0: DEBUG ipmi_complete_request 529 after wakeup 3168 ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 3234 ipmi0: Timed out waiting for GET_DEVICE_ID ipmi0: IPMI device rev. 0, firmware rev. 1.70, version 1.5 ... FWIW, Dell machine are tending to have this problem. I wonder, if the multiple probes are messing something up. Thanks, Doug A.