Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 2 Apr 2012 16:27:13 -0700 (PDT)
From:      Doug Ambrisko <ambrisko@ambrisko.com>
To:        jhb@freebsd.org
Cc:        freebsd-stable@freebsd.org, John Baldwin <jhb@freebsd.org>
Subject:   Re: [stable-ish 9] Dell R815 ipmi(4) attach failure
Message-ID:  <201204022327.q32NRDG7070099@ambrisko.com>

next in thread | raw e-mail | index | archive | help
Doug Ambrisko writes:
| John Baldwin writes:
| | On Saturday, March 31, 2012 3:25:48 pm Doug Ambrisko wrote:
| | > Sean Bruno writes:
| | > | Noting a failure to attach to the onboard IPMI controller with this dell
| | > | R815.  Not sure what to start poking at and thought I'd though this over
| | > | here for comment.
| | > | 
| | > | -bash-4.2$ dmesg |grep ipmi
| | > | ipmi0: KCS mode found at io 0xca8 on acpi
| | > | ipmi1: <IPMI System Interface> on isa0
| | > | device_attach: ipmi1 attach returned 16
| | > | ipmi1: <IPMI System Interface> on isa0
| | > | device_attach: ipmi1 attach returned 16
| | > | ipmi0: Timed out waiting for GET_DEVICE_ID
| | > 
| | > I've run into this recently.  A quick hack to fix it is:
| | > 
| | > Index: ipmi.c
| | > ===================================================================
| | > RCS file: /cvs/src/sys/dev/ipmi/ipmi.c,v
| | > retrieving revision 1.14
| | > diff -u -p -r1.14 ipmi.c
| | > --- ipmi.c	14 Apr 2011 07:14:22 -0000	1.14
| | > +++ ipmi.c	31 Mar 2012 19:18:35 -0000
| | > @@ -695,7 +695,6 @@ ipmi_startup(void *arg)
| | >  	if (error == EWOULDBLOCK) {
| | >  		device_printf(dev, "Timed out waiting for GET_DEVICE_ID\n");
| | >  		ipmi_free_request(req);
| | > -		return;
| | >  	} else if (error) {
| | >  		device_printf(dev, "Failed GET_DEVICE_ID: %d\n", error);
| | >  		ipmi_free_request(req);
| | > 
| | > The issue is that the wakeup doesn't actually wake up the msleep
| | > in ipmi_submit_driver_request.  The error being reported is that
| | > the msleep timed out.  This doesn't seem to be critical problem
| | > since after this things seemed to work work.  I saw this on 9.X.
| | > Haven't seen it on 8.2.  Not sure about -current.
| | > 
| | > It doesn't happen on all machines.
| | 
| | Hmm, are you seeing the KCS thread manage the request but the wakeup() is 
| | lost?
| 
| It was a couple of weeks ago that I played with it.  I put printf's
| around the msleep and wakeup.  I saw the wakeup called but the sleep
| not get it.  I can try the test again later today.  Right now my main
| work machine is recovering from a power outage.  This was with 9.0 
| when I first saw it.  This issue seems to only happen at boot time.
| If I kldload the module after the system is booted then it seems to work 
| okay.  The KCS part was working fine and got the data okay from the
| request.  I haven't seen or heard any issues with 8.2.

With -current I patched ipmi.c with:
Index: ipmi.c
===================================================================
--- ipmi.c      (revision 233806)
+++ ipmi.c      (working copy)
@@ -523,7 +523,11 @@
         * waiter that we awaken.
         */
        if (req->ir_owner == NULL)
+{
+device_printf(sc->ipmi_dev, "DEBUG %s %d before wakeup %d\n",__FUNCTION__,__LINE__,ticks);
                wakeup(req);
+device_printf(sc->ipmi_dev, "DEBUG %s %d after wakeup %d\n",__FUNCTION__,__LINE__,ticks);
+}
        else {
                dev = req->ir_owner;
                TAILQ_INSERT_TAIL(&dev->ipmi_completed_requests, req, ir_link);
@@ -543,7 +547,11 @@
        IPMI_LOCK(sc);
        error = sc->ipmi_enqueue_request(sc, req);
        if (error == 0)
+{
+device_printf(sc->ipmi_dev, "DEBUG %s %d before msleep %d\n",__FUNCTION__,__LINE__,ticks);
                error = msleep(req, &sc->ipmi_lock, 0, "ipmireq", timo);
+device_printf(sc->ipmi_dev, "DEBUG %s %d after msleep %d\n",__FUNCTION__,__LINE__,ticks);
+}
        if (error == 0)
                error = req->ir_error;
        IPMI_UNLOCK(sc);
@@ -695,8 +703,11 @@
        error = ipmi_submit_driver_request(sc, req, MAX_TIMEOUT);
        if (error == EWOULDBLOCK) {
                device_printf(dev, "Timed out waiting for GET_DEVICE_ID\n");
+               printf("DJA\n");
+/*
                ipmi_free_request(req);
                return;
+*/
        } else if (error) {
                device_printf(dev, "Failed GET_DEVICE_ID: %d\n", error);
                ipmi_free_request(req);

and get
  # dmesg | grep ipmi
  ipmi0: KCS mode found at io 0xca8 on acpi
  ipmi1: <IPMI System Interface> on isa0
  device_attach: ipmi1 attach returned 16
  ipmi1: <IPMI System Interface> on isa0
  device_attach: ipmi1 attach returned 16
  ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 2
  ipmi0: DEBUG ipmi_complete_request 527 before wakeup 6201
  ipmi0: DEBUG ipmi_complete_request 529 after wakeup 6263
  ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 6323
  ipmi0: Timed out waiting for GET_DEVICE_ID
  ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0
  ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 6503
  ipmi0: DEBUG ipmi_complete_request 527 before wakeup 6620
  ipmi0: DEBUG ipmi_complete_request 529 after wakeup 6682
  ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 6742
  	...
  # sysctl kern.clockrate
  kern.clockrate: { hz = 1000, tick = 1000, profhz = 8126, stathz = 127 }
  # 

The 6000+ ticks before calling wakeup looks to be a problem.  Maybe
a few things are getting delayed for a while.

On a working machine I see:
  # dmesg | grep ipmi
  ipmi0: <IPMI System Interface> port 0xca2,0xca3 on acpi0
  ipmi0: KCS mode found at io 0xca2 on acpi
  ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 1
  ipmi0: DEBUG ipmi_complete_request 527 before wakeup 397
  ipmi0: DEBUG ipmi_complete_request 529 after wakeup 464
  ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 539
  ipmi0: IPMI device rev. 1, firmware rev. 2.03, version 2.0
  	...

if I kldunload/kldload on the problem machine I see:
  ipmi0: KCS mode found at io 0xca8 on acpi
  ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 675117
  ipmi0: DEBUG ipmi_complete_request 527 before wakeup 675189
  ipmi0: DEBUG ipmi_complete_request 529 after wakeup 675253
  ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 675315
  ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0
	...

Another machine that fails:
  ipmi0: <IPMI System Interface> port 0xca8,0xcac on acpi0
  ipmi0: KCS mode found at io 0xca8 on acpi
  ipmi0: KCS error: ff
  ipmi1: <IPMI System Interface> on isa0
  device_attach: ipmi1 attach returned 16
  ipmi1: <IPMI System Interface> on isa0
  device_attach: ipmi1 attach returned 16
  ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 0
  ipmi0: DEBUG ipmi_complete_request 527 before wakeup 3046
  ipmi0: DEBUG ipmi_complete_request 529 after wakeup 3168
  ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 3234
  ipmi0: Timed out waiting for GET_DEVICE_ID
  ipmi0: IPMI device rev. 0, firmware rev. 1.70, version 1.5
	...

FWIW, Dell machine are tending to have this problem.  I wonder, if
the multiple probes are messing something up.

Thanks,

Doug A.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201204022327.q32NRDG7070099>