Date: Wed, 1 Aug 2012 08:53:11 -0400
From: John Baldwin <jhb@freebsd.org>
To: attilio@freebsd.org
Cc: freebsd-stable@freebsd.org
Subject: Re: [stable 9] panic on reboot: ipmi_wd_event()
Message-ID: <201208010853.11447.jhb@freebsd.org>
In-Reply-To: <CAJ-FndC3pyfJNJBZMZEW9WGs7yA=xeAD2vMyuEeJjELcLOVbOA@mail.gmail.com>
References: <1342742294.2656.24.camel@powernoodle.corp.yahoo.com>
 <201207311634.24169.jhb@freebsd.org>
 <CAJ-FndC3pyfJNJBZMZEW9WGs7yA=xeAD2vMyuEeJjELcLOVbOA@mail.gmail.com>

On Tuesday, July 31, 2012 4:51:19 pm Attilio Rao wrote:
> On 7/31/12, John Baldwin <jhb@freebsd.org> wrote:
> > On Thursday, July 19, 2012 7:58:14 pm Sean Bruno wrote:
> >> Working on the Dell R420 today, got most of it working, even the
> >> broadcom ethernet cards! However, I get the following when I reboot the
> >> system:
> >>
> >> Syncing disks, vnodes remaining...4 Sleeping thread (tid 100107, pid 9)
> >> owns a non-sleepable lock
> >> KDB: stack backtrace of thread 100107:
> >> sched_switch() at sched_switch+0x19f
> >> mi_switch() at mi_switch+0x208
> >> sleepq_switch() at sleepq_switch+0xfc
> >> sleepq_wait() at sleepq_wait+0x4d
> >> _sleep() at _sleep+0x3f6
> >> ipmi_submit_driver_request() at ipmi_submit_driver_request+0x97
> >> ipmi_set_watchdog() at ipmi_set_watchdog+0xb1
> >> ipmi_wd_event() at ipmi_wd_event+0x8f
> >> kern_do_pat() at kern_do_pat+0x10f
> >> sched_sync() at sched_sync+0x1ea
> >> fork_exit() at fork_exit+0x135
> >> fork_trampoline() at fork_trampoline+0xe
> >
> > Hmmm, the watchdog pat should probably happen without holding locks if
> > possible. This is related to the IPMI watchdog being special and wanting
> > to schedule a thread to work.
>
> The watchdog pat without the locks is not easy to do because we
> register the watchdog callbacks in eventhandlers, which are indeed
> locked (and you may also end up racing against watchdog detach, if you
> don't use any lock at all).

No, eventhandlers jump through several hoops to avoid holding any locks
while the eventhandler functions are running. It seems that in this case a
lock is held in a higher layer (sched_sync()), and that is what I was
talking about. Yes, it is 'sync_mtx' that is held. Something like this
may work:

Index: vfs_subr.c
===================================================================
--- vfs_subr.c	(revision 238969)
+++ vfs_subr.c	(working copy)
@@ -1868,8 +1868,11 @@ sched_sync(void)
 				continue;
 			}
 
-			if (first_printf == 0)
+			if (first_printf == 0) {
+				mtx_unlock(&sync_mtx);
 				wdog_kern_pat(WD_LASTVAL);
+				mtx_lock(&sync_mtx);
+			}
 
 		}
 		if (!LIST_EMPTY(gslp)) {
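
For readers who want to try the idea outside the kernel, here is a rough userspace sketch of the same pattern: drop the mutex around a callback that may block, then retake it. It uses POSIX threads rather than mtx(9), and every name in it (work_mtx, pending, pat_watchdog(), drain_work()) is a made-up stand-in, not something from vfs_subr.c. The shutting_down flag only loosely plays the role of the first_printf == 0 check.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins; none of these names exist in the FreeBSD tree. */
static pthread_mutex_t work_mtx = PTHREAD_MUTEX_INITIALIZER; /* plays sync_mtx */
static int pending;             /* work items, protected by work_mtx */
static int pats_total;          /* how many times the callback ran */
static int pats_with_lock_held; /* how many times it ran with work_mtx held */

/*
 * Stand-in for wdog_kern_pat(): a callback that may block.  It probes
 * work_mtx with trylock, which returns EBUSY if anyone (including the
 * caller) still holds the mutex.
 */
static void
pat_watchdog(void)
{
	int rc = pthread_mutex_trylock(&work_mtx);

	if (rc == 0)
		pthread_mutex_unlock(&work_mtx); /* lock was free, as intended */
	else if (rc == EBUSY)
		pats_with_lock_held++;
	pats_total++;
}

/*
 * Drain all pending items.  When shutting_down is set, pat the watchdog
 * after each item, with work_mtx dropped around the call.  Returns the
 * number of items processed.
 */
static int
drain_work(int shutting_down)
{
	int done = 0;

	pthread_mutex_lock(&work_mtx);
	while (pending > 0) {
		pending--;
		done++;
		if (shutting_down) {
			pthread_mutex_unlock(&work_mtx);
			pat_watchdog();
			pthread_mutex_lock(&work_mtx);
			/*
			 * Other threads may have changed 'pending' while
			 * the lock was dropped; the loop condition reads
			 * it again rather than trusting a cached value.
			 */
		}
	}
	pthread_mutex_unlock(&work_mtx);
	return (done);
}
```

The thing to watch with this pattern is that nothing read under the lock before the callback can be assumed valid after it, so the loop has to re-read its state once the lock is retaken.
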
--
John Baldwin
