Date: Wed, 1 Aug 2012 08:53:11 -0400
From: John Baldwin <jhb@freebsd.org>
To: attilio@freebsd.org
Cc: freebsd-stable@freebsd.org
Subject: Re: [stable 9] panic on reboot: ipmi_wd_event()
Message-ID: <201208010853.11447.jhb@freebsd.org>
In-Reply-To: <CAJ-FndC3pyfJNJBZMZEW9WGs7yA=xeAD2vMyuEeJjELcLOVbOA@mail.gmail.com>
References: <1342742294.2656.24.camel@powernoodle.corp.yahoo.com> <201207311634.24169.jhb@freebsd.org> <CAJ-FndC3pyfJNJBZMZEW9WGs7yA=xeAD2vMyuEeJjELcLOVbOA@mail.gmail.com>

On Tuesday, July 31, 2012 4:51:19 pm Attilio Rao wrote:
> On 7/31/12, John Baldwin <jhb@freebsd.org> wrote:
> > On Thursday, July 19, 2012 7:58:14 pm Sean Bruno wrote:
> >> Working on the Dell R420 today, got most of it working, even the
> >> broadcom ethernet cards!  However, I get the following when I reboot the
> >> system:
> >>
> >> Syncing disks, vnodes remaining...4 Sleeping thread (tid 100107, pid 9)
> >> owns a non-sleepable lock
> >> KDB: stack backtrace of thread 100107:
> >> sched_switch() at sched_switch+0x19f
> >> mi_switch() at mi_switch+0x208
> >> sleepq_switch() at sleepq_switch+0xfc
> >> sleepq_wait() at sleepq_wait+0x4d
> >> _sleep() at _sleep+0x3f6
> >> ipmi_submit_driver_request() at ipmi_submit_driver_request+0x97
> >> ipmi_set_watchdog() at ipmi_set_watchdog+0xb1
> >> ipmi_wd_event() at ipmi_wd_event+0x8f
> >> kern_do_pat() at kern_do_pat+0x10f
> >> sched_sync() at sched_sync+0x1ea
> >> fork_exit() at fork_exit+0x135
> >> fork_trampoline() at fork_trampoline+0xe
> >
> > Hmmm, the watchdog pat should probably happen without holding locks if
> > possible.  This is related to the IPMI watchdog being special and wanting
> > to schedule a thread to work.
>
> The watchdog pat without the locks is not easy to do because we
> register the watchdog callbacks in eventhandlers, which are indeed
> locked (and you may also end up racing against watchdog detach, if you
> don't use any lock at all).

No, eventhandlers go through several hoops to not hold any locks while
the eventhandler functions are running.  It seems in this case that a
lock is held in a higher layer (sched_sync()) and that is what I was
talking about.  Yes, it is the 'sync_mtx' that is held.

Something like this may work:

Index: vfs_subr.c
===================================================================
--- vfs_subr.c	(revision 238969)
+++ vfs_subr.c	(working copy)
@@ -1868,8 +1868,11 @@ sched_sync(void)
 				continue;
 			}
 
-			if (first_printf == 0)
+			if (first_printf == 0) {
+				mtx_unlock(&sync_mtx);
 				wdog_kern_pat(WD_LASTVAL);
+				mtx_lock(&sync_mtx);
+			}
 
 		}
 		if (!LIST_EMPTY(gslp)) {

-- 
John Baldwin
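
For anyone who wants to play with the pattern outside the kernel, here is a
rough userland sketch of what the patch does: drop the mutex around a call
that may sleep, then take it again. It is not the kernel code. The pthread
mutex only stands in for sync_mtx, and sync_lock(), sync_unlock(),
fake_watchdog_pat(), sync_pass() and the sync_mtx_held flag are hypothetical
names made up for this illustration.

```c
#include <pthread.h>

/* Userland stand-in for the kernel's sync_mtx; not the real mutex. */
static pthread_mutex_t sync_mtx = PTHREAD_MUTEX_INITIALIZER;
static int sync_mtx_held;	/* hypothetical bookkeeping, demo only */

static void
sync_lock(void)
{
	pthread_mutex_lock(&sync_mtx);
	sync_mtx_held = 1;
}

static void
sync_unlock(void)
{
	sync_mtx_held = 0;
	pthread_mutex_unlock(&sync_mtx);
}

/*
 * Hypothetical stand-in for wdog_kern_pat(): a call that may sleep.
 * Returns 1 if entered without the lock, -1 if the lock was still held.
 * Reads the flag unlocked, which is only acceptable because this demo
 * is single-threaded.
 */
static int
fake_watchdog_pat(void)
{
	return (sync_mtx_held ? -1 : 1);
}

/*
 * One pass of a sync-like loop.  Returns 0 if no pat was issued,
 * otherwise the result of fake_watchdog_pat().
 */
static int
sync_pass(int first_printf)
{
	int result = 0;

	sync_lock();
	/* ... work that needs sync_mtx would go here ... */
	if (first_printf == 0) {
		sync_unlock();		/* never sleep with the mutex held */
		result = fake_watchdog_pat();
		sync_lock();		/* callers expect it held again */
	}
	sync_unlock();
	return (result);
}
```

Calling sync_pass(0) should report that the pat ran with the mutex dropped,
and sync_pass(1) should skip the pat entirely. In general, any state
protected by the mutex may have changed while it was dropped, so code after
the re-lock should not rely on values read before the unlock.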