Date:      Thu, 30 Aug 2012 08:59:09 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-net@freebsd.org
Cc:        net@freebsd.org, Navdeep Parhar <np@freebsd.org>, Vijay Singh <vijju.singh@gmail.com>
Subject:   Re: witness warning in arp processing
Message-ID:  <201208300859.09759.jhb@freebsd.org>
In-Reply-To: <503E7F69.9070108@FreeBSD.org>
References:  <CALCNsJR-Gp05T10Bdf51zK15OcOyf=7qWZXNLEw-T6nxLsVArw@mail.gmail.com> <503E7F69.9070108@FreeBSD.org>

On Wednesday, August 29, 2012 4:45:29 pm Navdeep Parhar wrote:
> On 08/29/12 10:30, Vijay Singh wrote:
> > All, I am seeing this warning on my 8.2-based system.
> >
> > taskqueue_drain with the following non-sleepable locks held:
> > exclusive rw lle (lle) r = 0 (0xffffff0014dc9110) locked @ sys/netinet/in.c:1760
> > KDB: stack backtrace:
> > kdb_backtrace() at kdb_backtrace+0x3e
> > _witness_debugger() at _witness_debugger+0x24
> > witness_warn() at witness_warn+0x402
> > taskqueue_drain() at taskqueue_drain+0x36
> > cancel_delayed_work() at cancel_delayed_work+0x56
> > set_timeout() at set_timeout+0x18
> > netevent_callback() at netevent_callback+0x29
> > _handle_arp_update_event() at _handle_arp_update_event+0x31
> > in_arpinput() at in_arpinput+0xe92
> > arpintr() at arpintr+0x255
> > netisr_dispatch_src() at netisr_dispatch_src+0x14a
> > netisr_dispatch() at netisr_dispatch+0x20
> > ether_demux() at ether_demux+0x281
> > ether_input_internal() at ether_input_internal+0x60c
> > ether_nh_input() at ether_nh_input+0x1d
> > netisr_dispatch_src() at netisr_dispatch_src+0x14a
> > netisr_dispatch() at netisr_dispatch+0x20
> > ether_input() at ether_input+0xef
> > lem_rxeof() at lem_rxeof+0x6ee
> > lem_handle_rxtx() at lem_handle_rxtx+0x4f
> > taskqueue_run_locked() at taskqueue_run_locked+0x145
> > taskqueue_thread_loop() at taskqueue_thread_loop+0x73
> > fork_exit() at fork_exit+0x180
> > fork_trampoline() at fork_trampoline+0xe
> >
> > Is this a known issue? Has it been fixed?
> 
> This is a bug in the OFED code.  The event handler it registers for the 
> ARP update is not supposed to do anything that could sleep.

You could try this:

Index: ofed/include/linux/workqueue.h
===================================================================
--- ofed/include/linux/workqueue.h      (revision 239905)
+++ ofed/include/linux/workqueue.h      (working copy)
@@ -184,9 +184,9 @@ cancel_delayed_work(struct delayed_work *work)
 {
 
 	callout_stop(&work->timer);
-	if (work->work.taskqueue &&
-	    taskqueue_cancel(work->work.taskqueue, &work->work.work_task, NULL))
-		taskqueue_drain(work->work.taskqueue, &work->work.work_task);
+	if (work->work.taskqueue)
+		taskqueue_cancel(work->work.taskqueue, &work->work.work_task,
+		    NULL);
 	return 0;
 }
 

This should fix the warning, and it also changes the code to match the
existing comment above cancel_delayed_work():

/*
 * This may leave work running on another CPU as it does on Linux.
 */
static inline int
cancel_delayed_work(struct delayed_work *work)
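
To illustrate the difference between the two behaviours, here is a minimal
userland sketch of a cancel that never waits. It is a toy model, not the
kernel code: struct toy_task, toy_cancel() and toy_cancel_delayed_work() are
hypothetical names made up for this example, and the "running" flag stands in
for a handler executing on another CPU. The return convention (0 on success,
EBUSY if the handler is currently running) is assumed to mirror
taskqueue_cancel().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued task; not the kernel's struct task. */
struct toy_task {
	unsigned int pending;	/* times enqueued but not yet run */
	bool running;		/* handler is executing somewhere else */
};

/*
 * Non-blocking cancel: dequeue the task and report whether its handler
 * is still running.  It never waits, so it is safe to call with a
 * non-sleepable lock held.
 */
static int
toy_cancel(struct toy_task *t, unsigned int *pendp)
{
	assert(t != NULL);
	if (pendp != NULL)
		*pendp = t->pending;
	t->pending = 0;
	return (t->running ? EBUSY : 0);
}

/*
 * Shape of the patched cancel_delayed_work(): cancel only, ignore EBUSY,
 * and never drain.  A running handler is left to finish on its own.
 */
static int
toy_cancel_delayed_work(struct toy_task *t)
{
	(void)toy_cancel(t, NULL);
	return (0);
}
```

The unpatched version corresponds to following an EBUSY result with a
blocking wait for the handler to finish, which is the step that may sleep
and which witness complains about when the lle lock is held.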

-- 
John Baldwin


