From: John Baldwin
To: freebsd-net@freebsd.org
Cc: net@freebsd.org, Navdeep Parhar, Vijay Singh
Date: Thu, 30 Aug 2012 08:59:09 -0400
Subject: Re: witness warning in arp processing
Message-Id: <201208300859.09759.jhb@freebsd.org>
References: <503E7F69.9070108@FreeBSD.org>
In-Reply-To: <503E7F69.9070108@FreeBSD.org>

On Wednesday, August 29, 2012 4:45:29 pm Navdeep Parhar wrote:
> On 08/29/12 10:30, Vijay Singh wrote:
> > All, I am seeing this warning on my 8.2 based system.
> >
> > taskqueue_drain with the following non-sleepable locks held:
> > exclusive rw lle (lle) r = 0 (0xffffff0014dc9110) locked @ sys/netinet/in.c:1760
> > KDB: stack backtrace:
> > kdb_backtrace() at kdb_backtrace+0x3e
> > _witness_debugger() at _witness_debugger+0x24
> > witness_warn() at witness_warn+0x402
> > taskqueue_drain() at taskqueue_drain+0x36
> > cancel_delayed_work() at cancel_delayed_work+0x56
> > set_timeout() at set_timeout+0x18
> > netevent_callback() at netevent_callback+0x29
> > _handle_arp_update_event() at _handle_arp_update_event+0x31
> > in_arpinput() at in_arpinput+0xe92
> > arpintr() at arpintr+0x255
> > netisr_dispatch_src() at netisr_dispatch_src+0x14a
> > netisr_dispatch() at netisr_dispatch+0x20
> > ether_demux() at ether_demux+0x281
> > ether_input_internal() at ether_input_internal+0x60c
> > ether_nh_input() at ether_nh_input+0x1d
> > netisr_dispatch_src() at netisr_dispatch_src+0x14a
> > netisr_dispatch() at netisr_dispatch+0x20
> > ether_input() at ether_input+0xef
> > lem_rxeof() at lem_rxeof+0x6ee
> > lem_handle_rxtx() at lem_handle_rxtx+0x4f
> > taskqueue_run_locked() at taskqueue_run_locked+0x145
> > taskqueue_thread_loop() at taskqueue_thread_loop+0x73
> > fork_exit() at fork_exit+0x180
> > fork_trampoline() at fork_trampoline+0xe
> >
> > Is this a known issue? Has it been fixed?
>
> This is a bug in the OFED code.  The event handler it registers for the
> ARP update is not supposed to do anything that could sleep.

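For background, the backtrace shows the warning coming from taskqueue_drain(), which may sleep until the task has finished, while the lle rwlock is held. taskqueue_cancel() only removes pending work and reports EBUSY if the handler is executing at that moment; it does not wait. Here is a rough userspace model of that non-waiting behaviour, as I understand it from taskqueue(9). The names model_task and model_cancel are made up for illustration and are not kernel API:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued task; not the kernel's struct task. */
struct model_task {
	int pending;	/* number of enqueues not yet run */
	int running;	/* nonzero while the handler is executing */
};

/*
 * Model of a non-sleeping cancel: drop any pending runs, optionally
 * report how many were dropped, and return EBUSY if the handler is
 * running right now.  It never waits for the handler to finish, which
 * is why it is safe to call with a non-sleepable lock held.
 */
static int
model_cancel(struct model_task *t, unsigned int *pendp)
{
	if (pendp != NULL)
		*pendp = (unsigned int)t->pending;
	t->pending = 0;
	return (t->running ? EBUSY : 0);
}
```

A caller that gets EBUSY back knows the work may still be running on another CPU and has to tolerate that, instead of draining.
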
You could try this:

Index: ofed/include/linux/workqueue.h
===================================================================
--- ofed/include/linux/workqueue.h	(revision 239905)
+++ ofed/include/linux/workqueue.h	(working copy)
@@ -184,9 +184,9 @@ cancel_delayed_work(struct delayed_work *work)
 {
 	callout_stop(&work->timer);
-	if (work->work.taskqueue &&
-	    taskqueue_cancel(work->work.taskqueue, &work->work.work_task, NULL))
-		taskqueue_drain(work->work.taskqueue, &work->work.work_task);
+	if (work->work.taskqueue)
+		taskqueue_cancel(work->work.taskqueue, &work->work.work_task,
+		    NULL);
 	return 0;
 }

This changes the code to match the comment above cancel_delayed_work() and
should fix this warning:

/*
 * This may leave work running on another CPU as it does on Linux.
 */
static inline int
cancel_delayed_work(struct delayed_work *work)

-- 
John Baldwin