Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 27 Nov 2007 08:58:06 +1030
From:      Benjamin Close <Benjamin.Close@clearchain.com>
To:        Attilio Rao <attilio@freebsd.org>
Cc:        Jack F Vogel <jfv@freebsd.org>, freebsd-current@freebsd.org
Subject:   Re: em0 panic: mutex em0 not owned
Message-ID:  <474B4876.9020209@clearchain.com>
In-Reply-To: <3bbf2fe10711220427s38561159k4879d0c792d09694@mail.gmail.com>
References:  <4744EBA4.7020209@clearchain.com>	 <3bbf2fe10711220408t30136430s15fdba0ebcbbfa66@mail.gmail.com> <3bbf2fe10711220427s38561159k4879d0c792d09694@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
Attilio Rao wrote:
> 2007/11/22, Attilio Rao <attilio@freebsd.org>:
>   
>> 2007/11/22, Benjamin Close <Benjamin.Close@clearchain.com>:
>>     
>>> Hi Folks,
>>>     With a recent current I'm now getting panics when em0 tries to come up:
>>>
>>> panic: mutex em0 not owned at ../../../kern/kern_mutex.c:144
>>>
>>> _mtx_assert() + 0xdc
>>> _callout_stop_safe()+0x5d
>>> em_stop() + 0x50 (if_em.c:2546)
>>> em_init_locked()+0x47 (if_em.c:1256)
>>> em_ioctl()+0x466
>>> ifhwioctl() + 0x75f
>>> ifioctl() +0xb0
>>> kern_ioctl() + 0xa3
>>>
>>> This is even after atillos, latest patch.
>>>       
>> Yes, this is a race access to callout_stop() in em driver.
>> callout_stop() needs to be called with callout-specific lock held
>> otherwise you can get a race and this seems not happening. I just
>> inserted this assertions in order to catch bugs like these.
>> I have no time to double-check it, can you do?
>>     
>
> Ok, basically em_stop() both wants to stop core callout and tx channel
> callout but it only holds core lock. It needs to hold both in order to
> stop both. As I'm not sure about lock ordering there I can't produce a
> patch now so the ball is in jfv@ court (CC'ed).
>   
The attached patch fixes the panic at the cost of a lock order reversal:

Index: if_em.c
===================================================================
RCS file: /devel/FreeBSD/ncvs/src/sys/dev/em/if_em.c,v
retrieving revision 1.187
diff -u -r1.187 if_em.c
--- if_em.c    21 Nov 2007 12:55:33 -0000    1.187
+++ if_em.c    25 Nov 2007 23:46:49 -0000
@@ -2541,7 +2541,9 @@
 
     em_disable_intr(adapter);
     callout_stop(&adapter->timer);
+    EM_TX_LOCK(adapter);
     callout_stop(&adapter->tx_fifo_timer);
+    EM_TX_UNLOCK(adapter);
 
     /* Tell the stack that the interface is no longer active */
     ifp->if_drv_flags &= ~(IFF_DRV_RUNNING | IFF_DRV_OACTIVE);


acquiring duplicate lock of same type: "network driver"
 1st em0 @ dev/em/if_em.c:1073
 2nd em0 @ dev/em/if_em.c:2543
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
witness_checkorder() at witness_checkorder+0x605
_mtx_lock_flags() at _mtx_lock_flags+0x75
em_stop() at em_stop+0x63
em_init_locked() at em_init_locked+0x47
em_ioctl() at em_ioctl+0x466
ifhwioctl() at ifhwioctl+0x75f
ifioctl() at ifioctl+0xb0
kern_ioctl() at kern_ioctl+0xa3
ioctl() at ioctl+0xfa
syscall() at syscall+0x1ce
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800825c6c, rsp = 
0x7fffffffe568, rbp = 0x7fffffffe570 ---



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?474B4876.9020209>