From owner-freebsd-current@FreeBSD.ORG Mon Nov 26 22:28:21 2007 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8FF3716A421 for ; Mon, 26 Nov 2007 22:28:21 +0000 (UTC) (envelope-from Benjamin.Close@clearchain.com) Received: from ipmail05.adl2.internode.on.net (ipmail05.adl2.internode.on.net [203.16.214.145]) by mx1.freebsd.org (Postfix) with ESMTP id A3DB413C459 for ; Mon, 26 Nov 2007 22:28:20 +0000 (UTC) (envelope-from Benjamin.Close@clearchain.com) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ah4FAE/WSkd5LTGL/2dsb2JhbACBWw X-IronPort-AV: E=Sophos;i="4.23,216,1194183000"; d="scan'208";a="3196982" Received: from ppp121-45-49-139.lns11.adl2.internode.on.net (HELO mail.clearchain.com) ([121.45.49.139]) by ipmail05.adl2.internode.on.net with ESMTP; 27 Nov 2007 08:58:17 +1030 Received: from [192.168.155.249] (draco.internal.clearchain.com [192.168.155.249]) (authenticated bits=0) by mail.clearchain.com (8.13.8/8.13.8) with ESMTP id lAQMSFYM092682 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 27 Nov 2007 08:58:15 +1030 (CST) (envelope-from Benjamin.Close@clearchain.com) Message-ID: <474B4876.9020209@clearchain.com> Date: Tue, 27 Nov 2007 08:58:06 +1030 From: Benjamin Close User-Agent: Thunderbird 2.0.0.9 (Windows/20071031) MIME-Version: 1.0 To: Attilio Rao References: <4744EBA4.7020209@clearchain.com> <3bbf2fe10711220408t30136430s15fdba0ebcbbfa66@mail.gmail.com> <3bbf2fe10711220427s38561159k4879d0c792d09694@mail.gmail.com> In-Reply-To: <3bbf2fe10711220427s38561159k4879d0c792d09694@mail.gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV version 0.91.2, clamav-milter version 0.91.2 on pegasus.clearchain.com X-Virus-Status: Clean X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (mail.clearchain.com [192.168.154.1]); Tue, 27 Nov 2007 08:58:15 +1030 (CST) Cc: Jack F Vogel , freebsd-current@freebsd.org Subject: Re: em0 panic: mutex em0 not owned X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Nov 2007 22:28:21 -0000 Attilio Rao wrote: > 2007/11/22, Attilio Rao : > >> 2007/11/22, Benjamin Close : >> >>> Hi Folks, >>> With a recent current I'm now getting panics when em0 tries to come up: >>> >>> panic: mutex em0 not owned at ../../../kern/kern_mutex.c:144 >>> >>> _mtx_assert() + 0xdc >>> _callout_stop_safe()+0x5d >>> em_stop() + 0x50 (if_em.c:2546) >>> em_init_locked()+0x47 (if_em.c:1256) >>> em_ioctl()+0x466 >>> ifhwioctl() + 0x75f >>> ifioctl() +0xb0 >>> kern_ioctl() + 0xa3 >>> >>> This is even after atillos, latest patch. >>> >> Yes, this is a race access to callout_stop() in em driver. >> callout_stop() needs to be called with callout-specific lock held >> otherwise you can get a race and this seems not happening. I just >> inserted this assertions in order to catch bugs like these. >> I have no time to double-check it, can you do? >> > > Ok, basically em_stop() both wants to stop core callout and tx channel > callout but it only holds core lock. It needs to hold both in order to > stop both. As I'm not sure about lock ordering there I can't produce a > patch now so the ball is in jfv@ court (CC'ed). > The attached patch fixes the panic at the cost of a lock order reversal: Index: if_em.c =================================================================== RCS file: /devel/FreeBSD/ncvs/src/sys/dev/em/if_em.c,v retrieving revision 1.187 diff -u -r1.187 if_em.c --- if_em.c 21 Nov 2007 12:55:33 -0000 1.187 +++ if_em.c 25 Nov 2007 23:46:49 -0000 @@ -2541,7 +2541,9 @@ em_disable_intr(adapter); callout_stop(&adapter->timer); + EM_TX_LOCK(adapter); callout_stop(&adapter->tx_fifo_timer); + EM_TX_UNLOCK(adapter); /* Tell the stack that the interface is no longer active */ ifp->if_drv_flags &= ~(IFF_DRV_RUNNING | IFF_DRV_OACTIVE); acquiring duplicate lock of same type: "network driver" 1st em0 @ dev/em/if_em.c:1073 2nd em0 @ dev/em/if_em.c:2543 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a witness_checkorder() at witness_checkorder+0x605 _mtx_lock_flags() at _mtx_lock_flags+0x75 em_stop() at em_stop+0x63 em_init_locked() at em_init_locked+0x47 em_ioctl() at em_ioctl+0x466 ifhwioctl() at ifhwioctl+0x75f ifioctl() at ifioctl+0xb0 kern_ioctl() at kern_ioctl+0xa3 ioctl() at ioctl+0xfa syscall() at syscall+0x1ce Xfast_syscall() at Xfast_syscall+0xab --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800825c6c, rsp = 0x7fffffffe568, rbp = 0x7fffffffe570 ---