From: Ryan Stone
To: John Baldwin
Cc: freebsd-net@freebsd.org, Jack Vogel, Vijay Singh
Date: Wed, 26 Sep 2012 15:08:22 -0400
Subject: Re: ixgbe rx & tx locks
In-Reply-To: <201209260955.14417.jhb@freebsd.org>

On Wed, Sep 26, 2012 at 9:55 AM, John Baldwin wrote:
> You only have to drop the RX lock around if_input() if you use the same lock
> for both TX and RX (as if_transmit() / if_start() can be invoked while locks
> in the network stack are held).

Last time I checked (FreeBSD 8.2), this is not true. The problematic (and
convoluted) ordering is:

ix:rx -> udp -> udpinp -> in_multi_mtx -> ix:core -> ix:rx

udp -> udpinp -> in_multi_mtx is defined in subr_witness.c.

ix:core -> ix:rx is fairly obvious, and happens in places like ixgbe_init.

ix:rx -> udp is also fairly obvious (here's one backtrace):

lock order reversal:
 1st 0xffffff800153c138 ix:rx (ix:rx) @ src/sys/dev/ixgbe/ixgbe.c:7113
 2nd 0xffffffff80af9c48 udp (udp) @ src/sys/netinet/udp_usrreq.c:471
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff801dd5aa = db_trace_self_wrapper+0x2a
_witness_debugger() at 0xffffffff8044411e = _witness_debugger+0x2e
witness_checkorder() at 0xffffffff804453c7 = witness_checkorder+0x807
_rw_rlock() at 0xffffffff803fb61a = _rw_rlock+0x7a
udp_input() at 0xffffffff80517d1c = udp_input+0x1bc
ip_input() at 0xffffffff804f6b32 = ip_input+0x1e2
netisr_dispatch_src() at 0xffffffff804bfc38 = netisr_dispatch_src+0xb8
ether_demux() at 0xffffffff804b0fca = ether_demux+0x1aa
ether_input() at 0xffffffff804b141a = ether_input+0x1ca
ixgbe_rxeof() at 0xffffffff802d8ba3 = ixgbe_rxeof+0x203
ixgbe_msix_que() at 0xffffffff802e1790 = ixgbe_msix_que+0xf0
intr_event_execute_handlers() at 0xffffffff803d4096 = intr_event_execute_handlers+0x66
ithread_loop() at 0xffffffff803d4e12 = ithread_loop+0xb2
fork_exit() at 0xffffffff803d1fba = fork_exit+0x12a
fork_trampoline() at 0xffffffff805f582e = fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff8000148d00, rbp = 0 ---

in_multi_mtx -> ix:core comes from the following backtrace:

lock order reversal:
 1st 0xffffffff80ae2440 in_multi_mtx (in_multi_mtx) @ src/sys/netinet/in_mcast.c:1095
 2nd 0xffffff8001539400 ixgbe0 (IXGBE Core Lock) @ src/sys/dev/ixgbe/ixgbe.c:1725
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff801dd5aa = db_trace_self_wrapper+0x2a
_witness_debugger() at 0xffffffff8044411e = _witness_debugger+0x2e
witness_checkorder() at 0xffffffff804453c7 = witness_checkorder+0x807
_mtx_lock_flags() at 0xffffffff803ec3ba = _mtx_lock_flags+0x8a
ixgbe_ioctl() at 0xffffffff802e07ae = ixgbe_ioctl+0x60e
if_addmulti() at 0xffffffff804a9f7b = if_addmulti+0x19b
in_joingroup_locked() at 0xffffffff804db8ec = in_joingroup_locked+0x1bc
in_joingroup() at 0xffffffff804dd5a2 = in_joingroup+0x52
in_control() at 0xffffffff804d7a70 = in_control+0x1160
ifioctl() at 0xffffffff804adec6 = ifioctl+0x5b6
nfs_mountroot() at 0xffffffff80567244 = nfs_mountroot+0x94
nfs_mount() at 0xffffffff80567b7b = nfs_mount+0x4db
vfs_donmount() at 0xffffffff8048919e = vfs_donmount+0xcde
kernel_mount() at 0xffffffff80489a71 = kernel_mount+0xa1
vfs_mountroot_try() at 0xffffffff80489f9d = vfs_mountroot_try+0x17d
vfs_mountroot() at 0xffffffff8048ab7d = vfs_mountroot+0x4fd
start_init() at 0xffffffff803b2932 = start_init+0x62
fork_exit() at 0xffffffff803d1fba = fork_exit+0x12a
fork_trampoline() at 0xffffffff805f582e = fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff8000042d00, rbp = 0 ---
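
To make the ordering above easier to check, here is a toy model of the lock-order graph. It is not kernel code and not how witness(4) is implemented; it is a plain depth-first search in Python. The lock names and edges come from the message; find_cycle, ORDERS and without_rx_held are hypothetical names used only for this illustration. The second check assumes that releasing ix:rx before calling if_input() removes the ix:rx -> udp edge.

```python
from collections import defaultdict

# Lock-order edges named in the message: (held first, acquired second).
ORDERS = [
    ("ix:rx", "udp"),             # RX lock held across if_input()
    ("udp", "udpinp"),            # defined in subr_witness.c
    ("udpinp", "in_multi_mtx"),   # defined in subr_witness.c
    ("in_multi_mtx", "ix:core"),  # in_joingroup() -> ixgbe_ioctl()
    ("ix:core", "ix:rx"),         # places like ixgbe_init
]


def find_cycle(edges):
    """Return one cycle as a list of lock names, or None if there is none."""
    graph = defaultdict(list)
    for before, after in edges:
        graph[before].append(after)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = defaultdict(int)
    path = []

    def visit(lock):
        color[lock] = GREY
        path.append(lock)
        for nxt in graph[lock]:
            if color[nxt] == GREY:
                # Back edge: nxt is already on the current path.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[lock] = BLACK
        return None

    for lock in list(graph):
        if color[lock] == WHITE:
            found = visit(lock)
            if found:
                return found
    return None


if __name__ == "__main__":
    print(" -> ".join(find_cycle(ORDERS)))
    # prints: ix:rx -> udp -> udpinp -> in_multi_mtx -> ix:core -> ix:rx

    # Assumption: dropping ix:rx around if_input() removes the first edge.
    without_rx_held = [e for e in ORDERS if e != ("ix:rx", "udp")]
    print(find_cycle(without_rx_held))
    # prints: None
```

In this model the cycle exists even though TX never appears in it, which is the point being made: the reversal goes through ix:core, not through a shared TX/RX lock.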