From: Garrett Cooper
To: Jack F Vogel
Cc: freebsd-net@freebsd.org
Date: Fri, 14 Dec 2012 18:58:14 -0800
Subject: LOR with ixgbe+lagg and panic with ixgbe related to an uninitialized stack variable

Hi,

Seeing the following LOR on CURRENT when scping files over two L3 lagged ixgbe interfaces:

lock order reversal:
 1st 0xfffffe000d15a118 ix0:rx(1) (ix0:rx(1)) @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4353
 2nd 0xfffffe01334ada08 if_lagg rwlock (if_lagg rwlock) @ /usr/src/sys/modules/if_lagg/../../net/if_lagg.c:1276
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff8496fc5740
kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff8496fc57f0
witness_checkorder() at witness_checkorder+0xc00/frame 0xffffff8496fc5880
__rw_rlock() at __rw_rlock+0x98/frame 0xffffff8496fc5920
lagg_input() at lagg_input+0x38/frame 0xffffff8496fc5960
ether_nh_input() at ether_nh_input+0x171/frame 0xffffff8496fc5990
netisr_dispatch_src() at netisr_dispatch_src+0x90/frame 0xffffff8496fc5a00
tcp_lro_flush() at tcp_lro_flush+0x197/frame 0xffffff8496fc5a20
ixgbe_rxeof() at ixgbe_rxeof+0x5f2/frame 0xffffff8496fc5ad0
ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xffffff8496fc5b20
intr_event_execute_handlers() at intr_event_execute_handlers+0x90/frame 0xffffff8496fc5b60
ithread_loop() at ithread_loop+0x161/frame 0xffffff8496fc5bb0
fork_exit() at fork_exit+0x84/frame 0xffffff8496fc5bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xffffff8496fc5bf0
--- trap 0, rip = 0, rsp = 0xffffff8496fc5cb0, rbp = 0 ---

lock order reversal:
 1st 0xfffffe000d15a118 ix0:rx(1) (ix0:rx(1)) @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4353
 2nd 0xfffffe0133003da8 tcpinp (tcpinp) @ /usr/src/sys/netinet/in_pcb.c:1785
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff8496fc5570
kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff8496fc5620
witness_checkorder() at witness_checkorder+0xc00/frame 0xffffff8496fc56b0
_rw_wlock_cookie() at _rw_wlock_cookie+0x63/frame 0xffffff8496fc56f0
in_pcblookup_hash() at in_pcblookup_hash+0xba/frame 0xffffff8496fc5740
tcp_input() at tcp_input+0x60e/frame 0xffffff8496fc5870
ip_input() at ip_input+0xb2/frame 0xffffff8496fc58c0
netisr_dispatch_src() at netisr_dispatch_src+0x90/frame 0xffffff8496fc5930
ether_demux() at ether_demux+0x143/frame 0xffffff8496fc5960
ether_nh_input() at ether_nh_input+0x325/frame 0xffffff8496fc5990
netisr_dispatch_src() at netisr_dispatch_src+0x90/frame 0xffffff8496fc5a00
tcp_lro_flush() at tcp_lro_flush+0x197/frame 0xffffff8496fc5a20
ixgbe_rxeof() at ixgbe_rxeof+0x5f2/frame 0xffffff8496fc5ad0
ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xffffff8496fc5b20
intr_event_execute_handlers() at intr_event_execute_handlers+0x90/frame 0xffffff8496fc5b60
ithread_loop() at ithread_loop+0x161/frame 0xffffff8496fc5bb0
fork_exit() at fork_exit+0x84/frame 0xffffff8496fc5bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xffffff8496fc5bf0
--- trap 0, rip = 0, rsp = 0xffffff8496fc5cb0, rbp = 0 ---

# uname -a
FreeBSD wf158.west.isilon.com 10.0-CURRENT FreeBSD 10.0-CURRENT #3 r+5a05236: Wed Dec 12 17:35:14 PST 2012 root@wf158.west.isilon.com:/usr/obj/usr/src/sys/ISI-GENERIC amd64

I ran into a panic under similar conditions with a slightly older kernel (12/05):

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex ix1:rx(0) (ix1:rx(0)) r = 0 (0xfffffe000d14e808) locked @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4353
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff849702d530
kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff849702d5e0
witness_warn() at witness_warn+0x4a3/frame 0xffffff849702d6a0
trap_pfault() at trap_pfault+0x5a/frame 0xffffff849702d750
trap() at trap+0x659/frame 0xffffff849702d960
calltrap() at calltrap+0x8/frame 0xffffff849702d960
--- trap 0xc, rip = 0xffffffff8185a82f, rsp = 0xffffff849702da20, rbp = 0xffffff849702dad0 ---
ixgbe_rxeof() at ixgbe_rxeof+0x20f/frame 0xffffff849702dad0
ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xffffff849702db20
intr_event_execute_handlers() at intr_event_execute_handlers+0x90/frame 0xffffff849702db60
ithread_loop() at ithread_loop+0x161/frame 0xffffff849702dbb0
fork_exit() at fork_exit+0x84/frame 0xffffff849702dbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xffffff849702dbf0
--- trap 0, rip = 0, rsp = 0xffffff849702dcb0, rbp = 0 ---

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x18
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff8185a82f
stack pointer           = 0x28:0xffffff849702da20
frame pointer           = 0x28:0xffffff849702dad0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0

db> x/s version
version: FreeBSD 10.0-CURRENT #3 r+5a05236: Wed Dec 12 17:35:14 PST 2012\012 root@wf158.west.isilon.com:/usr/obj/usr/src/sys/ISI-GENERIC\012
db> show alllocks
Process 1052 (syslogd) thread 0xfffffe000adfe000 (100180)
exclusive lockmgr bufwait (bufwait) r = 0 (0xffffff8454152ba0) locked @ /usr/src/sys/kern/vfs_bio.c:2633
exclusive lockmgr ufs (ufs) r = 0 (0xfffffe015d0e9668) locked @ /usr/src/sys/kern/vfs_syscalls.c:3438
Process 12 (intr) thread 0xfffffe000adcf900 (100208)
exclusive sleep mutex ix1:rx(0) (ix1:rx(0)) r = 0 (0xfffffe000d14e808) locked @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4353

I don't have much to go off of for the panic, but I figured I should just post these "in case" these are potentially known issues, or if they aren't known, potential items to watch for.

Thoughts?
-Garrett