From owner-freebsd-bugs@FreeBSD.ORG Wed Feb 15 18:50:08 2006
Return-Path:
X-Original-To: freebsd-bugs@hub.freebsd.org
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6680F16A420
	for ; Wed, 15 Feb 2006 18:50:08 +0000 (GMT)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 2E64F43D48
	for ; Wed, 15 Feb 2006 18:50:08 +0000 (GMT)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id k1FIo7lf056369
	for ; Wed, 15 Feb 2006 18:50:07 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.13.4/8.13.4/Submit) id k1FIo7aY056362;
	Wed, 15 Feb 2006 18:50:07 GMT
	(envelope-from gnats)
Date: Wed, 15 Feb 2006 18:50:07 GMT
Message-Id: <200602151850.k1FIo7aY056362@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Robert Huff
Cc:
Subject: Re: kern/86427: LOR / Deadlock with FASTIPSEC and nat
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Robert Huff
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 15 Feb 2006 18:50:08 -0000

The following reply was made to PR kern/86427; it has been noted by GNATS.

From: Robert Huff
To: bug-followup@FreeBSD.org, mike@sentex.net
Cc:
Subject: Re: kern/86427: LOR / Deadlock with FASTIPSEC and nat
Date: Wed, 15 Feb 2006 13:41:32 -0500

(I don't know whether this will be useful, but in case ....)

Running:

huff@jerusalem>> uname -v
FreeBSD 7.0-CURRENT #0: Fri Jan 13 13:21:14 EST 2006

at the last re-boot I got this:

Feb 14 07:42:47 jerusalem kernel: lock order reversal:
Feb 14 07:42:47 jerusalem kernel: 1st 0xc3628090 inp (divinp) @ /usr/src/sys/netinet/ip_divert.c:328
Feb 14 07:42:47 jerusalem kernel: 2nd 0xc0764fe8 in_multi_mtx (in_multi_mtx) @ /usr/src/sys/netinet/ip_output.c:291
Feb 14 07:42:47 jerusalem kernel: KDB: stack backtrace:
Feb 14 07:42:47 jerusalem kernel: kdb_backtrace(c06b7d5c,c0764fe8,c06b772c,c06b772c,c06c0c61) at kdb_backtrace+0x2f
Feb 14 07:42:47 jerusalem kernel: witness_checkorder(c0764fe8,9,c06c0c61,123,c06beeb1) at witness_checkorder+0x6e4
Feb 14 07:42:47 jerusalem kernel: _mtx_lock_flags(c0764fe8,0,c06c0c61,123,c0542b9f) at _mtx_lock_flags+0x8b
Feb 14 07:42:47 jerusalem kernel: ip_output(c394d700,0,d56daafc,22,0) at ip_output+0x460
Feb 14 07:42:47 jerusalem kernel: div_output(c35fd3e4,c394d700,c3522760,0,d56dabb8) at div_output+0x1d5
Feb 14 07:42:47 jerusalem kernel: div_send(c35fd3e4,0,c394d700,c3522760,0) at div_send+0x5d
Feb 14 07:42:47 jerusalem kernel: sosend(c35fd3e4,c3522760,d56dabe4,c394d700,0) at sosend+0x49e
Feb 14 07:42:47 jerusalem kernel: kern_sendit(c339b300,3,d56dac64,0,0) at kern_sendit+0x106
Feb 14 07:42:47 jerusalem kernel: sendit(c339b300,3,d56dac64,0,bfbeedb0) at sendit+0x1a8
Feb 14 07:42:47 jerusalem kernel: sendto(c339b300,d56dad04,18,43c,6) at sendto+0x5b
Feb 14 07:42:47 jerusalem kernel: syscall(3b,3b,3b,bfbeed90,2) at syscall+0x2a6
Feb 14 07:42:47 jerusalem kernel: Xint0x80_syscall() at Xint0x80_syscall+0x1f
Feb 14 07:42:47 jerusalem kernel: --- syscall (133, FreeBSD ELF32, sendto), eip = 0x4814230b, esp = 0xbfbeecfc, ebp = 0xbfbfeda8 ---

which was expected, and then this, which was new:

Feb 14 07:43:22 jerusalem kernel: lock order reversal:
Feb 14 07:43:22 jerusalem kernel: 1st 0xc3629480 inp (rawinp) @ /usr/src/sys/netinet/raw_ip.c:202
Feb 14 07:43:22 jerusalem kernel: 2nd 0xc36293d8 inp (raw6inp) @ /usr/src/sys/netinet/raw_ip.c:202
Feb 14 07:43:22 jerusalem kernel: KDB: stack backtrace:
Feb 14 07:43:22 jerusalem kernel: kdb_backtrace(c06b7d5c,c36293d8,c06c4e3d,c06c4ca8,c06c0cdb) at kdb_backtrace+0x2f
Feb 14 07:43:22 jerusalem kernel: witness_checkorder(c36293d8,9,c06c0cdb,ca,246) at witness_checkorder+0x6e4
Feb 14 07:43:22 jerusalem kernel: _mtx_lock_flags(c36293d8,0,c06c0cdb,ca,1) at _mtx_lock_flags+0x8b
Feb 14 07:43:22 jerusalem kernel: rip_input(c394d600,14,0,d4477be8,c0542b9f) at rip_input+0x7b
Feb 14 07:43:22 jerusalem kernel: icmp_input(c394d600,14,c3429000,1,0) at icmp_input+0x511
Feb 14 07:43:22 jerusalem kernel: ip_input(c394d600,0,c06bec13,e9,c0764bd8) at ip_input+0x656
Feb 14 07:43:22 jerusalem kernel: netisr_processqueue(c0764bd8,0,c06bec13,153,c32b41c0) at netisr_processqueue+0x8a
Feb 14 07:43:22 jerusalem kernel: swi_net(0,d4477cdc,c050e8f0,c0719970,1) at swi_net+0xa4
Feb 14 07:43:22 jerusalem kernel: ithread_execute_handlers(c32acac8,c32a0500,c06b14b0,2f9,c32ad780) at ithread_execute_handlers+0x10d
Feb 14 07:43:22 jerusalem kernel: ithread_loop(c327e6c0,d4477d38,c06b12e3,30e,c327e6c0) at ithread_loop+0x77
Feb 14 07:43:22 jerusalem kernel: fork_exit(c0502d1c,c327e6c0,d4477d38) at fork_exit+0xc5
Feb 14 07:43:22 jerusalem kernel: fork_trampoline() at fork_trampoline+0x8
Feb 14 07:43:22 jerusalem kernel: --- trap 0x1, eip = 0, esp = 0xd4477d6c, ebp = 0 ---

The machine in question is still functional:

huff@jerusalem>> uptime
1:39PM up 1 day, 5:57, 6 users, load averages: 1.39, 2.44, 2.99

and I believe (but have no hard data) Robert's patch has reduced the impact (time between failures was smallnum days, now smallnum weeks).

Robert Huff