From owner-freebsd-bugs@FreeBSD.ORG Mon Mar 28 08:50:12 2011
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 7793E106567C
    for ; Mon, 28 Mar 2011 08:50:12 +0000 (UTC)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
    by mx1.freebsd.org (Postfix) with ESMTP id DA2188FC0C
    for ; Mon, 28 Mar 2011 08:50:10 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2S8oAUv098144
    for ; Mon, 28 Mar 2011 08:50:10 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2S8oAEt098143;
    Mon, 28 Mar 2011 08:50:10 GMT
    (envelope-from gnats)
Resent-Date: Mon, 28 Mar 2011 08:50:10 GMT
Resent-Message-Id: <201103280850.p2S8oAEt098143@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Clement LECIGNE
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 6C60D106564A
    for ; Mon, 28 Mar 2011 08:41:14 +0000 (UTC)
    (envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
    by mx1.freebsd.org (Postfix) with ESMTP id 5BE338FC17
    for ; Mon, 28 Mar 2011 08:41:14 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
    by red.freebsd.org (8.14.4/8.14.4) with ESMTP id p2S8fDjK002795
    for ; Mon, 28 Mar 2011 08:41:13 GMT
    (envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
    by red.freebsd.org (8.14.4/8.14.4/Submit) id p2S8fDwM002794;
    Mon, 28 Mar 2011 08:41:13 GMT
    (envelope-from nobody)
Message-Id: <201103280841.p2S8fDwM002794@red.freebsd.org>
Date: Mon, 28 Mar 2011 08:41:13 GMT
From: Clement LECIGNE
To:
    freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Cc:
Subject: kern/155988: RADIX_NODE_HEAD_LOCK_ASSERT in rtexpunge()
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 28 Mar 2011 08:50:12 -0000

>Number:         155988
>Category:       kern
>Synopsis:       RADIX_NODE_HEAD_LOCK_ASSERT in rtexpunge()
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Mar 28 08:50:10 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator:     Clement LECIGNE
>Release:        7.3-RELEASE-p3
>Organization:
NETASQ
>Environment:
FreeBSD clem1.netasq.com 7.3-RELEASE-p3 FreeBSD 7.3-RELEASE-p3 #3: Tue Nov 23 08:55:34 CET 2010 root@clem1.netasq.com:/usr/obj/usr/src/sys/GENERIC i386
>Description:
During network stress testing, we are seeing sporadic kernel crashes in the
rtexpunge() function. When there are no more entries in the ARP cache (no more
llinfo), arplookup() seems to call rtexpunge() without locking the radix node
head (rnh). The lock assertion at net/route.c:810 then fails and the kernel
panics with "mutex %s not owned at %s:%d".

Below is the backtrace of the kernel crash.

(kgdb) bt
#0  doadump () at pcpu.h:245
#1  0x40481989 in db_fncall (dummy1=1082175901, dummy2=0, dummy3=-1, dummy4=0x53fd36f0 "") at ../../../ddb/db_command.c:516
#2  0x40481f3f in db_command (last_cmdp=0x4089abd4, cmd_table=0x0, dopager=0) at ../../../ddb/db_command.c:413
#3  0x40481fb4 in db_command_script (command=0x4089bb25 "call doadump") at ../../../ddb/db_command.c:484
#4  0x40485820 in db_script_exec (scriptname=0x4080b1ca "kdb.enter.default", warnifnotfound=Variable "warnifnotfound" is not available.
) at ../../../ddb/db_script.c:302
#5  0x40485911 in db_script_kdbenter (eventname=0x4082e157 "panic") at ../../../ddb/db_script.c:325
#6  0x40483808 in db_trap (type=3, code=0) at ../../../ddb/db_main.c:227
#7  0x406388a4 in kdb_trap (type=3, code=0, tf=0x53fd3928) at ../../../kern/subr_kdb.c:524
#8  0x407cc87f in trap (frame=0x53fd3928) at ../../../i386/i386/trap.c:713
#9  0x407b7dfb in calltrap () at ../../../i386/i386/exception.s:166
#10 0x40638a0a in kdb_enter_why (why=0x4082e157 "panic", msg=0x4082e157 "panic") at cpufunc.h:71
#11 0x4060dde1 in panic (fmt=0x4082c641 "mutex %s not owned at %s:%d") at ../../../kern/kern_shutdown.c:642
#12 0x405fe5b7 in _mtx_assert (m=0x6011c17c, what=4, file=0x4083bf94 "../../../net/route.c", line=810) at ../../../kern/kern_mutex.c:647
#13 0x406b196c in rtexpunge (rt=0x614728b8) at ../../../net/route.c:810
#14 0x406f7eef in arplookup (addr=Variable "addr" is not available.
) at ../../../netinet/if_ether.c:1198
#15 0x406f90f7 in arpintr (m=0x6032e300) at ../../../netinet/if_ether.c:846
#16 0x406af233 in netisr_dispatch (num=18, m=0x6032e300) at ../../../net/netisr.c:185
#17 0x406aa8d3 in ether_demux (ifp=0x5ff87c00, m=0x6032e300) at ../../../net/if_ethersubr.c:1130
#18 0x406ab4ff in ether_input (ifp=0x5ff87c00, m=0x6032e300) at ../../../net/if_ethersubr.c:972
#19 0x406aed82 in vlan_input (ifp=0x5fbd4c00, m=0x6032e300) at ../../../net/if_vlan.c:1067
#20 0x406ab32f in ether_input (ifp=0x5fbd4c00, m=0x6032e300) at ../../../net/if_ethersubr.c:788
#21 0x404cada0 in em_rxeof (rxr=0x5fdc4500, count=12, done=0x0) at ../../../dev/e1000/if_em.c:4315
#22 0x404cb138 in em_poll (ifp=0x5fbd4c00, queue=0x0, cmd=POLL_ONLY, count=32) at ../../../dev/e1000/if_em.c:1403
#23 0x40601d3c in handle_pollng (context=0x0, pending=6) at ../../../kern/kern_poll.c:229
#24 0x40643c5b in taskqueue_run (queue=0x60199e80) at ../../../kern/subr_taskqueue.c:282
#25 0x40643db8 in taskqueue_thread_loop (arg=0x408daa9c) at
    ../../../kern/subr_taskqueue.c:401
#26 0x405e83a8 in fork_exit (callout=0x40643d50 , arg=0x408daa9c, frame=0x53fd3d28) at ../../../kern/kern_fork.c:815
#27 0x407b7e70 in fork_trampoline () at ../../../i386/i386/exception.s:271

(kgdb) print *rt
$24 = {
  rt_nodes = {{
      rn_mklist = 0x0,
      rn_parent = 0x61472854,
      rn_bit = -1,
      rn_bmask = 0 '\0',
      rn_flags = 4 '\004',
      rn_u = {
        rn_leaf = {
          rn_Key = 0x6170a980 "\020\002",
          rn_Mask = 0x0,
          rn_Dupedkey = 0x0
        },
        rn_node = {
          rn_Off = 1634773376,
          rn_L = 0x0,
          rn_R = 0x0
        }
      }
    }, {
      rn_mklist = 0x0,
      rn_parent = 0x6147256c,
      rn_bit = 53,
      rn_bmask = 4 '\004',
      rn_flags = 4 '\004',
      rn_u = {
        rn_leaf = {
          rn_Key = 0x6
          , rn_Mask = 0x61472094 "",
          rn_Dupedkey = 0x61472110
        },
        rn_node = {
          rn_Off = 6,
          rn_L = 0x61472094,
          rn_R = 0x61472110
        }
      }
    }},
  rt_gateway = 0x6170a990,
  rt_flags = 131077,
  rt_ifp = 0x5ff87c00,
  rt_ifa = 0x6144d700,
  rt_rmx = {
    rmx_mtu = 1500,
    rmx_expire = 0,
    rmx_pksent = 28
  },
  rt_refcnt = 1,
  rt_genmask = 0x0,
  rt_llinfo = 0x0,
  rt_gwroute = 0x0,
  rt_parent = 0x61472e0c,
  rt_fibnum = 0,
  rt_mtx = {
    lock_object = {
      lo_name = 0x4083344c "rtentry",
      lo_type = 0x4083344c "rtentry",
      lo_flags = 21168128,
      lo_witness_data = {
        lod_list = {
          stqe_next = 0x408ebff0
        },
        lod_witness = 0x408ebff0
      }
    },
    mtx_lock = 1612345344,
    mtx_recurse = 0
  }
}
>How-To-Repeat:
Limit the number of ARP entries to a low value (50) and generate an "ARP
flood", for example with nmap:

    nmap -TInsane --min-parallelism=1000 -sP -PS 10.0.0.0/8

(This works best on a "big" network.)
>Fix:
As a workaround, I have created an rtexpunge1() function which locks the head
of the radix node tree before calling rtexpunge().

>Release-Note:
>Audit-Trail:
>Unformatted:
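
The workaround described under >Fix: can be illustrated with a small userland model. This is only a sketch and not the submitter's patch or kernel code: struct toy_head, toy_expunge() and toy_expunge1() are hypothetical stand-ins for the radix node head, rtexpunge() and the rtexpunge1() wrapper, and a pthread mutex with an "owned" flag is assumed in place of the kernel mutex and its ownership assertion. Where the kernel would panic with "mutex %s not owned", the model returns EPERM.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the radix node head (not the kernel structure). */
struct toy_head {
    pthread_mutex_t lock;    /* models the head mutex */
    bool            owned;   /* true while the lock is held (single-threaded model) */
    int             nroutes; /* number of entries in the toy table */
};

static void toy_head_init(struct toy_head *h, int nroutes)
{
    pthread_mutex_init(&h->lock, NULL);
    h->owned = false;
    h->nroutes = nroutes;
}

static void toy_head_lock(struct toy_head *h)
{
    pthread_mutex_lock(&h->lock);
    h->owned = true;
}

static void toy_head_unlock(struct toy_head *h)
{
    assert(h->owned);        /* unlocking an unheld lock is a caller bug */
    h->owned = false;
    pthread_mutex_unlock(&h->lock);
}

/*
 * Models rtexpunge(): the caller must already hold the head lock.
 * The kernel asserts this and panics; the model reports EPERM instead.
 */
static int toy_expunge(struct toy_head *h)
{
    if (!h->owned)
        return EPERM;
    if (h->nroutes == 0)
        return ESRCH;        /* nothing left to remove */
    h->nroutes--;
    return 0;
}

/*
 * Models the rtexpunge1() workaround: take the head lock, call the
 * function that requires it, then release the lock again.
 */
static int toy_expunge1(struct toy_head *h)
{
    int error;

    toy_head_lock(h);
    error = toy_expunge(h);
    toy_head_unlock(h);
    return error;
}
```

In this model, calling toy_expunge() directly on an unlocked head fails in the same place the kernel assertion fires, while toy_expunge1() succeeds and leaves the head unlocked afterwards. A wrapper of this shape is only safe if the caller does not already hold the head lock, which is an assumption here.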