Date: Wed, 6 Jun 2007 15:12:21 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Kris Kennaway <kris@obsecurity.org>, Mehul Vora <mehul_freebsd@yahoo.com>
Cc: freebsd-net@FreeBSD.org
Subject: Re: panic: mtx_lock() of destroyed mutex @ ../../../net/route.c:1306
Message-ID: <20070606111221.GI89017@FreeBSD.org>
In-Reply-To: <93263.30264.qm@web63604.mail.re1.yahoo.com> <20070502182454.GA41598@xor.obsecurity.org>
References: <93263.30264.qm@web63604.mail.re1.yahoo.com> <20070502182454.GA41598@xor.obsecurity.org>
--Dzs2zDY0zgkG72+7
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline

  Kris, Mehul,

I think this patch (still a WIP) can cure your problems, even though
you observe different failures in rt_check(). I'd appreciate review
and testing.

On Wed, May 02, 2007 at 02:24:54PM -0400, Kris Kennaway wrote:
K> One of my 7.0 systems has a flaky gateway, and when it goes down the
K> node often goes down with this panic:
K>
K> panic: mtx_lock() of destroyed mutex @ ../../../net/route.c:1306
K> cpuid = 0
K> KDB: enter: panic
K> [thread pid 28619 tid 100074 ]
K> Stopped at      kdb_enter+0x68: ta              %xcc, 1
K> db> wh
K> Tracing pid 28619 tid 100074 td 0xfffff800140e87e0
K> panic() at panic+0x248
K> _mtx_lock_flags() at _mtx_lock_flags+0x8c
K> rt_check() at rt_check+0x128
K> arpresolve() at arpresolve+0x98
K> ether_output() at ether_output+0x94
K> ip_output() at ip_output+0xc64
K> udp_output() at udp_output+0x680
K> udp_send() at udp_send+0x38
K> sosend_dgram() at sosend_dgram+0x3e0
K> sosend() at sosend+0x74
K> kern_sendit() at kern_sendit+0x14c
K> sendit() at sendit+0x1d4
K> sendto() at sendto+0x48
K> syscall() at syscall+0x2f8
K> -- syscall (133, FreeBSD ELF64, sendto) %o7=0x40aa68ac --
K>
K> I suspect locking is broken in an error case. net/route.c:1306 is in
K> the senderr() macro in rt_check():
K>
K>         /* XXX BSD/OS checks dst->sa_family != AF_NS */
K>         if (rt->rt_flags & RTF_GATEWAY) {
K>                 if (rt->rt_gwroute == NULL)
K>                         goto lookup;
K>                 rt = rt->rt_gwroute;
K> bewm -->        RT_LOCK(rt);            /* NB: gwroute */
K>                 if ((rt->rt_flags & RTF_UP) == 0) {
K>                         rtfree(rt);     /* unlock gwroute */
K>                         rt = rt0;
K>
K> Kris

On Mon, May 07, 2007 at 07:52:32AM -0700, Mehul Vora wrote:
M> Hi,
M>
M> The current implementation (version 6.2) of the rt_check() routine
M> defined in route.c is not completely MPSAFE. I found an issue when I
M> started routing with "directisr" enabled. For the first received
M> packet this function initializes rt_gateway of the passed rt_entry.
M> This is done by calling the "rtalloc1" routine. But "rt_check"
M> doesn't hold any lock while calling this function. So in case we
M> have multiple instances of "ip_input - netisr" running, more than
M> one thread can call this routine, which may lead to some corruption;
M> in my case it leads to a deadlock. The problem doesn't happen if a
M> single packet of the same kind is sent before the heavy traffic. But
M> if heavy traffic is sent right from the start, it happens
M> immediately. I have fixed this and it works well afterwards. A
M> workaround patch for this issue is attached herewith. Probably we
M> need to define a macro in route.h for the hardcoded values in the
M> patch. Can anyone confirm this?
M>
M> Thanks,
M> Mehul.
M>
M> Content-Description: 206142780-rt_check.patch.txt
M> 1260a1261
M> > try_again:
M> 1280a1282,1289
M> >
M> >       if(rt0->rt_flags & 0x80000000U){
M> >               /*This rt is under process...*/
M> >               RT_UNLOCK(rt);
M> >               RT_UNLOCK(rt0);
M> >               goto try_again;
M> >       }
M> >
M> 1281a1291
M> >               rt0->rt_flags |= 0x80000000U;
M> 1288a1299
M> >               rt0->rt_flags &= (~0x80000000U);

M> _______________________________________________
M> freebsd-net@freebsd.org mailing list
M> http://lists.freebsd.org/mailman/listinfo/freebsd-net
M> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE

--Dzs2zDY0zgkG72+7
Content-Type: text/x-diff; charset=koi8-r
Content-Disposition: attachment; filename="route.c.diff"

Index: route.c
===================================================================
RCS file: /home/ncvs/src/sys/net/route.c,v
retrieving revision 1.119
diff -u -p -r1.119 route.c
--- route.c	22 May 2007 16:17:31 -0000	1.119
+++ route.c	23 May 2007 11:48:14 -0000
@@ -392,6 +392,14 @@ rtredirect(struct sockaddr *dst,
 			 */
 			rt_setgate(rt, rt_key(rt), gateway);
 		}
+
+		KASSERT(rt->rt_gateway != NULL,
+		    ("RTF_GATEWAY and rt_gateway is NULL"));
+		/* Set up rt_gwroute. */
+		rt->rt_gwroute = rtalloc1(rt->rt_gateway, 1, 0UL);
+		KASSERT(rt != rt->rt_gwroute, ("Oops"));
+		if (rt->rt_gwroute != NULL)
+			RT_UNLOCK(rt->rt_gwroute);
 	} else
 		error = EHOSTUNREACH;
 done:
@@ -1295,32 +1303,10 @@ rt_check(struct rtentry **lrt, struct rt
 			return (EHOSTUNREACH);
 		rt0 = rt;
 	}
-	/* XXX BSD/OS checks dst->sa_family != AF_NS */
-	if (rt->rt_flags & RTF_GATEWAY) {
-		if (rt->rt_gwroute == NULL)
-			goto lookup;
-		rt = rt->rt_gwroute;
-		RT_LOCK(rt);		/* NB: gwroute */
-		if ((rt->rt_flags & RTF_UP) == 0) {
-			RTFREE_LOCKED(rt);	/* unlock gwroute */
-			rt = rt0;
-		lookup:
-			RT_UNLOCK(rt0);
-			rt = rtalloc1(rt->rt_gateway, 1, 0UL);
-			if (rt == rt0) {
-				rt0->rt_gwroute = NULL;
-				RT_REMREF(rt0);
-				RT_UNLOCK(rt0);
-				return (ENETUNREACH);
-			}
-			RT_LOCK(rt0);
-			rt0->rt_gwroute = rt;
-			if (rt == NULL) {
-				RT_UNLOCK(rt0);
-				return (EHOSTUNREACH);
-			}
-		}
-		RT_UNLOCK(rt0);
+	if (rt->rt_flags & RTF_GATEWAY && (rt->rt_gwroute == NULL ||
+	    (rt->rt_gwroute->rt_flags & RTF_UP) == 0)) {
+		RT_UNLOCK(rt);
+		return (EHOSTUNREACH);
 	}
 	/* XXX why are we inspecting rmx_expire? */
 	error = (rt->rt_flags & RTF_REJECT) &&

--Dzs2zDY0zgkG72+7--
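
For readers following the thread, the "under process" flag idea in Mehul's workaround can be sketched in userland roughly as below. This is only an illustration under stated assumptions, not kernel code and not either patch: `struct entry`, `ENTRY_BUSY`, `entry_claim()` and `entry_publish()` are hypothetical names, a pthread mutex stands in for RT_LOCK/RT_UNLOCK, and an int stands in for rt_gwroute. Unlike the workaround, which jumps back to try_again, the sketch returns false and leaves the retry policy to the caller.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical name for the bit that the workaround hardcodes as 0x80000000U. */
#define ENTRY_BUSY 0x80000000U

/* Hypothetical stand-in for a route entry; not the kernel's struct rtentry. */
struct entry {
	pthread_mutex_t lock;	/* plays the role of the per-route lock */
	unsigned int flags;
	int gateway;		/* stand-in for rt_gwroute; 0 means unresolved */
	int lookups;		/* number of completed lookups */
};

/*
 * Claim the right to resolve the gateway. Returns true for exactly one
 * caller while the entry is unresolved; everyone else gets false and
 * should retry or use the published gateway.
 */
bool
entry_claim(struct entry *e)
{
	bool claimed = false;

	pthread_mutex_lock(&e->lock);
	if (e->gateway == 0 && (e->flags & ENTRY_BUSY) == 0) {
		e->flags |= ENTRY_BUSY;	/* mark "under process" */
		claimed = true;
	}
	pthread_mutex_unlock(&e->lock);
	return (claimed);
}

/* Publish the result of the slow lookup and clear the busy mark. */
void
entry_publish(struct entry *e, int gw)
{
	pthread_mutex_lock(&e->lock);
	assert((e->flags & ENTRY_BUSY) != 0);	/* only the claimant publishes */
	e->gateway = gw;
	e->lookups++;
	e->flags &= ~ENTRY_BUSY;
	pthread_mutex_unlock(&e->lock);
}
```

The point of the sketch is that the slow lookup runs with the lock dropped, while the flag, which is only read and written under the lock, keeps a second thread from starting the same lookup. The route.c diff above takes a different approach and removes the lookup from rt_check() altogether.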
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20070606111221.GI89017>