Date: Wed, 6 Jun 2007 15:12:21 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Kris Kennaway <kris@obsecurity.org>, Mehul Vora <mehul_freebsd@yahoo.com>
Cc: freebsd-net@FreeBSD.org
Subject: Re: panic: mtx_lock() of destroyed mutex @ ../../../net/route.c:1306
Message-ID: <20070606111221.GI89017@FreeBSD.org>
In-Reply-To: <93263.30264.qm@web63604.mail.re1.yahoo.com> <20070502182454.GA41598@xor.obsecurity.org>
References: <93263.30264.qm@web63604.mail.re1.yahoo.com> <20070502182454.GA41598@xor.obsecurity.org>

--Dzs2zDY0zgkG72+7
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
Kris, Mehul,
I think the attached patch (still a work in progress) can cure both of
your problems, although each of you is seeing a different failure in
rt_check().
I'd appreciate review and testing.
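
For reviewers, the idea behind the attached diff can be modeled in
userland roughly as follows. This is only a sketch under stated
assumptions: struct rtentry_model, model_set_gwroute() and
model_rt_check() are made-up names, the flag values are illustrative
rather than the kernel's, and all locking is left out. It shows
rt_gwroute being resolved once, at the point where the gateway is set
(the diff does this in rtredirect()), so that rt_check() only validates
it and never performs a lookup itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative stand-ins; these are NOT the kernel's flag values. */
#define RTF_UP      0x1
#define RTF_GATEWAY 0x2

/* Hypothetical, heavily reduced model of struct rtentry. */
struct rtentry_model {
	int flags;
	struct rtentry_model *gwroute;
};

/*
 * Resolve the gateway route once, when the gateway is set.
 * The assert plays the role of KASSERT(rt != rt->rt_gwroute) in the diff.
 */
static void
model_set_gwroute(struct rtentry_model *rt, struct rtentry_model *gw)
{
	assert(rt != gw);
	rt->gwroute = gw;
}

/*
 * Same test as the patched rt_check(): no lookup, only validation.
 * A gateway route with a missing or down gwroute is unreachable.
 */
static int
model_rt_check(const struct rtentry_model *rt)
{
	if ((rt->flags & RTF_GATEWAY) &&
	    (rt->gwroute == NULL || (rt->gwroute->flags & RTF_UP) == 0))
		return (EHOSTUNREACH);
	return (0);
}
```

With this shape the fast path never drops and retakes a lock around a
lookup, which is where the old code could end up touching a route that
had already been freed.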
On Wed, May 02, 2007 at 02:24:54PM -0400, Kris Kennaway wrote:
K> One of my 7.0 systems has a flaky gateway, and when it goes down the
K> node often goes down with this panic:
K>
K> panic: mtx_lock() of destroyed mutex @ ../../../net/route.c:1306
K> cpuid = 0
K> KDB: enter: panic
K> [thread pid 28619 tid 100074 ]
K> Stopped at kdb_enter+0x68: ta %xcc, 1
K> db> wh
K> Tracing pid 28619 tid 100074 td 0xfffff800140e87e0
K> panic() at panic+0x248
K> _mtx_lock_flags() at _mtx_lock_flags+0x8c
K> rt_check() at rt_check+0x128
K> arpresolve() at arpresolve+0x98
K> ether_output() at ether_output+0x94
K> ip_output() at ip_output+0xc64
K> udp_output() at udp_output+0x680
K> udp_send() at udp_send+0x38
K> sosend_dgram() at sosend_dgram+0x3e0
K> sosend() at sosend+0x74
K> kern_sendit() at kern_sendit+0x14c
K> sendit() at sendit+0x1d4
K> sendto() at sendto+0x48
K> syscall() at syscall+0x2f8
K> -- syscall (133, FreeBSD ELF64, sendto) %o7=0x40aa68ac --
K>
K> I suspect locking is broken in an error case. net/route.c:1306 is in
K> the senderr() macro in rt_check():
K>
K>         /* XXX BSD/OS checks dst->sa_family != AF_NS */
K>         if (rt->rt_flags & RTF_GATEWAY) {
K>                 if (rt->rt_gwroute == NULL)
K>                         goto lookup;
K>                 rt = rt->rt_gwroute;
K> bewm -->        RT_LOCK(rt);            /* NB: gwroute */
K>                 if ((rt->rt_flags & RTF_UP) == 0) {
K>                         rtfree(rt);     /* unlock gwroute */
K>                         rt = rt0;
K> Kris
On Mon, May 07, 2007 at 07:52:32AM -0700, Mehul Vora wrote:
M> Hi,
M>
M> The current implementation (version 6.2) of the rt_check() routine in
M> route.c is not completely MPSAFE. I found an issue when I started
M> routing with "directisr" enabled. For the first received packet, this
M> function initializes rt_gateway of the passed rt_entry by calling
M> rtalloc1(), but rt_check() doesn't hold any lock while calling it. So
M> if multiple instances of "ip_input - netisr" are running, more than
M> one thread can call this routine, which may lead to corruption; in my
M> case it leads to a deadlock.
M>
M> The problem doesn't happen if a single packet of the same kind is sent
M> before the heavy traffic. But if heavy traffic is sent right from the
M> start, it happens immediately.
M>
M> I have fixed this and it works well now. A workaround patch for this
M> issue is attached. We probably need to define a macro in route.h for
M> the hardcoded values in the patch. Can anyone confirm this?
M>
M> Thanks,
M> Mehul.
Content-Description: 206142780-rt_check.patch.txt
M> 1260a1261
M> > try_again:
M> 1280a1282,1289
M> >
M> > if(rt0->rt_flags & 0x80000000U){
M> > /*This rt is under process...*/
M> > RT_UNLOCK(rt);
M> > RT_UNLOCK(rt0);
M> > goto try_again;
M> > }
M> >
M> 1281a1291
M> > rt0->rt_flags |= 0x80000000U;
M> 1288a1299
M> > rt0->rt_flags &= (~0x80000000U);
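
Mehul's workaround marks the route as busy with a hardcoded bit while
the gateway lookup runs, and he notes that the bit should become a
macro in route.h. A minimal sketch of that idea follows, assuming a
hypothetical name RTF_INPROGRESS and a userland stand-in struct
(neither exists in the tree), with the route lock assumed to be held by
the caller:

```c
#include <stdbool.h>

/* Hypothetical name for the hardcoded 0x80000000U bit in the patch. */
#define RTF_INPROGRESS 0x80000000U

/* Hypothetical stand-in for the flags word of a route entry. */
struct rt_model {
	unsigned int flags;
};

/*
 * Try to claim the entry for gateway resolution.
 * Assumes the caller holds the route lock. Returns false if another
 * thread has already claimed it.
 */
static bool
rt_model_claim(struct rt_model *rt)
{
	if (rt->flags & RTF_INPROGRESS)
		return (false);
	rt->flags |= RTF_INPROGRESS;
	return (true);
}

/* Clear the busy bit once the lookup result has been stored. */
static void
rt_model_release(struct rt_model *rt)
{
	rt->flags &= ~RTF_INPROGRESS;
}
```

A caller that gets false from rt_model_claim() would drop its locks and
retry, which is what the "goto try_again" in the patch does. Note that
this is a busy-wait, so it trades the deadlock for spinning under
contention.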
--
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
--Dzs2zDY0zgkG72+7
Content-Type: text/x-diff; charset=koi8-r
Content-Disposition: attachment; filename="route.c.diff"
Index: route.c
===================================================================
RCS file: /home/ncvs/src/sys/net/route.c,v
retrieving revision 1.119
diff -u -p -r1.119 route.c
--- route.c 22 May 2007 16:17:31 -0000 1.119
+++ route.c 23 May 2007 11:48:14 -0000
@@ -392,6 +392,14 @@ rtredirect(struct sockaddr *dst,
*/
rt_setgate(rt, rt_key(rt), gateway);
}
+
+ KASSERT(rt->rt_gateway != NULL,
+ ("RTF_GATEWAY and rt_gateway is NULL"));
+ /* Set up rt_gwroute. */
+ rt->rt_gwroute = rtalloc1(rt->rt_gateway, 1, 0UL);
+ KASSERT(rt != rt->rt_gwroute, ("Oops"));
+ if (rt->rt_gwroute != NULL)
+ RT_UNLOCK(rt->rt_gwroute);
} else
error = EHOSTUNREACH;
done:
@@ -1295,32 +1303,10 @@ rt_check(struct rtentry **lrt, struct rt
return (EHOSTUNREACH);
rt0 = rt;
}
- /* XXX BSD/OS checks dst->sa_family != AF_NS */
- if (rt->rt_flags & RTF_GATEWAY) {
- if (rt->rt_gwroute == NULL)
- goto lookup;
- rt = rt->rt_gwroute;
- RT_LOCK(rt); /* NB: gwroute */
- if ((rt->rt_flags & RTF_UP) == 0) {
- RTFREE_LOCKED(rt); /* unlock gwroute */
- rt = rt0;
- lookup:
- RT_UNLOCK(rt0);
- rt = rtalloc1(rt->rt_gateway, 1, 0UL);
- if (rt == rt0) {
- rt0->rt_gwroute = NULL;
- RT_REMREF(rt0);
- RT_UNLOCK(rt0);
- return (ENETUNREACH);
- }
- RT_LOCK(rt0);
- rt0->rt_gwroute = rt;
- if (rt == NULL) {
- RT_UNLOCK(rt0);
- return (EHOSTUNREACH);
- }
- }
- RT_UNLOCK(rt0);
+ if (rt->rt_flags & RTF_GATEWAY && (rt->rt_gwroute == NULL ||
+ (rt->rt_gwroute->rt_flags & RTF_UP) == 0)) {
+ RT_UNLOCK(rt);
+ return (EHOSTUNREACH);
}
/* XXX why are we inspecting rmx_expire? */
error = (rt->rt_flags & RTF_REJECT) &&
--Dzs2zDY0zgkG72+7--
