From owner-freebsd-net@FreeBSD.ORG Tue Feb 4 05:27:01 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3AEABE88 for ; Tue, 4 Feb 2014 05:27:01 +0000 (UTC) Received: from mail.fer.hr (mail.fer.hr [161.53.72.233]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 909C61366 for ; Tue, 4 Feb 2014 05:27:00 +0000 (UTC) Received: from x23.lan (141.136.230.70) by MAIL.fer.hr (161.53.72.233) with Microsoft SMTP Server (TLS) id 14.2.342.3; Tue, 4 Feb 2014 06:26:58 +0100 Date: Tue, 4 Feb 2014 06:27:31 +0100 From: Marko Zec To: Vijay Singh Subject: Re: vnet deletion panic Message-ID: <20140204062731.2e9d4bc0@x23.lan> In-Reply-To: References: <20140204055229.4a52ec15@x23.lan> Organization: FER X-Mailer: Claws Mail 3.9.2 (GTK+ 2.24.19; amd64-portbld-freebsd9.1) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="MP_/.h=lNEG+_Pdgcpx_45mRJX2" X-Originating-IP: [141.136.230.70] Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 05:27:01 -0000 --MP_/.h=lNEG+_Pdgcpx_45mRJX2 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Content-Disposition: inline On Mon, 3 Feb 2014 21:05:28 -0800 Vijay Singh wrote: > Hi Marco, the code in rt_ifmsg() checks what seems like global state, > so its not routing sockets in the vnet being destroyed. Huh then we should V_irtualize that global state - maybe the attached patch could help (only compile-tested, perhaps you could try it out)? Marko > rt_ifmsg(struct ifnet *ifp) > { > struct if_msghdr *ifm; > struct mbuf *m; > struct rt_addrinfo info; > > if (route_cb.any_count == 0) > return; > > You are right, there is no ifp context in rt_dispatch(). So perhaps we > should not call rt_ifmsg() from if_unroute() is (ifp == V_loif) since > that would end up using the soon to be destroyed ifp in the mbuf. > What do you think? > > > On Mon, Feb 3, 2014 at 8:52 PM, Marko Zec wrote: > > > On Mon, 3 Feb 2014 19:33:21 -0800 > > Vijay Singh wrote: > > > > > I'm running into a crash due on vnet deletion in the presence of > > > routing sockets. The root cause seems to originate from(): > > > > > > if_detach_internal() -> if_down(ifp) -> if_unroute() -> > > > rt_ifmsg() -> rt_dispatch() > > > > > > In rt_dispatch() we have: > > > > > > #ifdef VIMAGE > > > if (V_loif) > > > m->m_pkthdr.rcvif = V_loif; > > > #endif > > > netisr_queue(NETISR_ROUTE, m); > > > > > > Now since this would be processed async, and the ifp alove is the > > > loopback of the vnet being deleted, we run into accessing a freed > > > pointer (ifp) when netisr picks up the mbuf. So I am wondering > > > how to fix this. I am thinking that we could do something like > > > the following in rt_dispatch(): > > > > > > #ifdef VIMAGE > > > if (V_loif) { > > > if ((ifp == V_loif) && !IS_DEFAULT_VNET(curvnet)) { > > > CURVNET_SET_QUIET(vnet0); > > > m->m_pkthdr.rcvif = V_loif; > > > CURVNET_RESTORE(); > > > } else > > > m->m_pkthdr.rcvif = V_loif; > > > } > > > #endif > > > > > > So basically switch to the default vnet for the mbuf with the > > > routing socket message. Thoughts? > > > > By design, the vnet teardown procedure should not commence before > > the last socket attached to that vnet is closed, so I'm suspicious > > whether the proposed approach could actually appease the panics > > you're observing. Furthermore, it would certainly cause bogus > > routing messages to appear in vnet0 and possibly confuse routing > > socket consumers running there. Plus, in rt_dispatch() there's no > > ifp context to check against V_loif at all, as you're proposing > > your patch? > > > > Perhaps it could be possible to walk through all the netisr queues > > just before V_loif gets destroyed, and prune all queued mbufs which > > have m->m_pkthdr.rcvif pointing to V_loif? Since the vnet teardown > > procedure cannot be initiated before all (routing) sockets attached > > to that vnet have been closed, after all other ifnets except V_loif > > have also been destroyed it should not be possible for new mbufs to > > be queued with rcvif pointing back to V_loif, so at least > > conceptually that approach might work correctly. > > > > Marko > > --MP_/.h=lNEG+_Pdgcpx_45mRJX2 Content-Type: text/x-patch Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="v_route_cb.diff" Index: rtsock.c =================================================================== --- rtsock.c (revision 258393) +++ rtsock.c (working copy) @@ -156,12 +156,14 @@ #define RTS_FILTER_FIB M_PROTO8 #define RTS_ALLFIBS -1 -static struct { +typedef struct { int ip_count; /* attached w/ AF_INET */ int ip6_count; /* attached w/ AF_INET6 */ int ipx_count; /* attached w/ AF_IPX */ int any_count; /* total attached */ -} route_cb; +} route_cb_t; +static VNET_DEFINE(route_cb_t, route_cb); +#define V_route_cb VNET(route_cb) struct mtx rtsock_mtx; MTX_SYSINIT(rtsock, &rtsock_mtx, "rtsock route_cb lock", MTX_DEF); @@ -328,16 +330,16 @@ RTSOCK_LOCK(); switch(rp->rcb_proto.sp_protocol) { case AF_INET: - route_cb.ip_count++; + V_route_cb.ip_count++; break; case AF_INET6: - route_cb.ip6_count++; + V_route_cb.ip6_count++; break; case AF_IPX: - route_cb.ipx_count++; + V_route_cb.ipx_count++; break; } - route_cb.any_count++; + V_route_cb.any_count++; RTSOCK_UNLOCK(); soisconnected(so); so->so_options |= SO_USELOOPBACK; @@ -372,16 +374,16 @@ RTSOCK_LOCK(); switch(rp->rcb_proto.sp_protocol) { case AF_INET: - route_cb.ip_count--; + V_route_cb.ip_count--; break; case AF_INET6: - route_cb.ip6_count--; + V_route_cb.ip6_count--; break; case AF_IPX: - route_cb.ipx_count--; + V_route_cb.ipx_count--; break; } - route_cb.any_count--; + V_route_cb.any_count--; RTSOCK_UNLOCK(); raw_usrreqs.pru_detach(so); } @@ -928,7 +930,7 @@ * Check to see if we don't want our own messages. */ if ((so->so_options & SO_USELOOPBACK) == 0) { - if (route_cb.any_count <= 1) { + if (V_route_cb.any_count <= 1) { if (rtm) Free(rtm); m_freem(m); @@ -1217,7 +1219,7 @@ struct mbuf *m; struct sockaddr *sa = rtinfo->rti_info[RTAX_DST]; - if (route_cb.any_count == 0) + if (V_route_cb.any_count == 0) return; m = rt_msg1(type, rtinfo); if (m == NULL) @@ -1255,7 +1257,7 @@ struct mbuf *m; struct rt_addrinfo info; - if (route_cb.any_count == 0) + if (V_route_cb.any_count == 0) return; bzero((caddr_t)&info, sizeof(info)); m = rt_msg1(RTM_IFINFO, &info); @@ -1299,7 +1301,7 @@ sctp_addr_change(ifa, cmd); #endif /* SCTP */ #endif - if (route_cb.any_count == 0) + if (V_route_cb.any_count == 0) return; for (pass = 1; pass < 3; pass++) { bzero((caddr_t)&info, sizeof(info)); @@ -1368,7 +1370,7 @@ struct ifnet *ifp = ifma->ifma_ifp; struct ifma_msghdr *ifmam; - if (route_cb.any_count == 0) + if (V_route_cb.any_count == 0) return; bzero((caddr_t)&info, sizeof(info)); @@ -1397,7 +1399,7 @@ struct if_announcemsghdr *ifan; struct mbuf *m; - if (route_cb.any_count == 0) + if (V_route_cb.any_count == 0) return NULL; bzero((caddr_t)info, sizeof(*info)); m = rt_msg1(type, info); --MP_/.h=lNEG+_Pdgcpx_45mRJX2--