Date: Mon, 3 Feb 2014 21:05:28 -0800 From: Vijay Singh <vijju.singh@gmail.com> To: Marko Zec <zec@fer.hr> Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org> Subject: Re: vnet deletion panic Message-ID: <CALCNsJTsCesABH=Zm5ud4-SktwZg1pNt3cwCC6uu=Z0znspBug@mail.gmail.com> In-Reply-To: <20140204055229.4a52ec15@x23.lan> References: <CALCNsJQSfqyXUuiGUPwmuXH3OCdmMRVSZtZSDQEBTb9csQAe4Q@mail.gmail.com> <20140204055229.4a52ec15@x23.lan>
next in thread | previous in thread | raw e-mail | index | archive | help
Hi Marco, the code in rt_ifmsg() checks what seems like global state, so its not routing sockets in the vnet being destroyed. rt_ifmsg(struct ifnet *ifp) { struct if_msghdr *ifm; struct mbuf *m; struct rt_addrinfo info; if (route_cb.any_count == 0) return; You are right, there is no ifp context in rt_dispatch(). So perhaps we should not call rt_ifmsg() from if_unroute() is (ifp == V_loif) since that would end up using the soon to be destroyed ifp in the mbuf. What do you think? On Mon, Feb 3, 2014 at 8:52 PM, Marko Zec <zec@fer.hr> wrote: > On Mon, 3 Feb 2014 19:33:21 -0800 > Vijay Singh <vijju.singh@gmail.com> wrote: > > > I'm running into a crash due on vnet deletion in the presence of > > routing sockets. The root cause seems to originate from(): > > > > if_detach_internal() -> if_down(ifp) -> if_unroute() -> rt_ifmsg() -> > > rt_dispatch() > > > > In rt_dispatch() we have: > > > > #ifdef VIMAGE > > if (V_loif) > > m->m_pkthdr.rcvif = V_loif; > > #endif > > netisr_queue(NETISR_ROUTE, m); > > > > Now since this would be processed async, and the ifp alove is the > > loopback of the vnet being deleted, we run into accessing a freed > > pointer (ifp) when netisr picks up the mbuf. So I am wondering how to > > fix this. I am thinking that we could do something like the following > > in rt_dispatch(): > > > > #ifdef VIMAGE > > if (V_loif) { > > if ((ifp == V_loif) && !IS_DEFAULT_VNET(curvnet)) { > > CURVNET_SET_QUIET(vnet0); > > m->m_pkthdr.rcvif = V_loif; > > CURVNET_RESTORE(); > > } else > > m->m_pkthdr.rcvif = V_loif; > > } > > #endif > > > > So basically switch to the default vnet for the mbuf with the routing > > socket message. Thoughts? > > By design, the vnet teardown procedure should not commence before the > last socket attached to that vnet is closed, so I'm suspicious whether > the proposed approach could actually appease the panics you're > observing. Furthermore, it would certainly cause bogus routing messages > to appear in vnet0 and possibly confuse routing socket consumers > running there. Plus, in rt_dispatch() there's no ifp context to check > against V_loif at all, as you're proposing your patch? > > Perhaps it could be possible to walk through all the netisr queues just > before V_loif gets destroyed, and prune all queued mbufs which have > m->m_pkthdr.rcvif pointing to V_loif? Since the vnet teardown procedure > cannot be initiated before all (routing) sockets attached to that vnet > have been closed, after all other ifnets except V_loif have also been > destroyed it should not be possible for new mbufs to be queued with > rcvif pointing back to V_loif, so at least conceptually that approach > might work correctly. > > Marko >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CALCNsJTsCesABH=Zm5ud4-SktwZg1pNt3cwCC6uu=Z0znspBug>