From: Vijay Singh
To: Marko Zec
Cc: "freebsd-net@freebsd.org"
Date: Mon, 3 Feb 2014 21:05:28 -0800
Subject: Re: vnet deletion panic
In-Reply-To: <20140204055229.4a52ec15@x23.lan>
References: <20140204055229.4a52ec15@x23.lan>
List-Id: Networking and TCP/IP with FreeBSD

Hi Marko,

the code in rt_ifmsg() checks what seems like global state, so the check
is not specific to routing sockets in the vnet being destroyed.

rt_ifmsg(struct ifnet *ifp)
{
        struct if_msghdr *ifm;
        struct mbuf *m;
        struct rt_addrinfo info;

        if (route_cb.any_count == 0)
                return;

You are right, there is no ifp context in rt_dispatch(). So perhaps we
should not call rt_ifmsg() from if_unroute() if (ifp == V_loif), since
that would end up using the soon-to-be-destroyed ifp in the mbuf. What do
you think?

On Mon, Feb 3, 2014 at 8:52 PM, Marko Zec wrote:

> On Mon, 3 Feb 2014 19:33:21 -0800
> Vijay Singh wrote:
>
> > I'm running into a crash on vnet deletion in the presence of
> > routing sockets. The root cause seems to originate from:
> >
> > if_detach_internal() -> if_down(ifp) -> if_unroute() -> rt_ifmsg() ->
> > rt_dispatch()
> >
> > In rt_dispatch() we have:
> >
> > #ifdef VIMAGE
> >         if (V_loif)
> >                 m->m_pkthdr.rcvif = V_loif;
> > #endif
> >         netisr_queue(NETISR_ROUTE, m);
> >
> > Now since this would be processed async, and the ifp above is the
> > loopback of the vnet being deleted, we run into accessing a freed
> > pointer (ifp) when netisr picks up the mbuf. So I am wondering how to
> > fix this. I am thinking that we could do something like the following
> > in rt_dispatch():
> >
> > #ifdef VIMAGE
> >         if (V_loif) {
> >                 if ((ifp == V_loif) && !IS_DEFAULT_VNET(curvnet)) {
> >                         CURVNET_SET_QUIET(vnet0);
> >                         m->m_pkthdr.rcvif = V_loif;
> >                         CURVNET_RESTORE();
> >                 } else
> >                         m->m_pkthdr.rcvif = V_loif;
> >         }
> > #endif
> >
> > So basically switch to the default vnet for the mbuf with the routing
> > socket message. Thoughts?
>
> By design, the vnet teardown procedure should not commence before the
> last socket attached to that vnet is closed, so I'm suspicious whether
> the proposed approach could actually appease the panics you're
> observing.
> Furthermore, it would certainly cause bogus routing messages
> to appear in vnet0 and possibly confuse routing socket consumers
> running there. Plus, in rt_dispatch() there's no ifp context to check
> against V_loif at all, as you're proposing in your patch?
>
> Perhaps it could be possible to walk through all the netisr queues just
> before V_loif gets destroyed, and prune all queued mbufs which have
> m->m_pkthdr.rcvif pointing to V_loif? Since the vnet teardown procedure
> cannot be initiated before all (routing) sockets attached to that vnet
> have been closed, after all other ifnets except V_loif have also been
> destroyed it should not be possible for new mbufs to be queued with
> rcvif pointing back to V_loif, so at least conceptually that approach
> might work correctly.
>
> Marko
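
For illustration only, the pruning idea quoted above (drop every queued
message whose rcvif points at the interface about to be destroyed) can be
sketched in userland C. This is not kernel code and does not use the
netisr API. struct ifnet_sim, struct msg_sim, struct queue_sim,
queue_push() and queue_prune_rcvif() are all hypothetical names invented
for this sketch, standing in for struct ifnet, struct mbuf and a netisr
queue. Locking, which a real implementation would likely need, is left
out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct ifnet. */
struct ifnet_sim {
        int if_index;
};

/* Hypothetical stand-in for an mbuf; rcvif mirrors m->m_pkthdr.rcvif. */
struct msg_sim {
        struct msg_sim *next;
        struct ifnet_sim *rcvif;
};

/* Hypothetical stand-in for one netisr queue (singly linked FIFO). */
struct queue_sim {
        struct msg_sim *head;
        struct msg_sim *tail;
        size_t len;
};

/* Append a message tagged with rcvif; returns 0 on success, -1 on ENOMEM. */
static int
queue_push(struct queue_sim *q, struct ifnet_sim *rcvif)
{
        struct msg_sim *m = calloc(1, sizeof(*m));

        if (m == NULL)
                return (-1);
        m->rcvif = rcvif;
        if (q->tail != NULL)
                q->tail->next = m;
        else
                q->head = m;
        q->tail = m;
        q->len++;
        return (0);
}

/*
 * Unlink and free every queued message whose rcvif equals `dying`,
 * preserving the order of the rest. Returns the number of messages pruned.
 */
static size_t
queue_prune_rcvif(struct queue_sim *q, const struct ifnet_sim *dying)
{
        struct msg_sim **link = &q->head;
        struct msg_sim *last = NULL;
        size_t pruned = 0;

        while (*link != NULL) {
                struct msg_sim *m = *link;

                if (m->rcvif == dying) {
                        *link = m->next;        /* unlink before freeing */
                        free(m);
                        pruned++;
                } else {
                        last = m;
                        link = &m->next;
                }
        }
        q->tail = last;         /* NULL if the queue is now empty */
        q->len -= pruned;
        return (pruned);
}
```

In this sketch the caller would run queue_prune_rcvif() on each queue
just before freeing the loopback ifnet, so that nothing left in the queue
still refers to it. Whether the same can be done safely against the real
netisr queues is the open question in the thread.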