From owner-freebsd-net@FreeBSD.ORG Thu Jul 14 15:44:59 2011 Return-Path: Delivered-To: net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 24DAF106566C; Thu, 14 Jul 2011 15:44:59 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.64.117]) by mx1.freebsd.org (Postfix) with ESMTP id B4C088FC1B; Thu, 14 Jul 2011 15:44:58 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.4/8.14.4) with ESMTP id p6EFivqa017624; Thu, 14 Jul 2011 19:44:57 +0400 (MSD) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.4/8.14.4/Submit) id p6EFivoe017623; Thu, 14 Jul 2011 19:44:57 +0400 (MSD) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Thu, 14 Jul 2011 19:44:57 +0400 From: Gleb Smirnoff To: bz@FreeBSD.org, rwatson@FreeBSD.org, gnn@FreeBSD.org Message-ID: <20110714154457.GI70776@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Cc: net@FreeBSD.org Subject: m_pkthdr.rcvif dangling pointer problem X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Jul 2011 15:44:59 -0000 Hi! This problem is definitely known and is as old as network stack is parallel. Those, who know the problem may skip to next paragraph. Short description is the following: every mbuf that is allocated in an interface receive procedure has a field where a pointer to the corresponding struct ifnet is stored. Everything is okay, until you are running a box, where configuration of interfaces is changing rapidly, for example a PPP access concentrator. Utilizing any facility that can delay packet processing for example netgraph(4) or dummynet(4) would make probability of crash much bigger. Running INVARIANTS kernel would crash a box with dummynet and disappearing interfaces in a few minutes. I see three approaches to this problem: 1) Every m_pkthdr.rcvif should if_ref() the interface. Releasing references can be done in the mbuf allocator: mb_dtor_mbuf(), mb_dtor_pack(). I'm afraid this approach would degrate performance, since adding at least a couple of atomic ops on every mbuf for its lifetime. That is +2 atomic ops per packet. 2) kib@ suggested to allocate ifnets from a UMA_ZONE_NOFREE zone. I've made a compilable & working patch: http://people.freebsd.org/~glebius/patches/ifnet.no_free But on second though I find this a bad idea, this is just fooling of INVARIANTS. Yes, we avoid thrashing of freed memory and rewriting it by some other kernel allocation. But still out pointer point to invalid ifnet. Even, if we make a check for IFF_DYING flag, we still can not guarantee that an interface had been re-allocated for a new instance. This would be not a panic condition, but subtle bugs in firewalls. 3) As we now have a straight if_index table that can grow, what about storing the if_index in the m_pkthdr? Lookup of interface by index is fast enough if done lockless. Doing it lockless isn't perfect, but better than current pointer dereferncing. Optionally it could be done with locking and with putting a reference. To avoid situation with with getting to a re-allocated interface with the same index, we can use a unique cookie, that is incremented in if_alloc(). Smth like: struct ifnet * mbuf_rcvif(struct mbuf *m) { struct ifnet *ifp; M_ASSERTPKTHDR(m); if (m->m_pkthdr.rcvif_idx == 0) return (NULL); ifp = ifnet_byindex_locked(m->m_pkthdr.rcvif_idx); if (ifp == NULL) return (NULL); if (ifp->if_unique != m->m_pkthdr.rcvif_unique) return (NULL); return (ifp); } struct ifnet * mbuf_rcvif_ref(struct mbuf *m) { struct ifnet *ifp; M_ASSERTPKTHDR(m); if (m->m_pkthdr.rcvif_idx == 0) return (NULL); ifp = ifnet_byindex_ref(m->m_pkthdr.rcvif_idx); if (ifp == NULL) return (NULL); if (ifp->if_unique != m->m_pkthdr.rcvif_unique) { if_rele(ifp); return (NULL); } return (ifp); } The size of struct mbuf isn't going to change on amd64: @@ -111,7 +111,8 @@ * Record/packet header in first mbuf of chain; valid only if M_PKTHDR is set. */ struct pkthdr { - struct ifnet *rcvif; /* rcv interface */ + uint32_t rcvif_idx; /* rcv interface index */ + uint32_t rcvif_unique; /* rcv interface unique id */ A proof-of-concept patch is available here: http://people.freebsd.org/~glebius/patches/pkthdr.rcvif_idx It doesn't cover entire kernel, LINT won't compile. It covers kern, netinet, netinet6, several interface drivers and netgraph. One of the problems is ongoing namespace pollution: not all code that include mbuf.h includes if_var.h and vice versa. I've cut ifnet from m_devget(), but my new function do declare it in parameter list :(. To deal with that I had to declare functions in mbuf.h but implement them in if.c. Other problem is net80211 code that abuses the rcvif pointer in mbuf packet header for its own purposes casting it to private type. This can be fixed utilizing mbuf_tags(9), I think. I haven't made this yet, that's why LINT doesn't compile. Comments are requested! Any alternative ideas on solving the problem are welcome! -- Totus tuus, Glebius.