Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 14 Jul 2011 21:51:00 +0300
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Gleb Smirnoff <glebius@freebsd.org>
Cc:        gnn@freebsd.org, bz@freebsd.org, rwatson@freebsd.org, net@freebsd.org
Subject:   Re: m_pkthdr.rcvif dangling pointer problem
Message-ID:  <20110714185100.GC43872@deviant.kiev.zoral.com.ua>
In-Reply-To: <20110714154457.GI70776@FreeBSD.org>
References:  <20110714154457.GI70776@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--dmVtbNsdrXqYZXhl
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Jul 14, 2011 at 07:44:57PM +0400, Gleb Smirnoff wrote:
>   Hi!
>=20
>   This problem is definitely known and is as old as network stack
> is parallel. Those, who know the problem may skip to next paragraph.
> Short description is the following: every mbuf that is allocated in
> an interface receive procedure has a field where a pointer to the
> corresponding struct ifnet is stored. Everything is okay, until you
> are running a box, where configuration of interfaces is changing
> rapidly, for example a PPP access concentrator. Utilizing any facility
> that can delay packet processing for example netgraph(4) or dummynet(4)
> would make probability of crash much bigger. Running INVARIANTS kernel
> would crash a box with dummynet and disappearing interfaces in a few
> minutes.
>=20
>   I see three approaches to this problem:
>=20
>   1) Every m_pkthdr.rcvif should if_ref() the interface. Releasing
> references can be done in the mbuf allocator: mb_dtor_mbuf(),
> mb_dtor_pack(). I'm afraid this approach would degrate performance,
> since adding at least a couple of atomic ops on every mbuf for its
> lifetime. That is +2 atomic ops per packet.
>=20
>   2) kib@ suggested to allocate ifnets from a UMA_ZONE_NOFREE zone.
> I've made a compilable & working patch:
>=20
> http://people.freebsd.org/~glebius/patches/ifnet.no_free
>=20
> But on second though I find this a bad idea, this is just fooling
> of INVARIANTS. Yes, we avoid thrashing of freed memory and rewriting
> it by some other kernel allocation. But still out pointer point to
> invalid ifnet. Even, if we make a check for IFF_DYING flag, we still
> can not guarantee that an interface had been re-allocated for a new
> instance. This would be not a panic condition, but subtle bugs in
> firewalls.
This is not quite what I suggested (not looking at the patch).
I also noted that some measures to prevent the dereference of outdated
pointers should be taken. We discussed the generation counter for the
ifnets, and I pointed out that m_hdr has at least two MI scratch bytes.
Also, on amd64, there seems to be one spare word in pkthdr dur to
alignment.

>=20
>   3) As we now have a straight if_index table that can grow, what about
> storing the if_index in the m_pkthdr? Lookup of interface by index
> is fast enough if done lockless. Doing it lockless isn't perfect, but
> better than current pointer dereferncing. Optionally it could be
> done with locking and with putting a reference. To avoid situation
> with with getting to a re-allocated interface with the same index,
> we can use a unique cookie, that is incremented in if_alloc().
> Smth like:
>=20
> struct ifnet *=20
> mbuf_rcvif(struct mbuf *m)
> {
>         struct ifnet *ifp;
>   =20
>         M_ASSERTPKTHDR(m);
>=20
>         if (m->m_pkthdr.rcvif_idx =3D=3D 0)
>                 return (NULL);
>=20
>         ifp =3D ifnet_byindex_locked(m->m_pkthdr.rcvif_idx);
>         if (ifp =3D=3D NULL)
>                 return (NULL);
>         if (ifp->if_unique !=3D m->m_pkthdr.rcvif_unique)
>                 return (NULL);
>=20
>         return (ifp); =20
> }
>=20
> struct ifnet *
> mbuf_rcvif_ref(struct mbuf *m)
> {
>         struct ifnet *ifp;
>=20
>         M_ASSERTPKTHDR(m);
>=20
>         if (m->m_pkthdr.rcvif_idx =3D=3D 0)
>                 return (NULL);
>=20
>         ifp =3D ifnet_byindex_ref(m->m_pkthdr.rcvif_idx);
>         if (ifp =3D=3D NULL)
>                 return (NULL);
>         if (ifp->if_unique !=3D m->m_pkthdr.rcvif_unique) {
>                 if_rele(ifp);
>                 return (NULL);
>         }
>=20
>         return (ifp);
> }
>=20
> The size of struct mbuf isn't going to change on amd64:
>=20
> @@ -111,7 +111,8 @@
>   * Record/packet header in first mbuf of chain; valid only if M_PKTHDR i=
s set.
>   */
>  struct pkthdr {
> -       struct ifnet    *rcvif;         /* rcv interface */
> +       uint32_t         rcvif_idx;     /* rcv interface index */
> +       uint32_t         rcvif_unique;  /* rcv interface unique id */
>=20
> A proof-of-concept patch is available here:
>=20
> http://people.freebsd.org/~glebius/patches/pkthdr.rcvif_idx
>=20
> It doesn't cover entire kernel, LINT won't compile. It covers kern, netin=
et,
> netinet6, several interface drivers and netgraph.
>=20
> One of the problems is ongoing namespace pollution: not all code that inc=
lude
> mbuf.h includes if_var.h and vice versa. I've cut ifnet from m_devget(), =
but
> my new function do declare it in parameter list :(. To deal with that I h=
ad
> to declare functions in mbuf.h but implement them in if.c.
>=20
> Other problem is net80211 code that abuses the rcvif pointer in mbuf pack=
et
> header for its own purposes casting it to private type. This can be fixed
> utilizing mbuf_tags(9), I think. I haven't made this yet, that's why LINT
> doesn't compile.
>=20
>   Comments are requested!
>=20
>   Any alternative ideas on solving the problem are welcome!
>=20
> --=20
> Totus tuus, Glebius.
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

--dmVtbNsdrXqYZXhl
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (FreeBSD)

iEYEARECAAYFAk4fOpQACgkQC3+MBN1Mb4hp9ACgslS4Th8L+s8f8gloH/F2sxG0
Y8MAn0JDKZ6d64Sh0HuBHltdMBS2kcza
=a3yG
-----END PGP SIGNATURE-----

--dmVtbNsdrXqYZXhl--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110714185100.GC43872>