Date: Mon, 25 Jul 2011 12:59:54 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: "Robert N. M. Watson" <rwatson@FreeBSD.org>
Cc: gnn@FreeBSD.org, bz@FreeBSD.org, Ryan Stone <rysto32@gmail.com>, net@FreeBSD.org
Subject: Re: m_pkthdr.rcvif dangling pointer problem
Message-ID: <20110725085954.GR63969@glebius.int.ru>
In-Reply-To: <E05FE767-1923-4D47-9759-FA040E403618@freebsd.org>
References: <20110714154457.GI70776@FreeBSD.org> <CAFMmRNwBbxR-F7PjkwF8E4GjwFQy_0USKW-3u-ZRNxPMJSOQcA@mail.gmail.com> <E05FE767-1923-4D47-9759-FA040E403618@freebsd.org>
On Sun, Jul 24, 2011 at 09:43:59AM +0100, Robert N. M. Watson wrote:
R> Instead, I think we should go for a more radical notion, which is a
R> bit harder to implement in our stack: the network stack needs a
R> race-free way to "drain" all mbufs referring to a particular ifnet,
R> which does not cause existing processing to become more expensive.
R> This is easy in some subsystems, but more complex in others -- and
R> the composition of subsystems makes it all much harder since we need
R> to know that (to be 100% correct) packets aren't getting passed
R> between subsystems (and hence belong to neither) in a way that races
R> with a sweep through the subsystems. It may be possible to get this
R> 99.9% right simply by providing a series of callbacks into
R> subsystems that cause queues to be walked and drained of packets
R> matching the doomed ifnet. It may also be quite cheap to have
R> subsystems that "hold" packets outside of explicit queues for some
R> period (i.e., in a thread-local pointer out of the stack) add
R> explicit invalidation tests (i.e., for IFF_DYING) before handing off
R> to prevent those packets from traversing into other subsystems --
R> which can be done synchronisation-free, but still wouldn't 100%
R> prevent the race
R>
R> Just to give an example: netisr should offer a method for
R> netisr_drain_ifnet(struct ifnet *) that causes netisr to walk all of
R> its queues to find matching packets and free them. Due to direct
R> dispatch and thread-local queues during processing, netisr should
R> also check IFF_DYING before handing off.
R>
R> If we do that, I wonder how robust the system then becomes...? This
R> may not be too hard to test. But I'd rather we penalise ifnet removal
R> than, say, the IP input path when it needs to check a source
R> interface property.

What if some thread (e.g. netisr) has taken an mbuf off the queue,
stored a pointer to it on its stack, and got rescheduled? Then the ifnet
departs, all callbacks are called, and all queues are cleaned of
pointers to it. Then the thread with that mbuf on its stack continues
processing. What happens then?

-- 
Totus tuus, Glebius.
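
The scenario discussed in this message can be sketched in userspace C. This is only an illustration under loud assumptions: every `sk_` name, the `SK_IFF_DYING` constant, and the structures are hypothetical stand-ins, not the kernel's `struct ifnet`, `struct mbuf`, or any real netisr interface. It shows a queue sweep of the kind proposed for netisr_drain_ifnet() missing a packet that a worker has already dequeued, and an IFF_DYING-style test catching that packet before handoff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SK_IFF_DYING 0x1	/* hypothetical stand-in for IFF_DYING */

struct sk_ifnet {		/* hypothetical, not struct ifnet */
	int if_flags;
};

struct sk_mbuf {		/* hypothetical, not struct mbuf */
	struct sk_mbuf *next;
	struct sk_ifnet *rcvif;
};

struct sk_queue {
	struct sk_mbuf *head;
};

/* Append a newly allocated packet for ifp to the tail of the queue. */
static void
sk_enqueue(struct sk_queue *q, struct sk_ifnet *ifp)
{
	struct sk_mbuf *m = calloc(1, sizeof(*m));
	struct sk_mbuf **pp = &q->head;

	assert(m != NULL);
	m->rcvif = ifp;
	while (*pp != NULL)
		pp = &(*pp)->next;
	*pp = m;
}

/* Remove and return the packet at the head of the queue, or NULL. */
static struct sk_mbuf *
sk_dequeue(struct sk_queue *q)
{
	struct sk_mbuf *m = q->head;

	if (m != NULL) {
		q->head = m->next;
		m->next = NULL;
	}
	return (m);
}

/* Free every queued packet received on ifp; return how many were freed. */
static int
sk_drain_ifnet(struct sk_queue *q, struct sk_ifnet *ifp)
{
	struct sk_mbuf **pp = &q->head;
	int freed = 0;

	while (*pp != NULL) {
		struct sk_mbuf *m = *pp;

		if (m->rcvif == ifp) {
			*pp = m->next;
			free(m);
			freed++;
		} else
			pp = &m->next;
	}
	return (freed);
}

/* Invalidation test before handing a held packet to the next subsystem. */
static bool
sk_handoff_ok(const struct sk_mbuf *m)
{
	return ((m->rcvif->if_flags & SK_IFF_DYING) == 0);
}

/* Replay the scenario sequentially; results are returned via pointers. */
static void
sk_demo(int *drained, bool *handoff_ok)
{
	struct sk_ifnet doomed = { 0 }, other = { 0 };
	struct sk_queue q = { NULL };
	struct sk_mbuf *held, *m;

	sk_enqueue(&q, &doomed);
	sk_enqueue(&q, &other);
	sk_enqueue(&q, &doomed);

	/* A worker takes a packet off the queue and is then "rescheduled". */
	held = sk_dequeue(&q);

	/* Interface removal: mark it dying, then sweep the queue. */
	doomed.if_flags |= SK_IFF_DYING;
	*drained = sk_drain_ifnet(&q, &doomed);

	/* The worker resumes; the sweep never saw the held packet. */
	*handoff_ok = sk_handoff_ok(held);
	free(held);

	while ((m = sk_dequeue(&q)) != NULL)
		free(m);
}
```

In this sketch the sweep frees only the one packet still queued for the doomed interface, and the held packet is rejected by the flag test. That test is safe here only because the `sk_ifnet` object outlives the check. If the interface memory had already been freed, reading the flag through `rcvif` would itself be a use of a dangling pointer, which is the case the question above is about.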