From owner-freebsd-net@FreeBSD.ORG Sun Jul 24 08:44:04 2011 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2945F106566C; Sun, 24 Jul 2011 08:44:04 +0000 (UTC) (envelope-from rwatson@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 037FC8FC12; Sun, 24 Jul 2011 08:44:04 +0000 (UTC) Received: from [192.168.2.112] (host86-148-225-194.range86-148.btcentralplus.com [86.148.225.194]) by cyrus.watson.org (Postfix) with ESMTPSA id 62A5C46B03; Sun, 24 Jul 2011 04:44:02 -0400 (EDT) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: "Robert N. M. Watson" In-Reply-To: Date: Sun, 24 Jul 2011 09:43:59 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <20110714154457.GI70776@FreeBSD.org> To: Ryan Stone X-Mailer: Apple Mail (2.1084) Cc: gnn@freebsd.org, bz@freebsd.org, net@freebsd.org Subject: Re: m_pkthdr.rcvif dangling pointer problem X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Jul 2011 08:44:04 -0000 On 24 Jul 2011, at 04:51, Ryan Stone wrote: > I ran headlong into this issue today when trying to test some network > stack changes. It's depressingly easy to crash HEAD by periodically > destroying vlan interfaces while you are sending traffic over those > interfaces -- I get a panic within minutes. >=20 >> http://people.freebsd.org/~glebius/patches/ifnet.no_free >=20 > This patch makes my test system survive longer but does not resolve = the issue. Unfortunately, I'm a bit preoccupied currently, so haven't had a chance = to follow up as yet, but just to follow up on the general issue here: = this problem existed pre-SMP as well, and could be easily triggered by = DUMMYNET and removable interfaces as well (as additional queuing delays = just make the problem worse). In general: we need a solution that = penalises interface removal, not common-case packet processing. As many = packets have their source ifnet looked up in common-case processing = (worth checking this assumption) because it's cheap, any solution that = causes an interface lookup on every input packet (with synchronisation) = is also an issue. Instead, I think we should go for a more radical notion, which is a bit = harder to implement in our stack: the network stack needs a race-free = way to "drain" all mbufs referring to a particular ifnet, which does not = cause existing processing to become more expensive. This is easy in some = subsystems, but more complex in others -- and the composition of = subsystems makes it all much harder since we need to know that (to be = 100% correct) packets aren't getting passed between subsystems (and = hence belong to neither) in a way that races with a sweep through the = subsystems. It may be possible to get this 99.9% right simply by = providing a series of callbacks into subsystems that cause queues to be = walked and drained of packets matching the doomed ifnet. It may also be = quite cheap to have subsystems that "hold" packets outside of explicit = queues for some period (i.e., in a thread-local pointer out of the = stack) add explicit invalidation tests (i.e., for IFF_DYING) before = handing off to prevent those packets from traversing into other = subsystems -- which can be done synchronisation-free, but still wouldn't = 100% prevent the race Just to give an example: netisr should offer a method for = netisr_drain_ifnet(struct ifnet *) that causes netisr to walk all of its = queues to find matching packets and free them. Due to direct dispatch = and thread-local queues during processing, netisr should also check = IFF_DYING before handing off. If we do that, I wonder how robust the system then becomes...? This may = not be too hard to test. But I'd rather we penalise ifnet removal than, = say, the IP input path when it needs to check a source interface = property. Robert=