Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 22 Oct 2012 23:17:00 +0200
From:      Marko Zec <zec@fer.hr>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        freebsd-net@freebsd.org, freebsd-hackers@freebsd.org
Subject:   Re: VIMAGE crashes on 9.x with hotplug net80211 devices
Message-ID:  <201210222317.00457.zec@fer.hr>
In-Reply-To: <CAJ-Vmo=qsNAJxsUBBs9fpUZ-qHjVC79G0wy8OhRJjMpfFgzHqg@mail.gmail.com>
References:  <CAJ-VmomchJZ7GUKrAjmmyBXDw-6H6O5fAxT_tfAFfhU=HknG1g@mail.gmail.com> <5085827D.5090108@freebsd.org> <CAJ-Vmo=qsNAJxsUBBs9fpUZ-qHjVC79G0wy8OhRJjMpfFgzHqg@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Monday 22 October 2012 19:41:19 Adrian Chadd wrote:
> On 22 October 2012 10:29, Julian Elischer <julian@freebsd.org> wrote:
> >> The trouble is going to be handling unplug and kldunload events too.
> >> Does curvnet -> vnet0 during kldunload events?
> >
> > I think in unload events we probably need to cycle through all vnets
> > and do individual shutdowns of  anything that is set up on that vnet..
> > (but I'm not reading the code to say that, it's possible to ignore me
> > safely)
>
> Well, in an unload event you know the device you're unloading.
> However, there may be clones and such involved. It's not like a
> kldunload will kill a specific VAP on an ath(4) interface, it'll kill
> the whole interface with all vaps.
>
> So in net80211 I need to teach the VAP setup/destroy path to use
> CURVNET_*() correctly. That's a given.
>
> I still however need to ensure that
> CURVNET_SET(vnet0)/CURVNET_RESTORE() is used around the device
> attach/detach, as right now the hotplug code doesn't do this.
>
> So Marko:
>
> * Given that you've "fixed" the kldload path and bootup path to set
> CURVNET_SET(vnet0) as a special case, how about we teach the
> device_attach() path to just do this in general?

While it's true that the kldunload path (most probably) does 
CURVNET_SET(vnet0), this is obviously just a kludge which works on pure 
luck, i.e. only when ifnets to be detached live inside vnet0.

> * How does kldunload work right now if any devices are in a vnet?

It (most probably) doesn't.

> If I 
> kldunload if_bridge with vnets everywhere, what happens? if_bridge
> doesn't at all know anything about VIMAGE. How do the cloned
> interfaces get correctly destroyed?

Haven't tried this out recently, really, though bz@ maintained a patch for a 
while which specifically targetted VNET issues with cloner ifnets, but I 
don't know the current status of that work...

> I don't want to have to teach _every network device_ that they need to
> be vnet aware on attach or detach.
>
> * the device probe/attach path should just use vnet0; and

Right.

> * the device detach/destroy path, to things like if_free(), should
> have those functions just use ifp->if_vnet, rather than assuming
> CURVNET_SET() was called.

How many functions like if_free() are we talking about here?  If only a few 
would need to be extended to do a CURVNET_SET(ifp->if_vnet), that doesn't 
sound like too big an issue, though I'm not completely convinced that such 
an approach could guarantee that every driver would survive hotunplugging 
with vnets.  Still, that would be an improvement over what we have right 
now.

> I know you wanted to be warned if parts of the stack weren't correctly
> using CURVNET_SET()/CURVNET_RESTORE(), but I think this battle is
> already lost. :/

It is absolutely critical that, at minimum, we always completely unwind the 
VNET stack when exiting the networking code, otherwise we risk to continue 
running with a fully random implicit curvnet context.  As many of the 
networking subsystems or code paths are still not VNET-friendly, entering 
any of those on a VIMAGE kernel should lead to panics, not to obscure and 
silent inter-vnet leakages which may become a nightmare to nail down.

OTOH, avoiding excessive recursions on curvnet remains an effort similar to 
our style(9) - if you don't stick to it to the letter, things will still 
work, but some code paths may become more difficult to debug when things go 
wrong...  Plus, keep in mind that every CURVNET_SET() consumes a few CPU 
cycles here and there, and requires a few extra bytes on the stack...

Marko



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201210222317.00457.zec>