Date:      Wed, 15 Aug 2007 20:44:38 +0200
From:      Marko Zec <zec@icir.org>
To:        Julian Elischer <julian@elischer.org>
Cc:        Perforce Change Reviews <perforce@freebsd.org>
Subject:   Re: PERFORCE change 125169 for review
Message-ID:  <200708152044.38580.zec@icir.org>
In-Reply-To: <46C320B4.7070008@elischer.org>
References:  <200708151129.l7FBTiaU005370@repoman.freebsd.org> <46C320B4.7070008@elischer.org>

On Wednesday 15 August 2007 17:50, Julian Elischer wrote:
> Marko Zec wrote:
> > http://perforce.freebsd.org/chv.cgi?CH=125169
> >
> > Change 125169 by zec@zec_tpx32 on 2007/08/15 11:28:57
> >
> > 	Defer dispatching of netisr handlers for mbufs which have
> > 	crossed a boundary between two vnets.  Direct dispatching
> > 	in such cases could lead to various LORs, or in most
> > 	extreme circumstances cause the kernel stack to overflow.
> >
> > 	This is accomplished by the introduction of a new mbuf
> > 	flag, M_REMOTE_VNET, which must be set by any kernel entity
> > 	moving a mbuf from one vnet context to another.  So far
> > 	only ng_eiface and ng_wormhole can operate across a
> > 	boundary between vnets, so update those two accordingly.
> > 	The flag is then evaluated in netisr_dispatch(), and if
> > 	set, the mbuf is queued for later processing instead of
> > 	direct dispatching of netisr handler.
>
> Is it not possible for unix domain sockets to do so if the file
> descriptor is part of a filesystem that is shared?

As of now, AF_LOCAL sockets in different vnets are hidden from each other 
using some of the existing jail infrastructure, so crossing a vnet 
boundary over AF_LOCAL sockets would be somewhat difficult at the moment.  
However, in private communication several people have already expressed 
a wish to separate AF_LOCAL virtualization from the rest of the 
networking subsystems, and I agree that this option should be provided 
soon, so it is in my todo pipeline.  In any case, AF_LOCAL sockets 
are not affected by this change, if that was part of your question, 
i.e. all AF_LOCAL communication will still be direct dispatched in 
netisr_dispatch()...
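
Concretely, the check added in netisr.c (sketched here, not a verbatim copy 
of the hunk quoted below) only demotes mbufs that actually carry the new 
flag; everything else keeps the old direct path:

#ifdef VIMAGE
	if (netisr_direct && (ni->ni_flags & NETISR_MPSAFE) &&
	    !(m->m_flags & M_REMOTE_VNET)) {
		/* direct dispatch, exactly as before */
	} else {
		/* queue for deferred processing by the netisr thread */
	}
#endif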

> I hope soon (within a year) to have several vimages from a
> networking perspective but with a common filesystem root.
> The processes will communicate between themselves using
> Unix domain sockets.  (That's what they currently do, but I want to
> make them have separate routing tables, etc.)

Yes, that's why we do need separate virtualization of AF_LOCAL and other 
protocol families...

Cheers,

Marko

> > Affected files ...
> >
> > .. //depot/projects/vimage/src/sys/net/netisr.c#6 edit
> > .. //depot/projects/vimage/src/sys/netgraph/ng_eiface.c#7 edit
> > .. //depot/projects/vimage/src/sys/netgraph/ng_wormhole.c#2 edit
> > .. //depot/projects/vimage/src/sys/sys/mbuf.h#6 edit
> >
> > Differences ...
> >
> > ==== //depot/projects/vimage/src/sys/net/netisr.c#6 (text+ko) ====
> >
> > @@ -178,8 +178,19 @@
> >  	 * from an interface but does not guarantee ordering
> >  	 * between multiple places in the system (e.g. IP
> >  	 * dispatched from interfaces vs. IP queued from IPSec).
> > +	 *
> > +	 * If the kernel was compiled with options VIMAGE, also defer
> > +	 * dispatch of netisr handlers for mbufs that have crossed a
> > +	 * boundary between two vnets.  Direct dispatching in such
> > +	 * cases could lead to various LORs, or in most extreme
> > +	 * circumstances cause the kernel stack to overflow.
> >  	 */
> > +#ifndef VIMAGE
> >  	if (netisr_direct && (ni->ni_flags & NETISR_MPSAFE)) {
> > +#else
> > +	if (netisr_direct && (ni->ni_flags & NETISR_MPSAFE) &&
> > +	    !(m->m_flags & M_REMOTE_VNET)) {
> > +#endif
> >  		isrstat.isrs_directed++;
> >  		/*
> >  		 * NB: We used to drain the queue before handling
> >
> > ==== //depot/projects/vimage/src/sys/netgraph/ng_eiface.c#7 (text+ko) ====
> >
> > @@ -253,6 +253,12 @@
> >  			continue;
> >  		}
> >
> > +#ifdef VIMAGE
> > +		/* Mark up the mbuf if crossing vnet boundary */
> > +		if (ifp->if_vnet != node->nd_vnet)
> > +			m->m_flags |= M_REMOTE_VNET;
> > +#endif
> > +
> >  		/*
> >  		 * Send packet; if hook is not connected, mbuf will get
> >  		 * freed.
> > @@ -542,6 +548,12 @@
> >  	/* Update interface stats */
> >  	ifp->if_ipackets++;
> >
> > +#ifdef VIMAGE
> > +	/* Mark up the mbuf if crossing vnet boundary */
> > +	if (ifp->if_vnet != hook->hk_node->nd_vnet)
> > +		m->m_flags |= M_REMOTE_VNET;
> > +#endif
> > +
> >  	(*ifp->if_input)(ifp, m);
> >
> >  	/* Done */
> >
> > ==== //depot/projects/vimage/src/sys/netgraph/ng_wormhole.c#2 (text+ko) ====
> >
> > @@ -378,11 +378,14 @@
> >          priv_p priv = NG_NODE_PRIVATE(NG_HOOK_NODE(hook));
> >  	int error = 0;
> >  	priv_p remote_priv = priv->remote_priv;
> > +	struct mbuf *m;
> >
> >  	if (priv->status != NG_WORMHOLE_ACTIVE) {
> >  		NG_FREE_ITEM(item);
> >  		error = ENOTCONN;
> >  	} else {
> > +		m = NGI_M(item);
> > +		m->m_flags |= M_REMOTE_VNET;
> >  		CURVNET_SET_QUIET(remote_priv->vnet);
> >                  NG_FWD_ITEM_HOOK(error, item, remote_priv->hook);
> >  		CURVNET_RESTORE();
> >
> > ==== //depot/projects/vimage/src/sys/sys/mbuf.h#6 (text+ko) ====
> >
> > @@ -192,6 +192,7 @@
> >  #define	M_LASTFRAG	0x2000	/* packet is last fragment */
> >  #define	M_VLANTAG	0x10000	/* ether_vtag is valid */
> >  #define	M_PROMISC	0x20000	/* packet was not for us */
> > +#define	M_REMOTE_VNET	0x40000	/* mbuf crossed boundary between two vnets */
> >
> >  /*
> >   * External buffer types: identify ext_buf type.
> > @@ -214,7 +215,7 @@
> > 
> > #define	M_COPYFLAGS	(M_PKTHDR|M_EOR|M_RDONLY|M_PROTO1|M_PROTO1|M_PROTO2|\
> >  			    M_PROTO3|M_PROTO4|M_PROTO5|M_SKIP_FIREWALL|\
> >  			    M_BCAST|M_MCAST|M_FRAG|M_FIRSTFRAG|M_LASTFRAG|\
> > -			    M_VLANTAG|M_PROMISC)
> > +			    M_VLANTAG|M_PROMISC|M_REMOTE_VNET)
> >
> >  /*
> >   * Flags to purge when crossing layers.


