From: Marko Zec
To: Julian Elischer
Cc: Perforce Change Reviews
Subject: Re: PERFORCE change 125169 for review
Date: Wed, 15 Aug 2007 20:44:38 +0200
Message-Id: <200708152044.38580.zec@icir.org>
In-Reply-To: <46C320B4.7070008@elischer.org>
References: <200708151129.l7FBTiaU005370@repoman.freebsd.org> <46C320B4.7070008@elischer.org>

On Wednesday 15 August 2007 17:50, Julian Elischer wrote:
> Marko Zec wrote:
> > http://perforce.freebsd.org/chv.cgi?CH=125169
> >
> > Change 125169 by
zec@zec_tpx32 on 2007/08/15 11:28:57
> >
> >     Defer dispatching of netisr handlers for mbufs which have
> >     crossed a boundary between two vnets.  Direct dispatching
> >     in such cases could lead to various LORs, or in most
> >     extreme circumstances cause the kernel stack to overflow.
> >
> >     This is accomplished by the introduction of a new mbuf
> >     flag, M_REMOTE_VNET, which must be set by any kernel entity
> >     moving an mbuf from one vnet context to another.  So far
> >     only ng_eiface and ng_wormhole can operate across a
> >     boundary between vnets, so update those two accordingly.
> >     The flag is then evaluated in netisr_dispatch(), and if
> >     set, the mbuf is queued for later processing instead of
> >     direct dispatching of netisr handler.
>
> Is it not possible for unix domain sockets to do so if the file
> descriptor is a part of the filesystem that is shared?

As of now, AF_LOCAL sockets in different vnets are hidden from each
other by the existing jail infrastructure, so crossing a vnet boundary
using AF_LOCAL sockets would be somewhat difficult.  However, in
private communication several people have already expressed a wish to
separate the AF_LOCAL virtualization from the rest of the networking
subsystems, and I agree that this option should be provided soon, so
this is in my todo pipeline.

In any case, AF_LOCAL sockets are not affected by this change, if that
was a part of your question, i.e. all AF_LOCAL communication will still
be direct dispatched in netisr_dispatch()...

> I hope soon (within a year) to have several vimages from a
> networking perspective but with a common filesystem root.
> the processes will communicate between themselves using
> Unix domain sockets. (That's what they currently do but I want to
> make them have separate routing tables etc.)

Yes, that's why we need separate virtualization of AF_LOCAL and other
protocol families...

Cheers,

Marko

> > Affected files ...
> >
> > ..
//depot/projects/vimage/src/sys/net/netisr.c#6 edit
> > .. //depot/projects/vimage/src/sys/netgraph/ng_eiface.c#7 edit
> > .. //depot/projects/vimage/src/sys/netgraph/ng_wormhole.c#2 edit
> > .. //depot/projects/vimage/src/sys/sys/mbuf.h#6 edit
> >
> > Differences ...
> >
> > ==== //depot/projects/vimage/src/sys/net/netisr.c#6 (text+ko) ====
> >
> > @@ -178,8 +178,19 @@
> >           * from an interface but does not guarantee ordering
> >           * between multiple places in the system (e.g. IP
> >           * dispatched from interfaces vs. IP queued from IPSec).
> > +         *
> > +         * If the kernel was compiled with options VIMAGE, also defer
> > +         * dispatch of netisr handlers for mbufs that have crossed a
> > +         * boundary between two vnets.  Direct dispatching in such
> > +         * cases could lead to various LORs, or in most extreme
> > +         * circumstances cause the kernel stack to overflow.
> >           */
> > +#ifndef VIMAGE
> >          if (netisr_direct && (ni->ni_flags & NETISR_MPSAFE)) {
> > +#else
> > +        if (netisr_direct && (ni->ni_flags & NETISR_MPSAFE) &&
> > +            !(m->m_flags & M_REMOTE_VNET)) {
> > +#endif
> >                  isrstat.isrs_directed++;
> >                  /*
> >                   * NB: We used to drain the queue before handling
> >
> > ==== //depot/projects/vimage/src/sys/netgraph/ng_eiface.c#7 (text+ko) ====
> >
> > @@ -253,6 +253,12 @@
> >                          continue;
> >                  }
> >
> > +#ifdef VIMAGE
> > +                /* Mark up the mbuf if crossing vnet boundary */
> > +                if (ifp->if_vnet != node->nd_vnet)
> > +                        m->m_flags |= M_REMOTE_VNET;
> > +#endif
> > +
> >                  /*
> >                   * Send packet; if hook is not connected, mbuf will get
> >                   * freed.
> > @@ -542,6 +548,12 @@
> >          /* Update interface stats */
> >          ifp->if_ipackets++;
> >
> > +#ifdef VIMAGE
> > +        /* Mark up the mbuf if crossing vnet boundary */
> > +        if (ifp->if_vnet != hook->hk_node->nd_vnet)
> > +                m->m_flags |= M_REMOTE_VNET;
> > +#endif
> > +
> >          (*ifp->if_input)(ifp, m);
> >
> >          /* Done */
> >
> > ==== //depot/projects/vimage/src/sys/netgraph/ng_wormhole.c#2 (text+ko) ====
> >
> > @@ -378,11 +378,14 @@
> >          priv_p priv = NG_NODE_PRIVATE(NG_HOOK_NODE(hook));
> >          int error = 0;
> >          priv_p remote_priv = priv->remote_priv;
> > +        struct mbuf *m;
> >
> >          if (priv->status != NG_WORMHOLE_ACTIVE) {
> >                  NG_FREE_ITEM(item);
> >                  error = ENOTCONN;
> >          } else {
> > +                m = NGI_M(item);
> > +                m->m_flags |= M_REMOTE_VNET;
> >                  CURVNET_SET_QUIET(remote_priv->vnet);
> >                  NG_FWD_ITEM_HOOK(error, item, remote_priv->hook);
> >                  CURVNET_RESTORE();
> >
> > ==== //depot/projects/vimage/src/sys/sys/mbuf.h#6 (text+ko) ====
> >
> > @@ -192,6 +192,7 @@
> >  #define M_LASTFRAG      0x2000  /* packet is last fragment */
> >  #define M_VLANTAG       0x10000 /* ether_vtag is valid */
> >  #define M_PROMISC       0x20000 /* packet was not for us */
> > +#define M_REMOTE_VNET   0x40000 /* mbuf crossed boundary between two vnets */
> >
> >  /*
> >   * External buffer types: identify ext_buf type.
> > @@ -214,7 +215,7 @@
> >
> >  #define M_COPYFLAGS     (M_PKTHDR|M_EOR|M_RDONLY|M_PROTO1|M_PROTO1|M_PROTO2|\
> >                              M_PROTO3|M_PROTO4|M_PROTO5|M_SKIP_FIREWALL|\
> >                              M_BCAST|M_MCAST|M_FRAG|M_FIRSTFRAG|M_LASTFRAG|\
> > -                            M_VLANTAG|M_PROMISC)
> > +                            M_VLANTAG|M_PROMISC|M_REMOTE_VNET)
> >
> >  /*
> >   * Flags to purge when crossing layers.
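
For anyone following the thread without a vimage tree at hand, the
dispatch decision from the netisr.c hunk can be restated as a small
userland sketch.  This is not the kernel code: the sk_-prefixed types
and functions are hypothetical stand-ins, the FIFO is a simplification
of the real netisr queue, and SK_NETISR_MPSAFE is an assumed placeholder
value.  Only the M_REMOTE_VNET value is taken from the mbuf.h hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define M_REMOTE_VNET		0x40000	/* value from the mbuf.h hunk */
#define SK_NETISR_MPSAFE	0x0001	/* assumption: placeholder value */

/* Hypothetical stand-in for struct mbuf: flags plus a queue link. */
struct sk_mbuf {
	int		 m_flags;
	struct sk_mbuf	*m_nextpkt;
};

/* Hypothetical stand-in for a netisr slot with a simple FIFO. */
struct sk_netisr {
	int		 ni_flags;
	struct sk_mbuf	*q_head;
	struct sk_mbuf	*q_tail;
	int		 directed;	/* packets handled inline */
	int		 queued;	/* packets deferred to the queue */
};

/* ng_eiface-style marking: set the flag only if the vnets differ. */
static void
sk_mark_if_crossing(struct sk_mbuf *m, const void *src_vnet,
    const void *dst_vnet)
{
	assert(m != NULL);
	if (src_vnet != dst_vnet)
		m->m_flags |= M_REMOTE_VNET;
}

/* Mirrors the condition used when options VIMAGE is compiled in. */
static bool
sk_can_direct_dispatch(bool netisr_direct, const struct sk_netisr *ni,
    const struct sk_mbuf *m)
{
	return (netisr_direct &&
	    (ni->ni_flags & SK_NETISR_MPSAFE) != 0 &&
	    (m->m_flags & M_REMOTE_VNET) == 0);
}

/* Returns true if handled inline, false if deferred to the queue. */
static bool
sk_dispatch(struct sk_netisr *ni, bool netisr_direct, struct sk_mbuf *m)
{
	assert(ni != NULL && m != NULL);
	if (sk_can_direct_dispatch(netisr_direct, ni, m)) {
		ni->directed++;
		return (true);
	}
	/* Deferred path: append to the tail of the FIFO. */
	m->m_nextpkt = NULL;
	if (ni->q_tail != NULL)
		ni->q_tail->m_nextpkt = m;
	else
		ni->q_head = m;
	ni->q_tail = m;
	ni->queued++;
	return (false);
}
```

In this sketch an mbuf marked by sk_mark_if_crossing() always takes the
deferred path, even with direct dispatch enabled and an MPSAFE handler,
while an unmarked mbuf is still handled inline, which is the behaviour
the change description asks for.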