Date: Thu, 8 Feb 2018 09:31:15 +0100
From: "O. Hartmann"
To: freebsd-jails@freebsd.org, freebsd-current
Subject: VIMAGE: vnet, epair and lots of jails on bridgeX - routing
Message-ID: <20180208093052.7f5d7a98@freyja.zeit4.iv.bundesimmobilien.de>
Organization: Walstatt

Hello,

I have been fighting the following problem without any success, and I need some help and/or advice.

We are running several CURRENT and 11.1-RELENG-p6 boxes. CURRENT is at the most recent version as of today. VIMAGE is compiled into all kernels. IPFW is compiled into all kernels and is the one and only firewall in use. On CURRENT, the host's ipfw is set to "OPEN" (using the rc scripts so far).

By convention, I refer to the machine running the kernel as the "host". Every jail is created/configured with its own cloned "vnet" network stack (vnet=new).

All hosts have at least three physical NICs. The host itself is supposed to be a member of the "friendly" network via a dedicated NIC. The two remaining NICs are split between a "hostile" network, on which I'd like to place exposed jails (for now), and the "friendly" network, on which jails will also be hosted, but via a dedicated NIC. In between those two networks, the host will have a third, intermediate network; call it the "service" network.

The following is true for ALL jails created, including the host itself:

    net.link.bridge.pfil_member=0
    net.link.bridge.pfil_bridge=0
    net.link.bridge.pfil_onlyip=0

First, I clone/create three bridge(4) devices: bridge0 (considered to be the "glue" between the "service" jails), bridge1 (the glue between the jails on the friendly network side) and bridge2 (the glue between the jails on the hostile side). bridge1 has eth1 as a member, which provides physical access to the friendly network; eth2 is a member of bridge2, which provides access to the hostile network.

By convention, when creating an epair(4), the a-portion belongs to the jail itself and is assigned an IPv6 address. The b-portion of the epair(4) is a member of the bridge matching its realm (friendly, service or hostile network).

Additionally, there is a special jail, the router, which has three epair(4) devices. The b-portion of each epair is a member of the appropriate bridge(4), and this router jail has static routes assigned, pointing to the appropriate epairXXXa that is supposed to be the link into the correct bridge/network. IPFW is set to open on this jail (for now). On this special jail, net.inet.ip.forwarding=1 is set.

I hope the topology is clear so far. All epairs and epair endpoints, as well as the bridges, are UP! I double-checked this.

Jails on bridge0 (service net) have IPs in the range 10.10.0.0/24. The b-portion of the routing jail's epair is a member of bridge0, as described above, and the a-portion of that epair has IP 10.10.0.1. The default route on each jail on bridge0 is set to 10.10.0.1 accordingly.

Consider a similar setup for the other jails on the friendly and hostile networks, except that their bridges have a physical NIC through which they may reach a real network.

The setup might not be ideal and/or applicable for the purpose of separating networks virtually, but that shouldn't be the subject here.
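
The plumbing described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual jail.conf: the helper names run and plumb_epair, the DRY_RUN switch, the jail name "router", the epair unit 4010 and the address 10.10.0.10 for hostA are hypothetical. Only bridge0/bridge1/bridge2, eth1/eth2, epair4000 and 10.10.0.1 come from the description. By default the script only prints the commands it would run.

```sh
#!/bin/sh
# Hypothetical sketch of the bridge/epair plumbing described in this mail.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# plumb_epair UNIT BRIDGE JAIL ADDR
# Creates epairUNIT, adds the b-side to BRIDGE, moves the a-side into the
# vnet of JAIL (which must already be running with vnet=new) and assigns
# ADDR to the a-side inside that jail.
plumb_epair() {
    unit="$1"; bridge="$2"; jail="$3"; addr="$4"
    run ifconfig "epair${unit}" create
    run ifconfig "$bridge" addm "epair${unit}b"
    run ifconfig "epair${unit}b" up
    run ifconfig "epair${unit}a" vnet "$jail"
    run jexec "$jail" ifconfig "epair${unit}a" inet "$addr" up
}

# The three bridges; bridge1 and bridge2 also get a physical member.
for b in bridge0 bridge1 bridge2; do
    run ifconfig "$b" create
    run ifconfig "$b" up
done
run ifconfig bridge1 addm eth1
run ifconfig bridge2 addm eth2

# Service-net leg of the routing jail (jail name "router" is an assumption).
plumb_epair 4000 bridge0 router 10.10.0.1/24
run jexec router sysctl net.inet.ip.forwarding=1

# One ordinary service jail; unit 4010 and 10.10.0.10 are made-up values.
plumb_epair 4010 bridge0 hostA 10.10.0.10/24
run jexec hostA route add default 10.10.0.1
```

Setting DRY_RUN=0 would execute the commands on a FreeBSD host; the legs to bridge1 and bridge2 would follow the same pattern.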

More importantly, I assume that I haven't understood some essentials, because the setup doesn't work as expected. Furthermore, on FreeBSD 11.1-RELENG-p6 it sometimes behaves completely unpredictably, but in that special case I think I ran IPFW on the host as "WORKSTATION", and dynamic rules may play an important role there.

But focusing on the CURRENT box: the host's IPFW is set to OPEN. With jexec -l hostA I gain access to hostA on the "service" bridge0, and I want to ping its neighbour, hostB, on the same bridge and in the same net. It doesn't work! From the routing jail, I CAN NOT ping any host on bridge0. The routing jail has these network settings:

[... routing jail ...]

lo0: flags=8049 metric 0 mtu 16384
        options=600003
        inet 127.0.0.1 netmask 0xff000000
        groups: lo

[epair to bridge0 - service net]
epair4000a: flags=8843 metric 0 mtu 1500
        options=8
        ether 02:57:d0:00:07:0a
        inet 10.10.0.1 netmask 0xffffff00 broadcast 10.10.0.255
        media: Ethernet 10Gbase-T (10Gbase-T )
        status: active
        groups: epair

[epair to bridge1, friendly net]
epair4001a: flags=8843 metric 0 mtu 1500
        options=8
        ether 02:57:d0:00:09:0a
        inet 192.168.11.1 netmask 0xffffff00 broadcast 192.168.11.255
        media: Ethernet 10Gbase-T (10Gbase-T )
        status: active
        groups: epair

[epair to bridge2, hostile net]
epair4002a: flags=8843 metric 0 mtu 1500
        options=8
        ether 02:57:d0:00:0b:0a
        inet 10.10.10.1 netmask 0xfffffc00 broadcast 10.10.10.255
        media: Ethernet 10Gbase-T (10Gbase-T )
        status: active
        groups: epair

routing: netstat -Warn

Routing tables

Internet:
Destination        Gateway    Flags    Use     Mtu    Netif        Expire
10.10.0.0/24       link#2     U         11    1500    epair4000a
10.10.0.1          link#2     UHS        4   16384    lo0
10.10.10.0/24      link#4     U        210    1500    epair4002a
10.10.10.1         link#4     UHS       44   16384    lo0
127.0.0.1          link#1     UH         0   16384    lo0
192.168.11.0/24    link#3     U          9    1500    epair4001a
192.168.11.1       link#3     UHS        0   16384    lo0

Consider a jail hostCC on bridge2 in the hostile network, IP 10.10.10.128.
I can ping that jail, although it has conceptually the very same setup as the unreachable jails on bridge0! It is weird. On bridge0, no jail can be pinged; it looks as if the Ethernet is somehow down on that bridge. I would expect to be able to ping every host that is a member of the very same bridge!

On 11.1-RELENG-p6, there are other weird issues. I was able to ping those jails, and even ssh to them, but that vanished after several restarts of the jail system (each bridge and epair is created via jail.conf and destroyed after the jail has been deactivated, and doing so a considerable number of times brings down the FreeBSD 11.1-RELENG-p6 host very successfully: it crashes!).

So, since VIMAGE is now the default in CURRENT's GENERIC, I consider its functionality at least "predictable", but somehow I fail here.

Does someone have deeper insight, or can someone spot the mistake I am making here?

Thanks in advance,

Oliver