From: "Patrick M. Hausen"
Subject: Continuing problems in a bridged VNET setup
Date: Fri, 20 Dec 2019 11:19:24 +0100
To: freebsd-net@freebsd.org
Cc: Kristof Provost

Hi all,

we still experience occasional network outages in production and have
not yet been able to find the root cause.

We run around 50 servers with VNET jails, some of them with a handful,
the busiest ones with 50 or more jails each.

Every now and then the jails are no longer reachable over the network.
The server itself is up and running, all jails are up and running, and
one can ssh to the server, but none of the jails can communicate over
the network.
There seems to be no pattern to the time of occurrence, except that
more jails on one system make it "more likely". Also, having more than
one bridge, e.g. for private networks between jails, seems to increase
the probability.

When a server shows the problem, it tends to get into that state rather
frequently, with a couple of hours in between. Then again, most servers
run for weeks without exhibiting the problem. That's what makes it so
hard to reproduce. Over the last couple of days one system was failing
regularly until we reduced the number of jails from around 80 to around
50. Now it seems stable again.

I have a test system with lots of jails that I exercise with gatling;
it has not shown a single failure so far :-(

Setup:

All jails are iocage jails with VNET interfaces. They are connected to
at least one bridge that starts with the physical external interface as
a member and gets the jails' epair interfaces added as they start up.
All jails are managed by iocage.

    ifconfig_igb0="-rxcsum -rxcsum6 -txcsum -txcsum6 -vlanhwtag -vlanhwtso up"
    cloned_interfaces="bridge0"
    ifconfig_bridge0_name="inet0"
    ifconfig_inet0="addm igb0 up"
    ifconfig_inet0_ipv6="inet6 /64 auto_linklocal"

    $ iocage get interfaces vpro0087
    vnet0:inet0

    $ ifconfig inet0
    inet0: flags=8843 metric 0 mtu 1500
            ether 90:1b:0e:63:ef:51
            inet6 fe80::921b:eff:fe63:ef51%inet0 prefixlen 64 scopeid 0x4
            inet6 prefixlen 64
            nd6 options=21
            groups: bridge
            id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
            maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
            root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
            member: vnet0.4 flags=143
                    ifmaxaddr 0 port 7 priority 128 path cost 2000
            member: vnet0.1 flags=143
                    ifmaxaddr 0 port 6 priority 128 path cost 2000
            member: igb0 flags=143
                    ifmaxaddr 0 port 1 priority 128 path cost 2000000

What we tried:

At first we suspected that the bridge became "wedged" somehow.
This was corroborated by talking to various people at devsummits and
EuroBSDCon, with Kristof Provost specifically suggesting that if_bridge
was still under the giant lock and that the lock might not be released
under some race condition, which would stall the entire bridge
subsystem. That sounds plausible given the random occurrence. But I
think we can rule that one out, because:

- ifconfig up/down does not help
- the host is still communicating fine over the same bridge interface
- tearing down the bridge, kldunload (!) of if_bridge.ko followed by a
  new kldload and reconstructing the members with `ifconfig addm` does
  not help, either
- only a host reboot restores function

Finally, I created a jail on the problem host that is not managed by
iocage. Please ignore the `iocage` in the path; I used it to populate
the root directory. But the jail is not started by iocage at boot time,
and the manual config is this:

    testjail {
            host.hostname = "testjail";             # hostname
            path = "/iocage/jails/testjail/root";   # root directory
            exec.clean;
            exec.system_user = "root";
            exec.jail_user = "root";
            vnet;
            vnet.interface = "epair999b";
            exec.prestart += "ifconfig epair999 create; ifconfig epair999a inet6 2A00:B580:8000:8000::1/64 auto_linklocal";
            exec.poststop += "sleep 2; ifconfig epair999a destroy; sleep 2";

            # Standard stuff
            exec.start += "/bin/sh /etc/rc";
            exec.stop = "/bin/sh /etc/rc.shutdown";
            exec.consolelog = "/var/log/jail_testjail_console.log";
            mount.devfs;          #mount devfs
            allow.raw_sockets;    #allow ping-pong
            devfs_ruleset="4";    #devfs ruleset for this jail
    }

    $ cat /iocage/jails/testjail/root/etc/rc.conf
    hostname="testjail"
    ifconfig_epair999b_ipv6="inet6 2A00:B580:8000:8000::2/64 auto_linklocal"

When I do `service jail onestart testjail` I can then ping6 the jail
from the host and the host from the jail. As you can see, if_bridge is
not involved in this traffic.
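
The host-side check described above could be scripted roughly as
follows, so that the onset of the wedged state gets a timestamp. This
is only a minimal sketch: it assumes the test jail from the config
above is running and that ping6(8) is in PATH. The names `probe_epair`,
`log_probe`, `PING_CMD` and `PROBE_COUNT` are hypothetical and chosen
for illustration. It only records reachability; it does not identify
the cause.

```shell
#!/bin/sh
# Sketch only: probe the jail end of the test epair from the host.
# PING_CMD, PROBE_COUNT, probe_epair and log_probe are hypothetical
# names; ping6(8) with -c is the only system tool relied upon.

PING_CMD="${PING_CMD:-ping6}"
PROBE_COUNT="${PROBE_COUNT:-3}"

# probe_epair ADDR: print "reachable" if ADDR answers, else "unreachable".
probe_epair() {
    if "$PING_CMD" -c "$PROBE_COUNT" "$1" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}

# log_probe ADDR: print one line with a UTC timestamp, ADDR and the result.
log_probe() {
    printf '%s %s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$(probe_epair "$1")"
}
```

Run periodically on the host, e.g. `log_probe 2A00:B580:8000:8000::2`,
this would show when the epair path stops answering, which might then
be compared against other events on the machine.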

When the host is in the wedged state and I start this testjail the same
way, no communication across the epair interface is possible.

To me this seems to indicate that it is not the bridge but all epair
interfaces that stop working at the very same time.

OS is RELENG_11_3; hardware and specifically network adapters vary, we
have igb, ix, ixl, bnxt ...

Does anyone have a suggestion as to which diagnostic measures could
help to pinpoint the culprit? The random occurrence and the fact that
the problem seems to occur only in the production environment make this
a real pain ...

Thanks and kind regards,
Patrick
-- 
punkt.de GmbH
Patrick M. Hausen
.infrastructure

Kaiserallee 13a
76133 Karlsruhe

Tel. +49 721 9109500

https://infrastructure.punkt.de
info@punkt.de

AG Mannheim 108285
Geschäftsführer: Jürgen Egeling, Daniel Lienert, Fabian Stein