From: Kristof Provost
To: Johan Hendriks
Cc: freebsd-net@FreeBSD.org, Patrick M. Hausen
Subject: Re: epair and vnet jail loose connection.
Date: Fri, 11 Mar 2022 18:20:36 -0600
Message-ID: <41ED1534-5E98-4D46-A562-811E80F82C5F@FreeBSD.org>
References: <051d51b6-2a07-fbc6-7b4d-13947e7fcdbb@gmail.com>
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

On 11 Mar 2022, at 17:44, Johan Hendriks wrote:

> On 09/03/2022 20:55, Johan Hendriks wrote:
>> The problem:
>> I have a FreeBSD 14 machine and a FreeBSD 13-STABLE machine, both running the same jails just to test the workings.
>>
>> The jails that are running are a salt master, a haproxy jail, 2 web servers, 2 varnish servers, and 2 PHP jails, one for PHP 8.0 and one for 8.1. All the jails are connected to bridge0 and all the jails use vnet.
>>
>> I believe this worked on an older 14-HEAD machine, but I did not do a lot with it back then. When I started testing again after updating the OS, I noticed that one of the varnish jails lost its network connection after running for a few hours. I thought it was just something on HEAD, so I never really looked at it. But later on, when I started using the jails again and testing a test WordPress site, I noticed that with a simple load test my haproxy jail loses its network connection within one minute. I see nothing in the logs, on the host or in the jail.
>> From the jail I cannot ping the other jails or the IP address of the bridge. I can however ping the jail's own IP address. From the host I also cannot ping the haproxy jail's IP address. If I start a tcpdump on the epaira interface from the haproxy jail I do see the packets arrive, but not in the jail.
>>
>> I used ZFS to send all the jails to a 13-STABLE machine and copied over the jail.conf file as well as the pf.conf file, and I saw the same behavior.
>>
>> Then I tried 13.0-RELEASE-p7, and on that machine I do not see this happening. There I can stress test the machine for 10 minutes without a problem, but on 14-HEAD and 13-STABLE the jail's network connection fails within a minute. Only a restart of the jail brings it back online, and it exhibits the same behavior again as soon as I start a simple load test, which it should handle nicely.
>>
>> One of the jail hosts is running under VMware and the other is running under Ubuntu with KVM. The 13.0-RELEASE-p7 jail host is running under Ubuntu with KVM.
>>
>> Thank you for your time.
>> regards
>> Johan
>>
> I did some bisecting, and the latest commit that works on FreeBSD 13-STABLE is 009a56b2e.
> Commit 2e0bee4c7 ("if_epair: implement fanout") and everything after it shows the symptoms described above.
>
Interestingly I cannot reproduce stalls in simple epair setups.
It would be useful if you could reduce the setup with the problem into a minimal configuration so we can figure out what other factors are involved.

Kristof