From nobody Tue Apr 14 17:45:53 2026 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4fwBXH2stnz6ZBWR; Tue, 14 Apr 2026 17:46:07 +0000 (UTC) (envelope-from zlei@FreeBSD.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R13" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 4fwBXH1Dj1z3gs8; Tue, 14 Apr 2026 17:46:07 +0000 (UTC) (envelope-from zlei@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1776188767; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=mlfS1GeaMOfLVKZNwE8/94Xf6fhsf6s4RKmF5ExZXzI=; b=HRPkL3W932B6cHkOccIEY5VlMRyQyTbU4wAI8o0ix9m+HoG/O7lh3MBGs+2fPaBNpvujBG 9o3SuTqiNlRKhhZyTeyYxRW4Nu9VGuHoXBcvxb8W/ChVPePIHYT8YrBD+VKq9Io0eurBe6 7Ik2doPhgIPFn3kjpinWTYcQHoF+ndcr9sdO9VekHF2LfBGqEy7ZSaVRSesJ9XWyxS+BUk l6lVzBym8Hzl+qEQym0b7RCCNrZweIuGmeJQ9g75wjc56gUmwFViyW4ijRs6ySVQ5wTub/ O8gzf3JvIaSVlwSfnk8X6/eupwzwOBaWZUQMcpo6rkS369/wsMMQqWTZbvLs4A== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1776188767; a=rsa-sha256; cv=none; b=ZOPyXLzavTJgRTTxxZR1/6Jf7W0lVWrnfLqXAcUS0VXk1zoYH4DQfABe0PMqI6ndOOGday a/Bkis88KuCmUu2FTdPWlYx7Lar10WQL9dHbaTw/FJD9W7U3xE5QHME5flrxWUxJseb2LO noo1qEWqvbDnGue3TpvPvDm1LDqAHJcIdPUAPmTTiEUKLomhh0DnWZG6OJ8QD+Z80Jjku5 biuj1+zvPb5IsPZ/qO9Lztv+kOQs1g4E1NRgoWj1PcWXB0lLchkNqyP+4C6PxRoPrIr/zt 4NnZSGY3z8xgN8V8xf1eG6LJWmbn+OlzR2o3gX2cKajpI2CrDq8kuWddiLsaBA== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1776188767; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=mlfS1GeaMOfLVKZNwE8/94Xf6fhsf6s4RKmF5ExZXzI=; b=Cx3mCyFqnsTNPtZ2QdurULOyLjuf2cwgBgXZg2XyJ0lrd5ZV6bM/MgQXjhxUMHlTcSaZyw JpYKyothQHouO11Rqo1lfoB3LcxsL+gs36HZ+HV+10uNymDAezEraca3xuHxlHZ8LZEZb3 TWQR7IqnuRCymxTc5W2h/tGohztrZvV0ZKUwZjc1jVQPYNF1gXqrYpjQROjhP5qlQa66bx kwcYAha+pYq7kwoM0UUJt4mbL8u2slQmXYHhxUHaMA/r4tkP3XBNSvg6mlfPwW+vKxWJn5 DFase99xO2jR8biLWR85hIPh0lWF1s1hIHcGQADPwcRkxJITHepNFkTc6Dd0uA== Received: from smtpclient.apple (ns1.oxydns.net [45.32.91.63]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) (Authenticated sender: zlei/mail) by smtp.freebsd.org (Postfix) with ESMTPSA id 4fwBXD26XTz66K; Tue, 14 Apr 2026 17:46:03 +0000 (UTC) (envelope-from zlei@FreeBSD.org) From: Zhenlei Huang Message-Id: Content-Type: multipart/alternative; boundary="Apple-Mail=_34CB54E6-A347-4C15-86A2-3F0EA7E1F9D0" List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: dev-commits-src-all@freebsd.org Sender: owner-dev-commits-src-all@FreeBSD.org Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3696.120.41.1.10\)) Subject: Re: git: e9fc0c538264 - main - if_clone: Make ifnet_detach_sxlock opaque to consumers Date: Wed, 15 Apr 2026 01:45:53 +0800 In-Reply-To: Cc: "src-committers@freebsd.org" , "dev-commits-src-all@freebsd.org" , "dev-commits-src-main@freebsd.org" To: Kristof Provost References: <69dc7394.203c3.3a41c692@gitrepo.freebsd.org> X-Mailer: Apple Mail (2.3696.120.41.1.10) --Apple-Mail=_34CB54E6-A347-4C15-86A2-3F0EA7E1F9D0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On Apr 14, 2026, at 11:57 PM, Kristof Provost wrote: >=20 > On 13 Apr 2026, at 6:39, Zhenlei Huang wrote: >=20 > The branch main has been updated by zlei: >=20 > URL: = https://cgit.FreeBSD.org/src/commit/?id=3De9fc0c538264355bd3fd9120c6500782= 81c2a290 = > commit e9fc0c538264355bd3fd9120c650078281c2a290 > Author: Zhenlei Huang zlei@FreeBSD.org > AuthorDate: 2026-04-13 04:38:44 +0000 > Commit: Zhenlei Huang zlei@FreeBSD.org > CommitDate: 2026-04-13 04:38:44 +0000 >=20 > if_clone: Make ifnet_detach_sxlock opaque to consumers >=20 > The change e133271fc1b5e introduced ifnet_detach_sxlock, and change > 6d2a10d96fb5 widened its coverage, but there are still consumers, > net80211 and tuntap e.g., want it. Instead of sprinkling it = everywhere, > make it opaque to consumers. >=20 > Out of tree drivers shall also benefit from this change. >=20 > Reviewed by: kp > MFC after: 2 weeks > Differential Revision: https://reviews.freebsd.org/D56298 > I don=E2=80=99t really understand why, but this commit causes bricoler = test runs to fail for me. > Or more precisely, it looks like parallel test runs see a number of = failures in the pf tests. >=20 > Reverting only part of this change, specifically this: >=20 > diff --git a/sys/net/if_clone.c b/sys/net/if_clone.c > index db3db78c1b76..d0fe54b6146b 100644 > --- a/sys/net/if_clone.c > +++ b/sys/net/if_clone.c > @@ -683,11 +683,12 @@ if_clone_detach(struct if_clone *ifc) > V_if_cloners_count--; > IF_CLONERS_UNLOCK(); >=20 > - sx_xlock(&ifnet_detach_sxlock); > /* destroy all interfaces for this cloner */ > - while (!LIST_EMPTY(&ifc->ifc_iflist)) > + while (!LIST_EMPTY(&ifc->ifc_iflist)) { > + sx_xlock(&ifnet_detach_sxlock); > if_clone_destroyif_flags(ifc, = LIST_FIRST(&ifc->ifc_iflist), IFC_F_FORCE); > - sx_xunlock(&ifnet_detach_sxlock); > + sx_xunlock(&ifnet_detach_sxlock); > + } >=20 > IF_CLONE_REMREF(ifc); > } > Seems to be enough to avoid the problem. So it looks like somehow = deleting all interfaces for a given cloner under that lock (rather than = releasing it and re-acquiring it) causes the test failures. >=20 >=20 That is interesting. I do not expect a re-acquiring will make any = differences, given ifc_detach_cloner() and / or if_clone_detach() are = basically be invoked by VNET_SYSUNINIT() and vnet_destroy() has = acquired ifnet_detach_sxlock ( I planed to removed the acquiring of it = in vnet_destroy() but not yet done ). So I'd expect an assert = `sx_assert(&ifnet_detach_sxlock, SA_XLOCKED)` in if_clone_detach() will = mostly succeed. Some drivers, for example wg(4), iterate all VNETs and invoke = ifc_detach_cloner() on module unload. That should be the only case that = `ifnet_detach_sxlock` is not recursively acquired, and not acquired = prior to the commit. > The failing tests I see are: >=20 > sys/netpfil/pf/divert-to:in_div -> failed: Test case body returned a = non-ok exit code, but this is not allowed [36.591s] > sys/netpfil/pf/divert-to:in_div_in -> failed: atf-check failed; see = the output of the test for details [30.867s] > sys/netpfil/pf/divert-to:in_div_in_fwd_out_div_out -> failed: = atf-check failed; see the output of the test for details [39.705s] > sys/netpfil/pf/divert-to:in_dn_in_div_in_out_div_out_dn_out -> = failed: atf-check failed; see the output of the test for details = [45.059s] > sys/netpfil/pf/divert-to:out_div -> failed: Test case body returned = a non-ok exit code, but this is not allowed [23.629s] > sys/netpfil/pf/killstate:gateway -> failed: Killing with a different = gateway removed the state. [32.803s] > sys/netpfil/pf/killstate:id -> failed: Killing a different ID = removed the state. [32.789s] > sys/netpfil/pf/killstate:label -> failed: Killing a non-existing = label removed the state. [47.749s] > sys/netpfil/pf/killstate:multilabel -> failed: Setting new rules = removed the state. [43.570s] > sys/netpfil/pf/killstate:src_dst -> failed: Killing with the wrong = destination IP removed our state. [57.819s] > sys/netpfil/pf/killstate:v6 -> failed: Killing with the wrong IP = removed our state. [54.957s] > sys/netpfil/pf/nat64:pool -> failed: atf-check failed; see the = output of the test for details [51.118s] > sys/netpfil/pf/nat64:sctp_in -> failed: Failed to connect to SCTP = server [55.495s] > sys/netpfil/pf/nat64.py:TestNAT64::test_udp_checksum -> failed: = /usr/tests/sys/netpfil/pf/nat64.py:205: AttributeError [80.001s] > sys/netpfil/pf/pflog:matches -> failed: atf-check failed; see the = output of the test for details [61.807s] > sys/netpfil/pf/pflog:matches_logif -> failed: atf-check failed; see = the output of the test for details [51.261s] > sys/netpfil/pf/pflog:unspecified_v6 -> failed: atf-check failed; see = the output of the test for details [36.281s] > sys/netpfil/pf/rdr:srcport_pass -> failed: atf-check failed; see the = output of the test for details [76.815s] > sys/netpfil/pf/sctp:pfsync -> failed: Initial SCTP connection failed = [134.866s] > sys/netpfil/pf/sctp:related_icmp -> failed: SCTP connection failed = [59.620s] > sys/netpfil/pf/table:zero_one -> failed: atf-check failed; see the = output of the test for details [64.125s] > sys/netpfil/pf/tcp:rst -> failed: atf-check failed; see the output = of the test for details [80.261s] > Looking at it now, those runtimes seems suspiciously high as well. >=20 >=20 I noticed that. My local test shows most of them finish in seconds. Some = require more time to finish but still within about 20 seconds. I tested as the following, ``` # kyua test -k Kyuafile pf/divert-to ``` Is the failing tests always fail ? Does the kyua report show anything = useful ? > Best regards, > Kristof >=20 Best regards, Zhenlei --Apple-Mail=_34CB54E6-A347-4C15-86A2-3F0EA7E1F9D0 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8

On Apr 14, 2026, at 11:57 PM, Kristof Provost <kp@FreeBSD.org> = wrote:

On 13 Apr 2026, at 6:39, Zhenlei Huang wrote:

The branch main has been updated by zlei:

URL: https://cgit.FreeBSD.org/src/commit/?id=3De9fc0c538264355bd3fd9= 120c650078281c2a290

commit = e9fc0c538264355bd3fd9120c650078281c2a290
Author: Zhenlei Huang zlei@FreeBSD.org
AuthorDate: 2026-04-13 04:38:44 +0000
Commit: Zhenlei Huang zlei@FreeBSD.org
CommitDate: 2026-04-13 04:38:44 +0000

if_clone: Make =
ifnet_detach_sxlock opaque to consumers

The change e133271fc1b5e introduced ifnet_detach_sxlock, and change
6d2a10d96fb5 widened its coverage, but there are still consumers,
net80211 and tuntap e.g., want it. Instead of sprinkling it everywhere,
make it opaque to consumers.

Out of tree drivers shall also benefit from this change.

Reviewed by:    kp
MFC after:      2 weeks
Differential Revision:  https://reviews.freebsd.org/D56298

I don=E2=80=99t really = understand why, but this commit causes bricoler test runs to fail for = me.
Or more precisely, it looks like parallel test runs see a number of = failures in the pf tests.

Reverting only = part of this change, specifically this:

diff --git a/sys/net/if_clone.c =
b/sys/net/if_clone.c
index db3db78c1b76..d0fe54b6146b 100644
--- a/sys/net/if_clone.c
+++ b/sys/net/if_clone.c
@@ -683,11 +683,12 @@ if_clone_detach(struct if_clone *ifc)
        V_if_cloners_count--;
        IF_CLONERS_UNLOCK();

-       sx_xlock(&ifnet_detach_sxlock);
        /* destroy all interfaces for this cloner */
-       while (!LIST_EMPTY(&ifc->ifc_iflist))
+       while (!LIST_EMPTY(&ifc->ifc_iflist)) {
+               sx_xlock(&ifnet_detach_sxlock);
                if_clone_destroyif_flags(ifc, =
LIST_FIRST(&ifc->ifc_iflist), IFC_F_FORCE);
-       sx_xunlock(&ifnet_detach_sxlock);
+               sx_xunlock(&ifnet_detach_sxlock);
+       }

        IF_CLONE_REMREF(ifc);
 }

Seems to be enough to avoid the = problem. So it looks like somehow deleting all interfaces for a given = cloner under that lock (rather than releasing it and re-acquiring it) = causes the test failures.



That is interesting. I do not expect a re-acquiring = will make any differences, given ifc_detach_cloner() and / or = if_clone_detach() are basically be invoked by VNET_SYSUNINIT() and =  vnet_destroy() has acquired ifnet_detach_sxlock ( I planed to = removed the acquiring of it in vnet_destroy() but not = yet done ). So I'd expect an assert `sx_assert(&ifnet_detach_sxlock, = SA_XLOCKED)` in if_clone_detach() will mostly = succeed.

Some drivers, for example = wg(4), iterate all VNETs and invoke ifc_detach_cloner() on module = unload. That should be the only case that `ifnet_detach_sxlock` is = not recursively acquired, and not acquired prior = to
the commit.

The failing tests I see are:

sys/netpfil/pf/divert-to:in_div  =
->  failed: Test case body returned a non-ok exit code, but this is =
not allowed  [36.591s]
sys/netpfil/pf/divert-to:in_div_in  ->  failed: atf-check failed; see =
the output of the test for details  [30.867s]
sys/netpfil/pf/divert-to:in_div_in_fwd_out_div_out  ->  failed: =
atf-check failed; see the output of the test for details  [39.705s]
sys/netpfil/pf/divert-to:in_dn_in_div_in_out_div_out_dn_out  ->  =
failed: atf-check failed; see the output of the test for details  =
[45.059s]
sys/netpfil/pf/divert-to:out_div  ->  failed: Test case body returned =
a non-ok exit code, but this is not allowed  [23.629s]
sys/netpfil/pf/killstate:gateway  ->  failed: Killing with a =
different gateway removed the state.  [32.803s]
sys/netpfil/pf/killstate:id  ->  failed: Killing a different ID =
removed the state.  [32.789s]
sys/netpfil/pf/killstate:label  ->  failed: Killing a non-existing =
label removed the state.  [47.749s]
sys/netpfil/pf/killstate:multilabel  ->  failed: Setting new rules =
removed the state.  [43.570s]
sys/netpfil/pf/killstate:src_dst  ->  failed: Killing with the wrong =
destination IP removed our state.  [57.819s]
sys/netpfil/pf/killstate:v6  ->  failed: Killing with the wrong IP =
removed our state.  [54.957s]
sys/netpfil/pf/nat64:pool  ->  failed: atf-check failed; see the =
output of the test for details  [51.118s]
sys/netpfil/pf/nat64:sctp_in  ->  failed: Failed to connect to SCTP =
server  [55.495s]
sys/netpfil/pf/nat64.py:TestNAT64::test_udp_checksum  ->  failed: =
/usr/tests/sys/netpfil/pf/nat64.py:205: AttributeError  [80.001s]
sys/netpfil/pf/pflog:matches  ->  failed: atf-check failed; see the =
output of the test for details  [61.807s]
sys/netpfil/pf/pflog:matches_logif  ->  failed: atf-check failed; see =
the output of the test for details  [51.261s]
sys/netpfil/pf/pflog:unspecified_v6  ->  failed: atf-check failed; =
see the output of the test for details  [36.281s]
sys/netpfil/pf/rdr:srcport_pass  ->  failed: atf-check failed; see =
the output of the test for details  [76.815s]
sys/netpfil/pf/sctp:pfsync  ->  failed: Initial SCTP connection =
failed  [134.866s]
sys/netpfil/pf/sctp:related_icmp  ->  failed: SCTP connection failed  =
[59.620s]
sys/netpfil/pf/table:zero_one  ->  failed: atf-check failed; see the =
output of the test for details  [64.125s]
sys/netpfil/pf/tcp:rst  ->  failed: atf-check failed; see the output =
of the test for details  [80.261s]

Looking at it now, those = runtimes seems suspiciously high as well.



I noticed that. My local test shows most of them finish = in seconds. Some require more time to finish but still within about 20 = seconds.
I tested as the = following,
```
# kyua test -k Kyuafile = pf/divert-to
```

Is the = failing tests always fail ? Does the kyua report show anything useful = ?

Best regards,
Kristof


Best regards,
Zhenlei

= --Apple-Mail=_34CB54E6-A347-4C15-86A2-3F0EA7E1F9D0--