Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 14 Apr 2026 17:57:41 +0200
From:      Kristof Provost <kp@FreeBSD.org>
To:        Zhenlei Huang <zlei@FreeBSD.org>
Cc:        src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
Subject:   Re: git: e9fc0c538264 - main - if_clone: Make ifnet_detach_sxlock opaque to consumers
Message-ID:  <AAA103BC-755C-4522-B57F-4DC455555BAA@FreeBSD.org>
In-Reply-To: <69dc7394.203c3.3a41c692@gitrepo.freebsd.org>

index | next in thread | previous in thread | raw e-mail

[-- Attachment #1 --]
On 13 Apr 2026, at 6:39, Zhenlei Huang wrote:
> The branch main has been updated by zlei:
>
> URL: 
> https://cgit.FreeBSD.org/src/commit/?id=e9fc0c538264355bd3fd9120c650078281c2a290
>
> commit e9fc0c538264355bd3fd9120c650078281c2a290
> Author:     Zhenlei Huang <zlei@FreeBSD.org>
> AuthorDate: 2026-04-13 04:38:44 +0000
> Commit:     Zhenlei Huang <zlei@FreeBSD.org>
> CommitDate: 2026-04-13 04:38:44 +0000
>
>     if_clone: Make ifnet_detach_sxlock opaque to consumers
>
>     The change e133271fc1b5e introduced ifnet_detach_sxlock, and 
> change
>     6d2a10d96fb5 widened its coverage, but there are still consumers,
>     net80211 and tuntap e.g., want it. Instead of sprinkling it 
> everywhere,
>     make it opaque to consumers.
>
>     Out of tree drivers shall also benefit from this change.
>
>     Reviewed by:    kp
>     MFC after:      2 weeks
>     Differential Revision:  https://reviews.freebsd.org/D56298

I don’t really understand why, but this commit causes bricoler test 
runs to fail for me.
Or more precisely, it looks like parallel test runs see a number of 
failures in the pf tests.

Reverting only part of this change, specifically this:

	diff --git a/sys/net/if_clone.c b/sys/net/if_clone.c
	index db3db78c1b76..d0fe54b6146b 100644
	--- a/sys/net/if_clone.c
	+++ b/sys/net/if_clone.c
	@@ -683,11 +683,12 @@ if_clone_detach(struct if_clone *ifc)
	        V_if_cloners_count--;
	        IF_CLONERS_UNLOCK();

	-       sx_xlock(&ifnet_detach_sxlock);
	        /* destroy all interfaces for this cloner */
	-       while (!LIST_EMPTY(&ifc->ifc_iflist))
	+       while (!LIST_EMPTY(&ifc->ifc_iflist)) {
	+               sx_xlock(&ifnet_detach_sxlock);
	                if_clone_destroyif_flags(ifc, 
LIST_FIRST(&ifc->ifc_iflist), IFC_F_FORCE);
	-       sx_xunlock(&ifnet_detach_sxlock);
	+               sx_xunlock(&ifnet_detach_sxlock);
	+       }

	        IF_CLONE_REMREF(ifc);
	 }

Seems to be enough to avoid the problem. So it looks like somehow 
deleting all interfaces for a given cloner under that lock (rather than 
releasing it and re-acquiring it) causes the test failures.

The failing tests I see are:

	sys/netpfil/pf/divert-to:in_div  ->  failed: Test case body returned a 
non-ok exit code, but this is not allowed  [36.591s]
	sys/netpfil/pf/divert-to:in_div_in  ->  failed: atf-check failed; see 
the output of the test for details  [30.867s]
	sys/netpfil/pf/divert-to:in_div_in_fwd_out_div_out  ->  failed: 
atf-check failed; see the output of the test for details  [39.705s]
	sys/netpfil/pf/divert-to:in_dn_in_div_in_out_div_out_dn_out  ->  
failed: atf-check failed; see the output of the test for details  
[45.059s]
	sys/netpfil/pf/divert-to:out_div  ->  failed: Test case body returned a 
non-ok exit code, but this is not allowed  [23.629s]
	sys/netpfil/pf/killstate:gateway  ->  failed: Killing with a different 
gateway removed the state.  [32.803s]
	sys/netpfil/pf/killstate:id  ->  failed: Killing a different ID removed 
the state.  [32.789s]
	sys/netpfil/pf/killstate:label  ->  failed: Killing a non-existing 
label removed the state.  [47.749s]
	sys/netpfil/pf/killstate:multilabel  ->  failed: Setting new rules 
removed the state.  [43.570s]
	sys/netpfil/pf/killstate:src_dst  ->  failed: Killing with the wrong 
destination IP removed our state.  [57.819s]
	sys/netpfil/pf/killstate:v6  ->  failed: Killing with the wrong IP 
removed our state.  [54.957s]
	sys/netpfil/pf/nat64:pool  ->  failed: atf-check failed; see the output 
of the test for details  [51.118s]
	sys/netpfil/pf/nat64:sctp_in  ->  failed: Failed to connect to SCTP 
server  [55.495s]
	sys/netpfil/pf/nat64.py:TestNAT64::test_udp_checksum  ->  failed: 
/usr/tests/sys/netpfil/pf/nat64.py:205: AttributeError  [80.001s]
	sys/netpfil/pf/pflog:matches  ->  failed: atf-check failed; see the 
output of the test for details  [61.807s]
	sys/netpfil/pf/pflog:matches_logif  ->  failed: atf-check failed; see 
the output of the test for details  [51.261s]
	sys/netpfil/pf/pflog:unspecified_v6  ->  failed: atf-check failed; see 
the output of the test for details  [36.281s]
	sys/netpfil/pf/rdr:srcport_pass  ->  failed: atf-check failed; see the 
output of the test for details  [76.815s]
	sys/netpfil/pf/sctp:pfsync  ->  failed: Initial SCTP connection failed  
[134.866s]
	sys/netpfil/pf/sctp:related_icmp  ->  failed: SCTP connection failed  
[59.620s]
	sys/netpfil/pf/table:zero_one  ->  failed: atf-check failed; see the 
output of the test for details  [64.125s]
	sys/netpfil/pf/tcp:rst  ->  failed: atf-check failed; see the output of 
the test for details  [80.261s]

Looking at it now, those runtimes seems suspiciously high as well.

Best regards,
Kristof
[-- Attachment #2 --]
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/xhtml; charset=utf-8">
</head>
<body><div style="font-family: sans-serif;"><div class="markdown" style="white-space: normal;">
<p dir="auto">On 13 Apr 2026, at 6:39, Zhenlei Huang wrote:</p>
<blockquote style="margin: 0 0 5px; padding-left: 5px; border-left: 2px solid #136BCE; color: #136BCE;">
<p dir="auto">The branch main has been updated by zlei:</p>
<p dir="auto">URL: <a href="https://cgit.FreeBSD.org/src/commit/?id=e9fc0c538264355bd3fd9120c650078281c2a290">https://cgit.FreeBSD.org/src/commit/?id=e9fc0c538264355bd3fd9120c650078281c2a290</a></p>;
<p dir="auto">commit e9fc0c538264355bd3fd9120c650078281c2a290<br>
Author:     Zhenlei Huang <a href="mailto:zlei@FreeBSD.org">zlei@FreeBSD.org</a><br>
AuthorDate: 2026-04-13 04:38:44 +0000<br>
Commit:     Zhenlei Huang <a href="mailto:zlei@FreeBSD.org">zlei@FreeBSD.org</a><br>
CommitDate: 2026-04-13 04:38:44 +0000</p>
<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; border: thin solid gray; overflow-x: auto; max-width: 90vw; background-color: #E4E4E4;"><code style="padding: 0 0.25em; background-color: #E4E4E4;">if_clone: Make ifnet_detach_sxlock opaque to consumers

The change e133271fc1b5e introduced ifnet_detach_sxlock, and change
6d2a10d96fb5 widened its coverage, but there are still consumers,
net80211 and tuntap e.g., want it. Instead of sprinkling it everywhere,
make it opaque to consumers.

Out of tree drivers shall also benefit from this change.

Reviewed by:    kp
MFC after:      2 weeks
Differential Revision:  https://reviews.freebsd.org/D56298
</code></pre>
</blockquote>
<p dir="auto">I don’t really understand why, but this commit causes bricoler test runs to fail for me.<br>
Or more precisely, it looks like parallel test runs see a number of failures in the pf tests.</p>
<p dir="auto">Reverting only part of this change, specifically this:</p>
<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; border: thin solid gray; overflow-x: auto; max-width: 90vw; background-color: #E4E4E4;"><code style="padding: 0 0.25em; background-color: #E4E4E4;">diff --git a/sys/net/if_clone.c b/sys/net/if_clone.c
index db3db78c1b76..d0fe54b6146b 100644
--- a/sys/net/if_clone.c
+++ b/sys/net/if_clone.c
@@ -683,11 +683,12 @@ if_clone_detach(struct if_clone *ifc)
        V_if_cloners_count--;
        IF_CLONERS_UNLOCK();

-       sx_xlock(&amp;ifnet_detach_sxlock);
        /* destroy all interfaces for this cloner */
-       while (!LIST_EMPTY(&amp;ifc-&gt;ifc_iflist))
+       while (!LIST_EMPTY(&amp;ifc-&gt;ifc_iflist)) {
+               sx_xlock(&amp;ifnet_detach_sxlock);
                if_clone_destroyif_flags(ifc, LIST_FIRST(&amp;ifc-&gt;ifc_iflist), IFC_F_FORCE);
-       sx_xunlock(&amp;ifnet_detach_sxlock);
+               sx_xunlock(&amp;ifnet_detach_sxlock);
+       }

        IF_CLONE_REMREF(ifc);
 }
</code></pre>
<p dir="auto">Seems to be enough to avoid the problem. So it looks like somehow deleting all interfaces for a given cloner under that lock (rather than releasing it and re-acquiring it) causes the test failures.</p>
<p dir="auto">The failing tests I see are:</p>
<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; border: thin solid gray; overflow-x: auto; max-width: 90vw; background-color: #E4E4E4;"><code style="padding: 0 0.25em; background-color: #E4E4E4;">sys/netpfil/pf/divert-to:in_div  -&gt;  failed: Test case body returned a non-ok exit code, but this is not allowed  [36.591s]
sys/netpfil/pf/divert-to:in_div_in  -&gt;  failed: atf-check failed; see the output of the test for details  [30.867s]
sys/netpfil/pf/divert-to:in_div_in_fwd_out_div_out  -&gt;  failed: atf-check failed; see the output of the test for details  [39.705s]
sys/netpfil/pf/divert-to:in_dn_in_div_in_out_div_out_dn_out  -&gt;  failed: atf-check failed; see the output of the test for details  [45.059s]
sys/netpfil/pf/divert-to:out_div  -&gt;  failed: Test case body returned a non-ok exit code, but this is not allowed  [23.629s]
sys/netpfil/pf/killstate:gateway  -&gt;  failed: Killing with a different gateway removed the state.  [32.803s]
sys/netpfil/pf/killstate:id  -&gt;  failed: Killing a different ID removed the state.  [32.789s]
sys/netpfil/pf/killstate:label  -&gt;  failed: Killing a non-existing label removed the state.  [47.749s]
sys/netpfil/pf/killstate:multilabel  -&gt;  failed: Setting new rules removed the state.  [43.570s]
sys/netpfil/pf/killstate:src_dst  -&gt;  failed: Killing with the wrong destination IP removed our state.  [57.819s]
sys/netpfil/pf/killstate:v6  -&gt;  failed: Killing with the wrong IP removed our state.  [54.957s]
sys/netpfil/pf/nat64:pool  -&gt;  failed: atf-check failed; see the output of the test for details  [51.118s]
sys/netpfil/pf/nat64:sctp_in  -&gt;  failed: Failed to connect to SCTP server  [55.495s]
sys/netpfil/pf/nat64.py:TestNAT64::test_udp_checksum  -&gt;  failed: /usr/tests/sys/netpfil/pf/nat64.py:205: AttributeError  [80.001s]
sys/netpfil/pf/pflog:matches  -&gt;  failed: atf-check failed; see the output of the test for details  [61.807s]
sys/netpfil/pf/pflog:matches_logif  -&gt;  failed: atf-check failed; see the output of the test for details  [51.261s]
sys/netpfil/pf/pflog:unspecified_v6  -&gt;  failed: atf-check failed; see the output of the test for details  [36.281s]
sys/netpfil/pf/rdr:srcport_pass  -&gt;  failed: atf-check failed; see the output of the test for details  [76.815s]
sys/netpfil/pf/sctp:pfsync  -&gt;  failed: Initial SCTP connection failed  [134.866s]
sys/netpfil/pf/sctp:related_icmp  -&gt;  failed: SCTP connection failed  [59.620s]
sys/netpfil/pf/table:zero_one  -&gt;  failed: atf-check failed; see the output of the test for details  [64.125s]
sys/netpfil/pf/tcp:rst  -&gt;  failed: atf-check failed; see the output of the test for details  [80.261s]
</code></pre>
<p dir="auto">Looking at it now, those runtimes seems suspiciously high as well.</p>
<p dir="auto">Best regards,<br>
Kristof</p>

</div>
</div>
</body>

</html>
home | help

Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?AAA103BC-755C-4522-B57F-4DC455555BAA>