FreeBSD Mail Archives

Date:      Wed, 20 Aug 2025 22:48:49 +0200
From:      Kristof Provost <kp@FreeBSD.org>
To:        Mark Johnston <markj@freebsd.org>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: rtentry_free panic
Message-ID:  <9700813A-C1C9-4116-BD3F-390508EACB4C@FreeBSD.org>
In-Reply-To: <aKXxFZF82A9lyF0c@nuc>
References:  <163785B5-236A-4C19-8475-66E9E8912DFA@FreeBSD.org> <aKXxFZF82A9lyF0c@nuc>


--=_MailMate_1C234D9B-0637-4E99-BDAD-E47B67828C8A_=
Content-Type: text/plain; charset=UTF-8; format=flowed; markup=markdown
Content-Transfer-Encoding: 8bit

On 20 Aug 2025, at 18:00, Mark Johnston wrote:
> On Wed, Aug 20, 2025 at 02:30:20PM +0200, Kristof Provost wrote:
>> Hi,
>>
>> Running the pf tests I very occasionally (say 1 out of 10 runs) see
>> panics freeing an rtentry.
>> This mostly manifests during bricoler test runs, and usually with the
>> KMSAN kernel config. I assume that’s because there’s a timing factor
>> involved rather than it being an issue that’s directly detected by
>> KMSAN/KASAN.
>
> I've seen this before, but not in the past few months.  I'm running
> with the default parallelism of 4 most of the time.
>
I have the distinct impression (but no data to prove it) that it comes 
and goes.

>> We’re panicking because the V_rtzone zone has been cleaned up (in
>> vnet_rtzone_destroy()). I explicitly NULL out V_rtzone too, to make
>> this more obvious.
>> Note that we failed to completely free all rtentries (`Freed UMA keg
>> (rtentry) was not empty (2 items).  Lost 1 pages of memory.`).
>> Presumably at least one of those two gets freed later, and that’s the
>> panic we see.
>>
>> rt_free() queues the actual delete as an epoch callback
>> (`NET_EPOCH_CALL(destroy_rtentry_epoch, &rt->rt_epoch_ctx);`), and
>> that’s what we see here: the zone is removed before we’re done freeing
>> all of the rtentries.
>>
>> vnet_rtzone_destroy() is called from rtables_destroy(), but that
>> explicitly calls NET_EPOCH_DRAIN_CALLBACKS() first, so I’d expect all
>> of the pending cleanups to have been done at that point.  The comment
>> block above does suggest that there may still be nexthop entries
>> pending deletion even after we drain the callbacks. I think I can see
>> how that’d happen for nexthops, but I do not see how it can happen for
>> rtentries.
>
> Is it possible that if_detach_internal()->rt_flushifroutes() is
> running after the rtentry zone is being destroyed?  That is, maybe
> we're destroying interfaces too late in the jail teardown process?
>
I don’t think so. I expect all of the if_detach() calls to be done by 
the time we hit rtables_destroy() -> vnet_rtzone_destroy(), because 
that’s SI_SUB_PROTO_DOMAIN/SI_ORDER_FIRST.
We should have hit vnet_if_return() (SI_SUB_VNET_DONE/SI_ORDER_ANY) by 
then.

SI_SUB_VNET_DONE is 0xdc00000, SI_SUB_PROTO_DOMAIN is 0x8800000, and the 
vnet_uninit calls are done in descending order, so the VNET_DONE 
handlers should run first.
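
The ordering argument can be checked with a small userland model. This is only a sketch, not the kernel's vnet teardown code: `struct uninit_entry`, `sort_for_teardown()`, `runs_before()` and `print_teardown_order()` are made-up names for illustration, the two SI_SUB_* values are the ones quoted above, and the SI_ORDER_* values are assumed to match sys/kernel.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Subsystem values as quoted in this message. */
#define SI_SUB_PROTO_DOMAIN	0x8800000u
#define SI_SUB_VNET_DONE	0xdc00000u
/* Order values: assumed to match sys/kernel.h. */
#define SI_ORDER_FIRST		0x0000000u
#define SI_ORDER_ANY		0xfffffffu

/* Hypothetical stand-in for a registered vnet uninit handler. */
struct uninit_entry {
	unsigned int subsystem;
	unsigned int order;
	const char *name;
};

/* Descending comparison: higher subsystem, then higher order, first. */
static int
uninit_cmp(const void *a, const void *b)
{
	const struct uninit_entry *x = a, *y = b;

	if (x->subsystem != y->subsystem)
		return (x->subsystem > y->subsystem ? -1 : 1);
	if (x->order != y->order)
		return (x->order > y->order ? -1 : 1);
	return (0);
}

/* Arrange the table in the order the handlers would run at teardown. */
void
sort_for_teardown(struct uninit_entry *tbl, size_t n)
{
	assert(tbl != NULL || n == 0);
	qsort(tbl, n, sizeof(tbl[0]), uninit_cmp);
}

/* True if handler a would run strictly before handler b at teardown. */
bool
runs_before(const struct uninit_entry *a, const struct uninit_entry *b)
{
	return (uninit_cmp(a, b) < 0);
}

/* Print the table, one handler per line, in its current order. */
void
print_teardown_order(const struct uninit_entry *tbl, size_t n)
{
	for (size_t i = 0; i < n; i++)
		printf("%zu: %s (0x%x/0x%x)\n", i, tbl[i].name,
		    tbl[i].subsystem, tbl[i].order);
}
```

Filling a table with `{ SI_SUB_PROTO_DOMAIN, SI_ORDER_FIRST, "rtables_destroy" }` and `{ SI_SUB_VNET_DONE, SI_ORDER_ANY, "vnet_if_return" }` and calling `sort_for_teardown()` puts the vnet_if_return entry first, which matches the expectation stated above. It says nothing about work that those handlers merely queue for later.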

I’m going to kick off a few test runs where I assert that V_rtzone 
hasn’t been freed yet when we’re in if_detach_internal() to confirm, 
because clearly I’m missing *something*, and it could be this.
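
To make the suspected failure mode concrete, here is a rough userland model of a deferred free racing with zone destruction. It is not kernel code: `struct model_zone`, `struct model_epoch` and every `model_*` function are hypothetical stand-ins (for the rtentry zone, the epoch callback queue, rt_free(), NET_EPOCH_DRAIN_CALLBACKS() and vnet_rtzone_destroy() respectively), and the real assertion would be a kernel-side check rather than a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_DEFERRED 8

/* Hypothetical stand-in for the rtentry UMA zone. */
struct model_zone {
	bool destroyed;		/* set once the zone is torn down */
	int outstanding;	/* items allocated but not yet freed */
	int late_frees;		/* frees that arrived after destruction */
};

typedef void (*model_deferred_fn)(struct model_zone *);

/* Hypothetical stand-in for the queue of pending epoch callbacks. */
struct model_epoch {
	model_deferred_fn pending[MODEL_MAX_DEFERRED];
	size_t npending;
};

/* Models rt_free(): the actual free is queued, not done right away. */
bool
model_defer(struct model_epoch *ep, model_deferred_fn fn)
{
	if (ep->npending == MODEL_MAX_DEFERRED)
		return (false);
	ep->pending[ep->npending++] = fn;
	return (true);
}

/*
 * Models the deferred callback. A free into a destroyed zone is the
 * condition the planned assertion is meant to catch; here it is only
 * counted so the caller can inspect it.
 */
void
model_free_item(struct model_zone *z)
{
	if (z->destroyed) {
		z->late_frees++;
		return;
	}
	z->outstanding--;
}

/* Models draining the callbacks: run everything queued so far. */
void
model_drain(struct model_epoch *ep, struct model_zone *z)
{
	for (size_t i = 0; i < ep->npending; i++)
		ep->pending[i](z);
	ep->npending = 0;
}

/* Models zone destruction; returns the number of items still out. */
int
model_zone_destroy(struct model_zone *z)
{
	z->destroyed = true;
	return (z->outstanding);
}
```

In this model a drain only helps with frees that were queued before it ran. If an item is still outstanding at destroy time, `model_zone_destroy()` reports it as left over, and a free queued afterwards shows up in `late_frees`, which is the analogue of the panic described in this thread.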

—
Kristof
--=_MailMate_1C234D9B-0637-4E99-BDAD-E47B67828C8A_=--

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?9700813A-C1C9-4116-BD3F-390508EACB4C>
