Date:      Mon, 01 Sep 2025 08:51:30 +0200
From:      Kristof Provost <kp@FreeBSD.org>
To:        Mark Johnston <markj@freebsd.org>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: rtentry_free panic
Message-ID:  <4EC1D26D-4153-4683-AF38-C19E3CBE8FE8@FreeBSD.org>
In-Reply-To: <aKXxFZF82A9lyF0c@nuc>
References:  <163785B5-236A-4C19-8475-66E9E8912DFA@FreeBSD.org> <aKXxFZF82A9lyF0c@nuc>

On 20 Aug 2025, at 18:00, Mark Johnston wrote:
> On Wed, Aug 20, 2025 at 02:30:20PM +0200, Kristof Provost wrote:
>> We’re panicking because the V_rtzone zone has been cleaned up (in
>> vnet_rtzone_destroy()). I explicitly NULL out V_rtzone too, to make this
>> more obvious.
>> Note that we failed to completely free all rtentries (`Freed UMA keg
>> (rtentry) was not empty (2 items).  Lost 1 pages of memory.`). Presumably at
>> least one of those two gets freed later, and that’s the panic we see.
>>
>> rt_free() queues the actual delete as an epoch callback
>> (`NET_EPOCH_CALL(destroy_rtentry_epoch, &rt->rt_epoch_ctx);`), and that’s
>> what we see here: the zone is removed before we’re done freeing all of the
>> rtentries.
>>
>> vnet_rtzone_destroy() is called from rtables_destroy(), but that explicitly
>> calls NET_EPOCH_DRAIN_CALLBACKS() first, so I’d expect all of the pending
>> cleanups to have been done at that point.  The comment block above does
>> suggest that there may still be nexthop entries pending deletion even after
>> we drain the callbacks. I think I can see how that’d happen for
>> nexthops, but I do not see how it can happen for rtentries.
>
> Is it possible that if_detach_internal()->rt_flushifroutes() is running
> after the rtentry zone is being destroyed?  That is, maybe we're
> destroying interfaces too late in the jail teardown process?
>
With a little work to pass the calling function and line number through 
the call stack (and a lot of patience to reproduce the panic) I think 
I’ve found where we initially rt_free() the relevant rtentry, but 
it’s left me even more confused.

The call happens from ip6_destroy() -> in6_purgeaddr() -> 
ifa_del_loopback_route() -> ifa_maintain_loopback_route() -> 
rib_action() -> rib_del_route() -> rt_free().
rt_free() defers the actual free via NET_EPOCH_CALL(), which should be
fine because in rtables_destroy() we call NET_EPOCH_DRAIN_CALLBACKS()
before vnet_rtzone_destroy() (which naturally destroys the relevant UMA
zone).
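
To make the expected ordering concrete, here is a minimal userspace sketch of "defer the free, drain, then destroy the zone". It is not the kernel code: every `model_*` name is hypothetical, the pending list only stands in for what NET_EPOCH_CALL() queues, and `model_zone_destroy()` only counts leftovers the way the "keg was not empty" message does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct model_zone {
	int live;	/* items allocated and not yet freed */
	int destroyed;	/* set once the zone has been torn down */
};

struct model_cb {
	struct model_cb *next;
	struct model_zone *zone;
};

/* Deferred callbacks, standing in for what NET_EPOCH_CALL() queues. */
static struct model_cb *model_pending;

/* Like rt_free(): do not free now, queue the free for later. */
static void
model_defer_free(struct model_zone *z)
{
	struct model_cb *cb = malloc(sizeof(*cb));

	assert(cb != NULL);
	cb->zone = z;
	cb->next = model_pending;
	model_pending = cb;
}

/* Like draining the callbacks: run everything queued so far. */
static void
model_drain(void)
{
	while (model_pending != NULL) {
		struct model_cb *cb = model_pending;

		model_pending = cb->next;
		/* Freeing into an already destroyed zone is the panic. */
		assert(!cb->zone->destroyed);
		cb->zone->live--;
		free(cb);
	}
}

/* Like vnet_rtzone_destroy(): returns how many items were left behind. */
static int
model_zone_destroy(struct model_zone *z)
{
	z->destroyed = 1;
	return (z->live);
}
```

With two deferred frees followed by `model_drain()` and then `model_zone_destroy()`, nothing is left behind. Destroying the zone while a deferred free is still queued leaves an item behind, which is the situation the UMA warning describes.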

ip6_destroy()’s VNET_SYSUNINIT is SI_SUB_PROTO_DOMAIN/SI_ORDER_THIRD and
rtables_destroy()’s is SI_SUB_PROTO_DOMAIN/SI_ORDER_FIRST. Given that
it’s *un*init, handlers run in reverse order, so we call ip6_destroy()
first and that should all just work.
The enqueued freeing of the rtentries should all be handled once
NET_EPOCH_DRAIN_CALLBACKS() completes, but that appears to not be the
case.
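
The ordering argument can be sketched in userspace as well. This assumes, as described above, that uninit handlers within one subsystem are sorted by order and then walked in reverse. The `model_*` names and enum values are hypothetical stand-ins for the SI_ORDER_* constants; only their relative order matters here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for SI_ORDER_FIRST/SECOND/THIRD. */
enum model_order {
	MODEL_ORDER_FIRST = 0,
	MODEL_ORDER_SECOND = 1,
	MODEL_ORDER_THIRD = 2
};

struct model_uninit {
	const char *name;	/* handler name, for illustration only */
	enum model_order order;
};

static int
model_cmp(const void *a, const void *b)
{
	const struct model_uninit *x = a;
	const struct model_uninit *y = b;

	return ((int)x->order - (int)y->order);
}

/*
 * Sort handlers ascending by order, then record the names in the
 * sequence an uninit pass would call them: highest order first.
 */
static void
model_run_uninit(struct model_uninit *v, size_t n, const char **out)
{
	assert(v != NULL && out != NULL);
	qsort(v, n, sizeof(*v), model_cmp);
	for (size_t i = 0; i < n; i++)
		out[i] = v[n - 1 - i].name;
}
```

Feeding it one handler at MODEL_ORDER_FIRST (standing in for rtables_destroy()) and one at MODEL_ORDER_THIRD (standing in for ip6_destroy()) yields the THIRD handler first, which matches the expectation that ip6_destroy() runs before rtables_destroy().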

—
Kristof

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4EC1D26D-4153-4683-AF38-C19E3CBE8FE8>