From: Kristof Provost
To: Mark Johnston
Cc: FreeBSD Net
Subject: Re: rtentry_free panic
Date: Mon, 01 Sep 2025 08:51:30 +0200
Message-ID: <4EC1D26D-4153-4683-AF38-C19E3CBE8FE8@FreeBSD.org>
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

On 20 Aug 2025, at 18:00, Mark Johnston wrote:

> On Wed, Aug 20, 2025 at 02:30:20PM +0200, Kristof Provost wrote:
>> We’re panicking because the V_rtzone zone has been cleaned up (in
>> vnet_rtzone_destroy()). I explicitly NULL out V_rtzone too, to make this
>> more obvious.
>> Note that we failed to completely free all rtentries (`Freed UMA keg
>> (rtentry) was not empty (2 items). Lost 1 pages of memory.`). Presumably at
>> least one of those two gets freed later, and that’s the panic we see.
>>
>> rt_free() queues the actual delete as an epoch callback
>> (`NET_EPOCH_CALL(destroy_rtentry_epoch, &rt->rt_epoch_ctx);`), and that’s
>> what we see here: the zone is removed before we’re done freeing all of the
>> rtentries.
>>
>> vnet_rtzone_destroy() is called from rtables_destroy(), but that explicitly
>> calls NET_EPOCH_DRAIN_CALLBACKS() first, so I’d expect all of the pending
>> cleanups to have been done at that point. The comment block above does
>> suggest that there may still be nexthop entries pending deletion even after
>> we drain the callbacks. I think I can see how that’d happen for
>> nexthops, but I do not see how it can happen for rtentries.
>
> Is it possible that if_detach_internal()->rt_flushifroutes() is running
> after the rtentry zone is being destroyed? That is, maybe we're
> destroying interfaces too late in the jail teardown process?

With a little work to pass the calling function and line number through the call stack (and a lot of patience to reproduce the panic) I think I’ve found where we initially rt_free() the relevant rtentry, but it’s left me even more confused.

The call happens from ip6_destroy() -> in6_purgeaddr() -> ifa_del_loopback_route() -> ifa_maintain_loopback_route() -> rib_action() -> rib_del_route() -> rt_free().
rt_free() defers the actual free with NET_EPOCH_CALL(), which should be fine because in rtables_destroy() we NET_EPOCH_DRAIN_CALLBACKS() before we vnet_rtzone_destroy() (which naturally destroys the relevant UMA zone).

ip6_destroy()’s VNET_SYSUNINIT is SI_SUB_PROTO_DOMAIN/SI_ORDER_THIRD and rtables_destroy()’s is SI_SUB_PROTO_DOMAIN/SI_ORDER_FIRST. Given that it’s *un*init, that means we call ip6_destroy() first, so that should all just work.
The enqueued freeing of the rtentries should all be handled once NET_EPOCH_DRAIN_CALLBACKS() completes, but that appears not to be the case.

—
Kristof
