From: Kristof Provost
To: Mark Johnston
Cc: FreeBSD Net
Subject: Re: rtentry_free panic
Date: Wed, 20 Aug 2025 22:48:49 +0200
Message-ID: <9700813A-C1C9-4116-BD3F-390508EACB4C@FreeBSD.org>
References: <163785B5-236A-4C19-8475-66E9E8912DFA@FreeBSD.org>
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

On 20 Aug 2025, at 18:00, Mark Johnston wrote:

> On Wed, Aug 20, 2025 at 02:30:20PM +0200, Kristof Provost wrote:
>> Hi,
>>
>> Running the pf tests I very occasionally (say 1 out of 10 runs) see panics
>> freeing an rtentry.
>> This mostly manifests during bricoler test runs, and usually with the KMSAN
>> kernel config. I assume that’s because there’s a timing factor involved
>> rather than it being an issue that’s directly detected by KMSAN/KASAN.

> I've seen this before, but not in the past few months. I'm running with
> the default parallelism of 4 most of the time.

I have the distinct impression (but no data to prove it) that it comes and goes.

>> We’re panicking because the V_rtzone zone has been cleaned up (in
>> vnet_rtzone_destroy()). I explicitly NULL out V_rtzone too, to make this
>> more obvious.
>> Note that we failed to completely free all rtentries
>> (`Freed UMA keg (rtentry) was not empty (2 items). Lost 1 pages of memory.`).
>> Presumably at least one of those two gets freed later, and that’s the
>> panic we see.
>>
>> rt_free() queues the actual delete as an epoch callback
>> (`NET_EPOCH_CALL(destroy_rtentry_epoch, &rt->rt_epoch_ctx);`), and that’s
>> what we see here: the zone is removed before we’re done freeing all of the
>> rtentries.
>>
>> vnet_rtzone_destroy() is called from rtables_destroy(), but that explicitly
>> calls NET_EPOCH_DRAIN_CALLBACKS() first, so I’d expect all of the pending
>> cleanups to have been done at that point. The comment block above does
>> suggest that there may still be nexthop entries pending deletion even after
>> we drain the callbacks. I think I can see how that’d happen for
>> nexthops, but I do not see how it can happen for rtentries.
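
For readers following the thread, the suspected sequence can be modelled in userland. This is only a sketch: every `fake_*` name is hypothetical and none of it is kernel code. It assumes the failure mode is a free that gets queued after the drain has already run, which is a hypothesis the thread has not confirmed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Hypothetical stand-in for the per-vnet rtentry zone (V_rtzone). */
struct fake_zone {
	bool destroyed;
	int live_items;		/* allocated and not yet returned */
};

/* Hypothetical stand-in for the queue of deferred epoch callbacks. */
struct fake_epoch {
	struct fake_zone *pending[MAX_PENDING];
	size_t npending;
};

static void
fake_rt_alloc(struct fake_zone *z)
{
	z->live_items++;
}

/* Like rt_free() as described above: queue the free, do not perform it. */
static bool
fake_rt_free(struct fake_epoch *e, struct fake_zone *z)
{
	if (e->npending == MAX_PENDING)
		return (false);
	e->pending[e->npending++] = z;
	return (true);
}

/* Run all queued callbacks; return how many found their zone destroyed. */
static int
fake_epoch_drain(struct fake_epoch *e)
{
	int bad = 0;

	for (size_t i = 0; i < e->npending; i++) {
		struct fake_zone *z = e->pending[i];

		if (z->destroyed)
			bad++;		/* the panic case in the real kernel */
		else
			z->live_items--;
	}
	e->npending = 0;
	return (bad);
}

/* Destroy the zone; return the number of items still outstanding. */
static int
fake_zone_destroy(struct fake_zone *z)
{
	z->destroyed = true;
	return (z->live_items);
}
```

In this model, allocating two entries, freeing one, draining, then queueing the second free before `fake_zone_destroy()` leaves one item outstanding at destroy time (the analogue of the "keg was not empty" message), and the next drain reports one callback that ran against a destroyed zone.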

> Is it possible that if_detach_internal()->rt_flushifroutes() is running
> after the rtentry zone is being destroyed? That is, maybe we're
> destroying interfaces too late in the jail teardown process?

I don’t think so. I expect all of the if_detach() calls to be done by the time we hit rtables_destroy() -> vnet_rtzone_destroy(), because that’s SI_SUB_PROTO_DOMAIN/SI_ORDER_FIRST. We should have hit vnet_if_return() (SI_SUB_VNET_DONE/SI_ORDER_ANY) by then.

SI_SUB_VNET_DONE is 0xdc00000, SI_SUB_PROTO_DOMAIN is 0x8800000 and the vnet_uninit calls are done in descending order, so VNET_DONE should be first.
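
The ordering argument can be checked with a small userland sketch. The two subsystem values are the ones quoted above. The `uninit_entry` struct and the helper functions are hypothetical and only model "larger subsystem value runs first"; they are not the kernel's actual bookkeeping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Subsystem IDs as quoted in this mail. */
#define SI_SUB_PROTO_DOMAIN	0x8800000u
#define SI_SUB_VNET_DONE	0xdc00000u

/* Hypothetical stand-in for one registered vnet uninit handler. */
struct uninit_entry {
	unsigned int subsystem;
	const char *name;
};

/* qsort comparator: larger subsystem value sorts first (descending). */
static int
cmp_desc(const void *a, const void *b)
{
	const struct uninit_entry *ea = a;
	const struct uninit_entry *eb = b;

	if (ea->subsystem == eb->subsystem)
		return (0);
	return (ea->subsystem > eb->subsystem ? -1 : 1);
}

/* Arrange the entries in the order teardown is assumed to visit them. */
static void
teardown_order(struct uninit_entry *e, size_t n)
{
	qsort(e, n, sizeof(*e), cmp_desc);
}

/* Return the index of the named entry, or -1 if it is not present. */
static int
position_of(const struct uninit_entry *e, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(e[i].name, name) == 0)
			return ((int)i);
	}
	return (-1);
}
```

With one entry for vnet_if_return() at SI_SUB_VNET_DONE and one for rtables_destroy() at SI_SUB_PROTO_DOMAIN, `teardown_order()` puts vnet_if_return() at index 0, which matches the expectation that interfaces are returned before the rtentry zone is destroyed.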

I’m going to kick off a few test runs where I assert that V_rtzone hasn’t been freed yet when we’re in if_detach_internal() to confirm, because clearly I’m missing *something*, and it could be this.
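
The check described above amounts to something like the following userland model. The `fake_*` names are hypothetical. In the kernel the equivalent would presumably be a KASSERT on V_rtzone placed in if_detach_internal(), but the exact form and placement are an assumption, not something taken from a patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-vnet state that holds V_rtzone. */
struct fake_vnet {
	void *rtzone;
};

/* Models vnet_rtzone_destroy() with the extra "NULL out V_rtzone" step. */
static void
fake_rtzone_destroy(struct fake_vnet *v)
{
	v->rtzone = NULL;
}

/*
 * Models the check to run on entry to the detach path: true while the
 * zone is still alive, false once it has been destroyed.  A kernel
 * version would assert instead of returning a value.
 */
static bool
fake_detach_zone_alive(const struct fake_vnet *v)
{
	return (v->rtzone != NULL);
}
```

If the check ever reports false during a test run, interface detach is running after the zone teardown, which would confirm the scenario Mark suggested.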

--
Kristof