Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 26 Oct 2023 09:49:58 +0800
From:      Zhenlei Huang <zlei@FreeBSD.org>
To:        Kristof Provost <kp@FreeBSD.org>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>, Alexander Chernikov <melifaro@freebsd.org>
Subject:   Re: fib6_lookup() returning deleted struct ifnet
Message-ID:  <3A4AA88F-E352-46DC-81DB-7408CD0A4D77@FreeBSD.org>
In-Reply-To: <A5BC4385-AA39-4608-B1BA-0551AFBB49CD@FreeBSD.org>
References:  <A5BC4385-AA39-4608-B1BA-0551AFBB49CD@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--Apple-Mail=_AA58D40C-EF67-4179-9921-3CD6A8D17544
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=utf-8



> On Oct 25, 2023, at 11:27 PM, Kristof Provost <kp@FreeBSD.org> wrote:
>=20
> Hi,
>=20
> Several pfSense users report IPv6-related panics when an interface is =
deleted.
> The relevant bug reports are https://redmine.pfsense.org/issues/14164 =
<https://redmine.pfsense.org/issues/14164>; and =
https://redmine.pfsense.org/issues/14431 =
<https://redmine.pfsense.org/issues/14431>.
> The latest report is for a build that includes commits up to =
1a18383a52bc373e316d224cef1298debf6f7e25 (=E2=80=9Clibcrypto: link =
engines and the legacy provider to libcrypto=E2=80=9D, September 15th).
>=20
> I believe all reports are for users running PPPoE, via netgraph, but =
that might be coincidental, as that=E2=80=99s the most likely way for =
interfaces to be destroyed (when PPP disconnects and reconnects).
>=20
> There are a few different backtraces, but they appear to have the same =
root cause, so I=E2=80=99ll focus on one of them:
>=20
> db:1:pfs> bt
> Tracing pid 2 tid 100041 td 0xfffffe0085264560
> kdb_enter() at kdb_enter+0x32/frame 0xfffffe00850ad910
> vpanic() at vpanic+0x183/frame 0xfffffe00850ad960
> panic() at panic+0x43/frame 0xfffffe00850ad9c0
> trap_fatal() at trap_fatal+0x409/frame 0xfffffe00850ada20
> trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00850ada80
> calltrap() at calltrap+0x8/frame 0xfffffe00850ada80
> --- trap 0xc, rip =3D 0xffffffff80f5a036, rsp =3D 0xfffffe00850adb50, =
rbp =3D 0xfffffe00850adb80 ---
> in6_selecthlim() at in6_selecthlim+0x96/frame 0xfffffe00850adb80
> tcp_default_output() at tcp_default_output+0x1ded/frame =
0xfffffe00850add70
> tcp_timer_rexmt() at tcp_timer_rexmt+0x514/frame 0xfffffe00850addd0
> tcp_timer_enter() at tcp_timer_enter+0x102/frame 0xfffffe00850ade10
> softclock_call_cc() at softclock_call_cc+0x13c/frame =
0xfffffe00850adec0
> softclock_thread() at softclock_thread+0xe9/frame 0xfffffe00850adef0
> fork_exit() at fork_exit+0x7d/frame 0xfffffe00850adf30
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00850adf30
> --- trap 0, rip =3D 0, rsp =3D 0, rbp =3D 0 ---
> This happens in the TCP output path, where we look up the hop limit =
for a specific destination. I=E2=80=99ve obtained a core dump for such a =
crash, and I believe the panic happens on line =
https://cgit.freebsd.org/src/tree/sys/netinet6/in6_src.c#n861 =
<https://cgit.freebsd.org/src/tree/sys/netinet6/in6_src.c#n861>;
> The call in tcp_default_output() is in6_selecthlim(int, NULL);, so we =
don=E2=80=99t get an ifp from the caller, but instead perform a route =
lookup, and try to obtain the hop limit through ND_IFINFO(nh->nh_ifp). =
This panics because the afdata[AF_INET6] pointer is NULL. The core dump =
shows a deleted structure ifnet:
>=20
>=20

`egrep -r 'if_afdata\[AF_INET6\]\s*[!=3D]=3D\s*NULL' sys/netinet6'` =
shows there're many places do the NULL check. I think we can do it in =
in6_selecthlim() as a workaround.
=20
> (kgdb) p *(struct ifnet *)0xfffff80203712800
> $3 =3D {
>   if_link =3D {
>     cstqe_next =3D 0x0
>   },
>   if_clones =3D {
>     le_next =3D 0x0,
>     le_prev =3D 0x0
>   },
>   if_groups =3D {
>     cstqh_first =3D 0x0,
>     cstqh_last =3D 0xfffff80203712818
>   },
>   if_alloctype =3D 53 '5',
>   if_numa_domain =3D 255 '\377',
>   if_softc =3D 0xfffff80103447a00,
>   if_llsoftc =3D 0x0,
>   if_l2com =3D 0x0,
>   if_dname =3D 0xffffffff81492f70 "ng",
>   if_dunit =3D 0,
>   if_index =3D 14,
>   if_idxgen =3D 2,
>   if_xname =3D "pppoe0\000\000\000\000\000\000\000\000\000",
>   if_description =3D 0xfffff8003a5f83d0 "WAN",
>   if_flags =3D 2132112,
>   if_drv_flags =3D 0,
>   if_capabilities =3D 0,
>   if_capabilities2 =3D 0,
> =E2=80=A6
>   if_afdata =3D {0x0 <repeats 44 times>},
> =E2=80=A6
>   if_output =3D 0xffffffff80e29c60 <ifdead_output>,
>   if_input =3D 0xffffffff80e29c80 <ifdead_input>,
>   if_bridge_input =3D 0x0,
>   if_bridge_output =3D 0x0,
>   if_bridge_linkstate =3D 0x0,
>   if_start =3D 0xffffffff80e29c90 <ifdead_start>,
>   if_ioctl =3D 0xffffffff80e29ca0 <ifdead_ioctl>,
> =E2=80=A6
> My understanding is that the fib table should get updated whenever we =
change the routing table (such as during interface cleanup in =
if_detach_internal()). Some quick experimentation with epair and dtrace =
also shows:
>=20
>  20  20388           sync_algo_end_cb:entry Stage 1
>               kernel`setup_fd_instance+0x41f
>               kernel`rebuild_fd_flm+0x99
>               kernel`rebuild_fd+0x136
>               kernel`rib_notify+0x50
>               kernel`rt_delete_conditional+0xf1
>               kernel`rib_del_route+0x1fc
>               kernel`rib_handle_ifaddr_info+0xd9
>               kernel`nd6_prefix_offlink+0x1ce
>               kernel`nd6_prefix_del+0x94
>               kernel`if_purgeaddrs+0x148
>               kernel`if_detach_internal+0x1e8
>               kernel`if_detach+0x71
>               if_epair.ko`epair_clone_destroy+0x62
>               kernel`if_clone_destroyif_flags+0x6a
>               kernel`if_clone_destroy+0x100
>               kernel`ifioctl+0x8a5
>               kernel`kern_ioctl+0x286
>               kernel`sys_ioctl+0x152
>               kernel`amd64_syscall+0x153
>               kernel`0xffffffff8102315b
> In other words, when we delete the interface if_detach_internal() =
purges the interface addresses, which ends up rebuilding the fib =
(rebuild_fd()) via rib_del_route().
> That ought to ensure that we cannot end up finding this struct ifnet =
through fib6_lookup(), as the purging of the addresses (and thus the =
rebuilding of the fib) is done before we if_domdetach() at the end of =
if_detach_internal(), and the NULL afdata[AF_INET6] demonstrates that =
we=E2=80=99ve gotten there.
>=20
>=20

By intuition, fib6_lookup() should not return **INVALID** next hop (with =
detaching interfaces), unless explicitly requested.
> We=E2=80=99ve also gone through if_free(), as the ifindex_table no =
longer contains the struct ifnet pointer for the relevant interface.
> We appear to have not yet called if_free_deferred() (and indeed, =
ifp->if_refcount is 4, so we wouldn=E2=80=99t have called that yet).
>=20
> I=E2=80=99m confused as to how this can happen, and would appreciate =
hints.
>=20
>=20

I believe Alexander has insight on this.
> Thanks,
> Kristof
>=20




--Apple-Mail=_AA58D40C-EF67-4179-9921-3CD6A8D17544
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html;
	charset=utf-8

<html><head><meta http-equiv=3D"Content-Type" content=3D"text/html; =
charset=3Dutf-8"></head><body style=3D"word-wrap: break-word; =
-webkit-nbsp-mode: space; line-break: after-white-space;" class=3D""><br =
class=3D""><div><br class=3D""><blockquote type=3D"cite" class=3D""><div =
class=3D"">On Oct 25, 2023, at 11:27 PM, Kristof Provost &lt;<a =
href=3D"mailto:kp@FreeBSD.org" class=3D"">kp@FreeBSD.org</a>&gt; =
wrote:</div><br class=3D"Apple-interchange-newline"><div class=3D"">


<meta http-equiv=3D"Content-Type" content=3D"text/xhtml; charset=3Dutf-8" =
class=3D"">

<div class=3D""><div style=3D"font-family: sans-serif;" class=3D""><div =
class=3D"markdown" style=3D"white-space: normal;"><p dir=3D"auto" =
class=3D"">Hi,</p><p dir=3D"auto" class=3D"">Several pfSense users =
report IPv6-related panics when an interface is deleted.<br class=3D"">
The relevant bug reports are <a =
href=3D"https://redmine.pfsense.org/issues/14164" =
class=3D"">https://redmine.pfsense.org/issues/14164</a>; and <a =
href=3D"https://redmine.pfsense.org/issues/14431" =
class=3D"">https://redmine.pfsense.org/issues/14431</a>.<br class=3D"">
The latest report is for a build that includes commits up to =
1a18383a52bc373e316d224cef1298debf6f7e25 (=E2=80=9Clibcrypto: link =
engines and the legacy provider to libcrypto=E2=80=9D, September =
15th).</p><p dir=3D"auto" class=3D"">I believe all reports are for users =
running PPPoE, via netgraph, but that might be coincidental, as that=E2=80=
=99s the most likely way for interfaces to be destroyed (when PPP =
disconnects and reconnects).</p><p dir=3D"auto" class=3D"">There are a =
few different backtraces, but they appear to have the same root cause, =
so I=E2=80=99ll focus on one of them:</p>
<pre style=3D"margin-left: 15px; margin-right: 15px; padding: 5px; =
border: thin solid gray; overflow-x: auto; max-width: 90vw; =
background-color: #E4E4E4;" class=3D""><code style=3D"padding: 0 0.25em; =
background-color: #E4E4E4;" class=3D"">db:1:pfs&gt; bt
Tracing pid 2 tid 100041 td 0xfffffe0085264560
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00850ad910
vpanic() at vpanic+0x183/frame 0xfffffe00850ad960
panic() at panic+0x43/frame 0xfffffe00850ad9c0
trap_fatal() at trap_fatal+0x409/frame 0xfffffe00850ada20
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00850ada80
calltrap() at calltrap+0x8/frame 0xfffffe00850ada80
--- trap 0xc, rip =3D 0xffffffff80f5a036, rsp =3D 0xfffffe00850adb50, =
rbp =3D 0xfffffe00850adb80 ---
in6_selecthlim() at in6_selecthlim+0x96/frame 0xfffffe00850adb80
tcp_default_output() at tcp_default_output+0x1ded/frame =
0xfffffe00850add70
tcp_timer_rexmt() at tcp_timer_rexmt+0x514/frame 0xfffffe00850addd0
tcp_timer_enter() at tcp_timer_enter+0x102/frame 0xfffffe00850ade10
softclock_call_cc() at softclock_call_cc+0x13c/frame 0xfffffe00850adec0
softclock_thread() at softclock_thread+0xe9/frame 0xfffffe00850adef0
fork_exit() at fork_exit+0x7d/frame 0xfffffe00850adf30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00850adf30
--- trap 0, rip =3D 0, rsp =3D 0, rbp =3D 0 ---
</code></pre><p dir=3D"auto" class=3D"">This happens in the TCP output =
path, where we look up the hop limit for a specific destination. I=E2=80=99=
ve obtained a core dump for such a crash, and I believe the panic =
happens on line <a =
href=3D"https://cgit.freebsd.org/src/tree/sys/netinet6/in6_src.c#n861" =
class=3D"">https://cgit.freebsd.org/src/tree/sys/netinet6/in6_src.c#n861</=
a></p><p dir=3D"auto" class=3D"">The call in tcp_default_output() is =
<code style=3D"padding: 0 0.25em; background-color: #E4E4E4;" =
class=3D"">in6_selecthlim(int, NULL);</code>, so we don=E2=80=99t get an =
ifp from the caller, but instead perform a route lookup, and try to =
obtain the hop limit through <code style=3D"padding: 0 0.25em; =
background-color: #E4E4E4;" class=3D"">ND_IFINFO(nh-&gt;nh_ifp)</code>. =
This panics because the afdata[AF_INET6] pointer is NULL. The core dump =
shows a deleted structure ifnet:</p><div class=3D""><br =
class=3D""></div></div></div></div></div></blockquote><div><br =
class=3D""></div>`egrep -r 'if_afdata\[AF_INET6\]\s*[!=3D]=3D\s*NULL' =
sys/netinet6'` shows there're many places do the NULL check. I think we =
can do it in in6_selecthlim() as a =
workaround.</div><div>&nbsp;</div><div><blockquote type=3D"cite" =
class=3D""><div class=3D""><div class=3D""><div style=3D"font-family: =
sans-serif;" class=3D""><div class=3D"markdown" style=3D"white-space: =
normal;">
<pre style=3D"margin-left: 15px; margin-right: 15px; padding: 5px; =
border: thin solid gray; overflow-x: auto; max-width: 90vw; =
background-color: #E4E4E4;" class=3D""><code style=3D"padding: 0 0.25em; =
background-color: #E4E4E4;" class=3D"">(kgdb) p *(struct ifnet =
*)0xfffff80203712800
$3 =3D {
  if_link =3D {
    cstqe_next =3D 0x0
  },
  if_clones =3D {
    le_next =3D 0x0,
    le_prev =3D 0x0
  },
  if_groups =3D {
    cstqh_first =3D 0x0,
    cstqh_last =3D 0xfffff80203712818
  },
  if_alloctype =3D 53 '5',
  if_numa_domain =3D 255 '\377',
  if_softc =3D 0xfffff80103447a00,
  if_llsoftc =3D 0x0,
  if_l2com =3D 0x0,
  if_dname =3D 0xffffffff81492f70 "ng",
  if_dunit =3D 0,
  if_index =3D 14,
  if_idxgen =3D 2,
  if_xname =3D "pppoe0\000\000\000\000\000\000\000\000\000",
  if_description =3D 0xfffff8003a5f83d0 "WAN",
  if_flags =3D 2132112,
  if_drv_flags =3D 0,
  if_capabilities =3D 0,
  if_capabilities2 =3D 0,
=E2=80=A6
  if_afdata =3D {0x0 &lt;repeats 44 times&gt;},
=E2=80=A6
  if_output =3D 0xffffffff80e29c60 &lt;ifdead_output&gt;,
  if_input =3D 0xffffffff80e29c80 &lt;ifdead_input&gt;,
  if_bridge_input =3D 0x0,
  if_bridge_output =3D 0x0,
  if_bridge_linkstate =3D 0x0,
  if_start =3D 0xffffffff80e29c90 &lt;ifdead_start&gt;,
  if_ioctl =3D 0xffffffff80e29ca0 &lt;ifdead_ioctl&gt;,
=E2=80=A6
</code></pre><p dir=3D"auto" class=3D"">My understanding is that the fib =
table should get updated whenever we change the routing table (such as =
during interface cleanup in <code style=3D"padding: 0 0.25em; =
background-color: #E4E4E4;" class=3D"">if_detach_internal()</code>). =
Some quick experimentation with epair and dtrace also shows:</p>
<pre style=3D"margin-left: 15px; margin-right: 15px; padding: 5px; =
border: thin solid gray; overflow-x: auto; max-width: 90vw; =
background-color: #E4E4E4;" class=3D""><code style=3D"padding: 0 0.25em; =
background-color: #E4E4E4;" class=3D""> 20  20388           =
sync_algo_end_cb:entry Stage 1
              kernel`setup_fd_instance+0x41f
              kernel`rebuild_fd_flm+0x99
              kernel`rebuild_fd+0x136
              kernel`rib_notify+0x50
              kernel`rt_delete_conditional+0xf1
              kernel`rib_del_route+0x1fc
              kernel`rib_handle_ifaddr_info+0xd9
              kernel`nd6_prefix_offlink+0x1ce
              kernel`nd6_prefix_del+0x94
              kernel`if_purgeaddrs+0x148
              kernel`if_detach_internal+0x1e8
              kernel`if_detach+0x71
              if_epair.ko`epair_clone_destroy+0x62
              kernel`if_clone_destroyif_flags+0x6a
              kernel`if_clone_destroy+0x100
              kernel`ifioctl+0x8a5
              kernel`kern_ioctl+0x286
              kernel`sys_ioctl+0x152
              kernel`amd64_syscall+0x153
              kernel`0xffffffff8102315b
</code></pre><p dir=3D"auto" class=3D"">In other words, when we delete =
the interface <code style=3D"padding: 0 0.25em; background-color: =
#E4E4E4;" class=3D"">if_detach_internal()</code> purges the interface =
addresses, which ends up rebuilding the fib (<code style=3D"padding: 0 =
0.25em; background-color: #E4E4E4;" class=3D"">rebuild_fd()</code>) via =
<code style=3D"padding: 0 0.25em; background-color: #E4E4E4;" =
class=3D"">rib_del_route()</code>.<br class=3D"">
That ought to ensure that we cannot end up finding this struct ifnet =
through <code style=3D"padding: 0 0.25em; background-color: #E4E4E4;" =
class=3D"">fib6_lookup()</code>, as the purging of the addresses (and =
thus the rebuilding of the fib) is done before we <code style=3D"padding: =
0 0.25em; background-color: #E4E4E4;" class=3D"">if_domdetach()</code> =
at the end of <code style=3D"padding: 0 0.25em; background-color: =
#E4E4E4;" class=3D"">if_detach_internal()</code>, and the NULL =
afdata[AF_INET6] demonstrates that we=E2=80=99ve gotten there.</p><div =
class=3D""><br =
class=3D""></div></div></div></div></div></blockquote><div><br =
class=3D""></div>By intuition, fib6_lookup() should not return =
**INVALID** next hop (with detaching interfaces), unless explicitly =
requested.<br class=3D""><blockquote type=3D"cite" class=3D""><div =
class=3D""><div class=3D""><div style=3D"font-family: sans-serif;" =
class=3D""><div class=3D"markdown" style=3D"white-space: normal;"><p =
dir=3D"auto" class=3D"">We=E2=80=99ve also gone through <code =
style=3D"padding: 0 0.25em; background-color: #E4E4E4;" =
class=3D"">if_free()</code>, as the ifindex_table no longer contains the =
struct ifnet pointer for the relevant interface.<br class=3D"">
We appear to have not yet called <code style=3D"padding: 0 0.25em; =
background-color: #E4E4E4;" class=3D"">if_free_deferred()</code> (and =
indeed, ifp-&gt;if_refcount is 4, so we wouldn=E2=80=99t have called =
that yet).</p><p dir=3D"auto" class=3D"">I=E2=80=99m confused as to how =
this can happen, and would appreciate hints.</p><div class=3D""><br =
class=3D""></div></div></div></div></div></blockquote><div><br =
class=3D""></div>I believe Alexander has insight on this.<br =
class=3D""><blockquote type=3D"cite" class=3D""><div class=3D""><div =
class=3D""><div style=3D"font-family: sans-serif;" class=3D""><div =
class=3D"markdown" style=3D"white-space: normal;"><p dir=3D"auto" =
class=3D"">Thanks,<br class=3D"">
Kristof</p>

</div>
</div>
</div>


</div></blockquote></div><br class=3D""><div class=3D"">
<div><br class=3D""></div>

</div>
<br class=3D""></body></html>=

--Apple-Mail=_AA58D40C-EF67-4179-9921-3CD6A8D17544--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3A4AA88F-E352-46DC-81DB-7408CD0A4D77>