Date: Thu, 26 Oct 2023 09:49:58 +0800 From: Zhenlei Huang <zlei@FreeBSD.org> To: Kristof Provost <kp@FreeBSD.org> Cc: FreeBSD Net <freebsd-net@freebsd.org>, Alexander Chernikov <melifaro@freebsd.org> Subject: Re: fib6_lookup() returning deleted struct ifnet Message-ID: <3A4AA88F-E352-46DC-81DB-7408CD0A4D77@FreeBSD.org> In-Reply-To: <A5BC4385-AA39-4608-B1BA-0551AFBB49CD@FreeBSD.org> References: <A5BC4385-AA39-4608-B1BA-0551AFBB49CD@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
--Apple-Mail=_AA58D40C-EF67-4179-9921-3CD6A8D17544 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On Oct 25, 2023, at 11:27 PM, Kristof Provost <kp@FreeBSD.org> wrote: >=20 > Hi, >=20 > Several pfSense users report IPv6-related panics when an interface is = deleted. > The relevant bug reports are https://redmine.pfsense.org/issues/14164 = <https://redmine.pfsense.org/issues/14164> and = https://redmine.pfsense.org/issues/14431 = <https://redmine.pfsense.org/issues/14431>. > The latest report is for a build that includes commits up to = 1a18383a52bc373e316d224cef1298debf6f7e25 (=E2=80=9Clibcrypto: link = engines and the legacy provider to libcrypto=E2=80=9D, September 15th). >=20 > I believe all reports are for users running PPPoE, via netgraph, but = that might be coincidental, as that=E2=80=99s the most likely way for = interfaces to be destroyed (when PPP disconnects and reconnects). >=20 > There are a few different backtraces, but they appear to have the same = root cause, so I=E2=80=99ll focus on one of them: >=20 > db:1:pfs> bt > Tracing pid 2 tid 100041 td 0xfffffe0085264560 > kdb_enter() at kdb_enter+0x32/frame 0xfffffe00850ad910 > vpanic() at vpanic+0x183/frame 0xfffffe00850ad960 > panic() at panic+0x43/frame 0xfffffe00850ad9c0 > trap_fatal() at trap_fatal+0x409/frame 0xfffffe00850ada20 > trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00850ada80 > calltrap() at calltrap+0x8/frame 0xfffffe00850ada80 > --- trap 0xc, rip =3D 0xffffffff80f5a036, rsp =3D 0xfffffe00850adb50, = rbp =3D 0xfffffe00850adb80 --- > in6_selecthlim() at in6_selecthlim+0x96/frame 0xfffffe00850adb80 > tcp_default_output() at tcp_default_output+0x1ded/frame = 0xfffffe00850add70 > tcp_timer_rexmt() at tcp_timer_rexmt+0x514/frame 0xfffffe00850addd0 > tcp_timer_enter() at tcp_timer_enter+0x102/frame 0xfffffe00850ade10 > softclock_call_cc() at softclock_call_cc+0x13c/frame = 0xfffffe00850adec0 > softclock_thread() at softclock_thread+0xe9/frame 0xfffffe00850adef0 > fork_exit() at fork_exit+0x7d/frame 0xfffffe00850adf30 > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00850adf30 > --- trap 0, rip =3D 0, rsp =3D 0, rbp =3D 0 --- > This happens in the TCP output path, where we look up the hop limit = for a specific destination. I=E2=80=99ve obtained a core dump for such a = crash, and I believe the panic happens on line = https://cgit.freebsd.org/src/tree/sys/netinet6/in6_src.c#n861 = <https://cgit.freebsd.org/src/tree/sys/netinet6/in6_src.c#n861> > The call in tcp_default_output() is in6_selecthlim(int, NULL);, so we = don=E2=80=99t get an ifp from the caller, but instead perform a route = lookup, and try to obtain the hop limit through ND_IFINFO(nh->nh_ifp). = This panics because the afdata[AF_INET6] pointer is NULL. The core dump = shows a deleted structure ifnet: >=20 >=20 `egrep -r 'if_afdata\[AF_INET6\]\s*[!=3D]=3D\s*NULL' sys/netinet6'` = shows there're many places do the NULL check. I think we can do it in = in6_selecthlim() as a workaround. =20 > (kgdb) p *(struct ifnet *)0xfffff80203712800 > $3 =3D { > if_link =3D { > cstqe_next =3D 0x0 > }, > if_clones =3D { > le_next =3D 0x0, > le_prev =3D 0x0 > }, > if_groups =3D { > cstqh_first =3D 0x0, > cstqh_last =3D 0xfffff80203712818 > }, > if_alloctype =3D 53 '5', > if_numa_domain =3D 255 '\377', > if_softc =3D 0xfffff80103447a00, > if_llsoftc =3D 0x0, > if_l2com =3D 0x0, > if_dname =3D 0xffffffff81492f70 "ng", > if_dunit =3D 0, > if_index =3D 14, > if_idxgen =3D 2, > if_xname =3D "pppoe0\000\000\000\000\000\000\000\000\000", > if_description =3D 0xfffff8003a5f83d0 "WAN", > if_flags =3D 2132112, > if_drv_flags =3D 0, > if_capabilities =3D 0, > if_capabilities2 =3D 0, > =E2=80=A6 > if_afdata =3D {0x0 <repeats 44 times>}, > =E2=80=A6 > if_output =3D 0xffffffff80e29c60 <ifdead_output>, > if_input =3D 0xffffffff80e29c80 <ifdead_input>, > if_bridge_input =3D 0x0, > if_bridge_output =3D 0x0, > if_bridge_linkstate =3D 0x0, > if_start =3D 0xffffffff80e29c90 <ifdead_start>, > if_ioctl =3D 0xffffffff80e29ca0 <ifdead_ioctl>, > =E2=80=A6 > My understanding is that the fib table should get updated whenever we = change the routing table (such as during interface cleanup in = if_detach_internal()). Some quick experimentation with epair and dtrace = also shows: >=20 > 20 20388 sync_algo_end_cb:entry Stage 1 > kernel`setup_fd_instance+0x41f > kernel`rebuild_fd_flm+0x99 > kernel`rebuild_fd+0x136 > kernel`rib_notify+0x50 > kernel`rt_delete_conditional+0xf1 > kernel`rib_del_route+0x1fc > kernel`rib_handle_ifaddr_info+0xd9 > kernel`nd6_prefix_offlink+0x1ce > kernel`nd6_prefix_del+0x94 > kernel`if_purgeaddrs+0x148 > kernel`if_detach_internal+0x1e8 > kernel`if_detach+0x71 > if_epair.ko`epair_clone_destroy+0x62 > kernel`if_clone_destroyif_flags+0x6a > kernel`if_clone_destroy+0x100 > kernel`ifioctl+0x8a5 > kernel`kern_ioctl+0x286 > kernel`sys_ioctl+0x152 > kernel`amd64_syscall+0x153 > kernel`0xffffffff8102315b > In other words, when we delete the interface if_detach_internal() = purges the interface addresses, which ends up rebuilding the fib = (rebuild_fd()) via rib_del_route(). > That ought to ensure that we cannot end up finding this struct ifnet = through fib6_lookup(), as the purging of the addresses (and thus the = rebuilding of the fib) is done before we if_domdetach() at the end of = if_detach_internal(), and the NULL afdata[AF_INET6] demonstrates that = we=E2=80=99ve gotten there. >=20 >=20 By intuition, fib6_lookup() should not return **INVALID** next hop (with = detaching interfaces), unless explicitly requested. > We=E2=80=99ve also gone through if_free(), as the ifindex_table no = longer contains the struct ifnet pointer for the relevant interface. > We appear to have not yet called if_free_deferred() (and indeed, = ifp->if_refcount is 4, so we wouldn=E2=80=99t have called that yet). >=20 > I=E2=80=99m confused as to how this can happen, and would appreciate = hints. >=20 >=20 I believe Alexander has insight on this. > Thanks, > Kristof >=20 --Apple-Mail=_AA58D40C-EF67-4179-9921-3CD6A8D17544 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8 <html><head><meta http-equiv=3D"Content-Type" content=3D"text/html; = charset=3Dutf-8"></head><body style=3D"word-wrap: break-word; = -webkit-nbsp-mode: space; line-break: after-white-space;" class=3D""><br = class=3D""><div><br class=3D""><blockquote type=3D"cite" class=3D""><div = class=3D"">On Oct 25, 2023, at 11:27 PM, Kristof Provost <<a = href=3D"mailto:kp@FreeBSD.org" class=3D"">kp@FreeBSD.org</a>> = wrote:</div><br class=3D"Apple-interchange-newline"><div class=3D""> <meta http-equiv=3D"Content-Type" content=3D"text/xhtml; charset=3Dutf-8" = class=3D""> <div class=3D""><div style=3D"font-family: sans-serif;" class=3D""><div = class=3D"markdown" style=3D"white-space: normal;"><p dir=3D"auto" = class=3D"">Hi,</p><p dir=3D"auto" class=3D"">Several pfSense users = report IPv6-related panics when an interface is deleted.<br class=3D""> The relevant bug reports are <a = href=3D"https://redmine.pfsense.org/issues/14164" = class=3D"">https://redmine.pfsense.org/issues/14164</a> and <a = href=3D"https://redmine.pfsense.org/issues/14431" = class=3D"">https://redmine.pfsense.org/issues/14431</a>.<br class=3D""> The latest report is for a build that includes commits up to = 1a18383a52bc373e316d224cef1298debf6f7e25 (=E2=80=9Clibcrypto: link = engines and the legacy provider to libcrypto=E2=80=9D, September = 15th).</p><p dir=3D"auto" class=3D"">I believe all reports are for users = running PPPoE, via netgraph, but that might be coincidental, as that=E2=80= =99s the most likely way for interfaces to be destroyed (when PPP = disconnects and reconnects).</p><p dir=3D"auto" class=3D"">There are a = few different backtraces, but they appear to have the same root cause, = so I=E2=80=99ll focus on one of them:</p> <pre style=3D"margin-left: 15px; margin-right: 15px; padding: 5px; = border: thin solid gray; overflow-x: auto; max-width: 90vw; = background-color: #E4E4E4;" class=3D""><code style=3D"padding: 0 0.25em; = background-color: #E4E4E4;" class=3D"">db:1:pfs> bt Tracing pid 2 tid 100041 td 0xfffffe0085264560 kdb_enter() at kdb_enter+0x32/frame 0xfffffe00850ad910 vpanic() at vpanic+0x183/frame 0xfffffe00850ad960 panic() at panic+0x43/frame 0xfffffe00850ad9c0 trap_fatal() at trap_fatal+0x409/frame 0xfffffe00850ada20 trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00850ada80 calltrap() at calltrap+0x8/frame 0xfffffe00850ada80 --- trap 0xc, rip =3D 0xffffffff80f5a036, rsp =3D 0xfffffe00850adb50, = rbp =3D 0xfffffe00850adb80 --- in6_selecthlim() at in6_selecthlim+0x96/frame 0xfffffe00850adb80 tcp_default_output() at tcp_default_output+0x1ded/frame = 0xfffffe00850add70 tcp_timer_rexmt() at tcp_timer_rexmt+0x514/frame 0xfffffe00850addd0 tcp_timer_enter() at tcp_timer_enter+0x102/frame 0xfffffe00850ade10 softclock_call_cc() at softclock_call_cc+0x13c/frame 0xfffffe00850adec0 softclock_thread() at softclock_thread+0xe9/frame 0xfffffe00850adef0 fork_exit() at fork_exit+0x7d/frame 0xfffffe00850adf30 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00850adf30 --- trap 0, rip =3D 0, rsp =3D 0, rbp =3D 0 --- </code></pre><p dir=3D"auto" class=3D"">This happens in the TCP output = path, where we look up the hop limit for a specific destination. I=E2=80=99= ve obtained a core dump for such a crash, and I believe the panic = happens on line <a = href=3D"https://cgit.freebsd.org/src/tree/sys/netinet6/in6_src.c#n861" = class=3D"">https://cgit.freebsd.org/src/tree/sys/netinet6/in6_src.c#n861</= a></p><p dir=3D"auto" class=3D"">The call in tcp_default_output() is = <code style=3D"padding: 0 0.25em; background-color: #E4E4E4;" = class=3D"">in6_selecthlim(int, NULL);</code>, so we don=E2=80=99t get an = ifp from the caller, but instead perform a route lookup, and try to = obtain the hop limit through <code style=3D"padding: 0 0.25em; = background-color: #E4E4E4;" class=3D"">ND_IFINFO(nh->nh_ifp)</code>. = This panics because the afdata[AF_INET6] pointer is NULL. The core dump = shows a deleted structure ifnet:</p><div class=3D""><br = class=3D""></div></div></div></div></div></blockquote><div><br = class=3D""></div>`egrep -r 'if_afdata\[AF_INET6\]\s*[!=3D]=3D\s*NULL' = sys/netinet6'` shows there're many places do the NULL check. I think we = can do it in in6_selecthlim() as a = workaround.</div><div> </div><div><blockquote type=3D"cite" = class=3D""><div class=3D""><div class=3D""><div style=3D"font-family: = sans-serif;" class=3D""><div class=3D"markdown" style=3D"white-space: = normal;"> <pre style=3D"margin-left: 15px; margin-right: 15px; padding: 5px; = border: thin solid gray; overflow-x: auto; max-width: 90vw; = background-color: #E4E4E4;" class=3D""><code style=3D"padding: 0 0.25em; = background-color: #E4E4E4;" class=3D"">(kgdb) p *(struct ifnet = *)0xfffff80203712800 $3 =3D { if_link =3D { cstqe_next =3D 0x0 }, if_clones =3D { le_next =3D 0x0, le_prev =3D 0x0 }, if_groups =3D { cstqh_first =3D 0x0, cstqh_last =3D 0xfffff80203712818 }, if_alloctype =3D 53 '5', if_numa_domain =3D 255 '\377', if_softc =3D 0xfffff80103447a00, if_llsoftc =3D 0x0, if_l2com =3D 0x0, if_dname =3D 0xffffffff81492f70 "ng", if_dunit =3D 0, if_index =3D 14, if_idxgen =3D 2, if_xname =3D "pppoe0\000\000\000\000\000\000\000\000\000", if_description =3D 0xfffff8003a5f83d0 "WAN", if_flags =3D 2132112, if_drv_flags =3D 0, if_capabilities =3D 0, if_capabilities2 =3D 0, =E2=80=A6 if_afdata =3D {0x0 <repeats 44 times>}, =E2=80=A6 if_output =3D 0xffffffff80e29c60 <ifdead_output>, if_input =3D 0xffffffff80e29c80 <ifdead_input>, if_bridge_input =3D 0x0, if_bridge_output =3D 0x0, if_bridge_linkstate =3D 0x0, if_start =3D 0xffffffff80e29c90 <ifdead_start>, if_ioctl =3D 0xffffffff80e29ca0 <ifdead_ioctl>, =E2=80=A6 </code></pre><p dir=3D"auto" class=3D"">My understanding is that the fib = table should get updated whenever we change the routing table (such as = during interface cleanup in <code style=3D"padding: 0 0.25em; = background-color: #E4E4E4;" class=3D"">if_detach_internal()</code>). = Some quick experimentation with epair and dtrace also shows:</p> <pre style=3D"margin-left: 15px; margin-right: 15px; padding: 5px; = border: thin solid gray; overflow-x: auto; max-width: 90vw; = background-color: #E4E4E4;" class=3D""><code style=3D"padding: 0 0.25em; = background-color: #E4E4E4;" class=3D""> 20 20388 = sync_algo_end_cb:entry Stage 1 kernel`setup_fd_instance+0x41f kernel`rebuild_fd_flm+0x99 kernel`rebuild_fd+0x136 kernel`rib_notify+0x50 kernel`rt_delete_conditional+0xf1 kernel`rib_del_route+0x1fc kernel`rib_handle_ifaddr_info+0xd9 kernel`nd6_prefix_offlink+0x1ce kernel`nd6_prefix_del+0x94 kernel`if_purgeaddrs+0x148 kernel`if_detach_internal+0x1e8 kernel`if_detach+0x71 if_epair.ko`epair_clone_destroy+0x62 kernel`if_clone_destroyif_flags+0x6a kernel`if_clone_destroy+0x100 kernel`ifioctl+0x8a5 kernel`kern_ioctl+0x286 kernel`sys_ioctl+0x152 kernel`amd64_syscall+0x153 kernel`0xffffffff8102315b </code></pre><p dir=3D"auto" class=3D"">In other words, when we delete = the interface <code style=3D"padding: 0 0.25em; background-color: = #E4E4E4;" class=3D"">if_detach_internal()</code> purges the interface = addresses, which ends up rebuilding the fib (<code style=3D"padding: 0 = 0.25em; background-color: #E4E4E4;" class=3D"">rebuild_fd()</code>) via = <code style=3D"padding: 0 0.25em; background-color: #E4E4E4;" = class=3D"">rib_del_route()</code>.<br class=3D""> That ought to ensure that we cannot end up finding this struct ifnet = through <code style=3D"padding: 0 0.25em; background-color: #E4E4E4;" = class=3D"">fib6_lookup()</code>, as the purging of the addresses (and = thus the rebuilding of the fib) is done before we <code style=3D"padding: = 0 0.25em; background-color: #E4E4E4;" class=3D"">if_domdetach()</code> = at the end of <code style=3D"padding: 0 0.25em; background-color: = #E4E4E4;" class=3D"">if_detach_internal()</code>, and the NULL = afdata[AF_INET6] demonstrates that we=E2=80=99ve gotten there.</p><div = class=3D""><br = class=3D""></div></div></div></div></div></blockquote><div><br = class=3D""></div>By intuition, fib6_lookup() should not return = **INVALID** next hop (with detaching interfaces), unless explicitly = requested.<br class=3D""><blockquote type=3D"cite" class=3D""><div = class=3D""><div class=3D""><div style=3D"font-family: sans-serif;" = class=3D""><div class=3D"markdown" style=3D"white-space: normal;"><p = dir=3D"auto" class=3D"">We=E2=80=99ve also gone through <code = style=3D"padding: 0 0.25em; background-color: #E4E4E4;" = class=3D"">if_free()</code>, as the ifindex_table no longer contains the = struct ifnet pointer for the relevant interface.<br class=3D""> We appear to have not yet called <code style=3D"padding: 0 0.25em; = background-color: #E4E4E4;" class=3D"">if_free_deferred()</code> (and = indeed, ifp->if_refcount is 4, so we wouldn=E2=80=99t have called = that yet).</p><p dir=3D"auto" class=3D"">I=E2=80=99m confused as to how = this can happen, and would appreciate hints.</p><div class=3D""><br = class=3D""></div></div></div></div></div></blockquote><div><br = class=3D""></div>I believe Alexander has insight on this.<br = class=3D""><blockquote type=3D"cite" class=3D""><div class=3D""><div = class=3D""><div style=3D"font-family: sans-serif;" class=3D""><div = class=3D"markdown" style=3D"white-space: normal;"><p dir=3D"auto" = class=3D"">Thanks,<br class=3D""> Kristof</p> </div> </div> </div> </div></blockquote></div><br class=3D""><div class=3D""> <div><br class=3D""></div> </div> <br class=3D""></body></html>= --Apple-Mail=_AA58D40C-EF67-4179-9921-3CD6A8D17544--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3A4AA88F-E352-46DC-81DB-7408CD0A4D77>