From: Zhenlei Huang <zlei@FreeBSD.org>
Date: Thu, 26 Oct 2023 09:49:58 +0800
Subject: Re: fib6_lookup() returning deleted struct ifnet
To: Kristof Provost
Cc: FreeBSD Net, Alexander Chernikov
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net
Message-Id: <3A4AA88F-E352-46DC-81DB-7408CD0A4D77@FreeBSD.org>

On Oct 25, 2023, at 11:27 PM, Kristof Provost <kp@FreeBSD.org> wrote:

Hi,

Several pfSense users report IPv6-related panics when an interface is deleted.
The relevant bug reports are https://redmine.pfsense.org/issues/14164 and https://redmine.pfsense.org/issues/14431.
The latest report is for a build that includes commits up to 1a18383a52bc373e316d224cef1298debf6f7e25 (“libcrypto: link engines and the legacy provider to libcrypto”, September 15th).

I believe all reports are for users running PPPoE, via netgraph, but that might be coincidental, as that’s the most likely way for interfaces to be destroyed (when PPP disconnects and reconnects).

There are a few different backtraces, but they appear to have the same root cause, so I’ll focus on one of them:

db:1:pfs> bt
Tracing pid 2 tid 100041 td 0xfffffe0085264560
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00850ad910
vpanic() at vpanic+0x183/frame 0xfffffe00850ad960
panic() at panic+0x43/frame 0xfffffe00850ad9c0
trap_fatal() at trap_fatal+0x409/frame 0xfffffe00850ada20
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00850ada80
calltrap() at calltrap+0x8/frame 0xfffffe00850ada80
--- trap 0xc, rip = 0xffffffff80f5a036, rsp = 0xfffffe00850adb50, rbp = 0xfffffe00850adb80 ---
in6_selecthlim() at in6_selecthlim+0x96/frame 0xfffffe00850adb80
tcp_default_output() at tcp_default_output+0x1ded/frame 0xfffffe00850add70
tcp_timer_rexmt() at tcp_timer_rexmt+0x514/frame 0xfffffe00850addd0
tcp_timer_enter() at tcp_timer_enter+0x102/frame 0xfffffe00850ade10
softclock_call_cc() at softclock_call_cc+0x13c/frame 0xfffffe00850adec0
softclock_thread() at softclock_thread+0xe9/frame 0xfffffe00850adef0
fork_exit() at fork_exit+0x7d/frame 0xfffffe00850adf30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00850adf30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

This happens in the TCP output path, where we look up the hop limit for a specific destination. I’ve obtained a core dump for such a crash, and I believe the panic happens on this line: https://cgit.freebsd.org/src/tree/sys/netinet6/in6_src.c#n861

The call in tcp_default_output() is in6_selecthlim(inp, NULL);, so we don’t get an ifp from the caller, but instead perform a route lookup, and try to obtain the hop limit through ND_IFINFO(nh->nh_ifp). This panics because the afdata[AF_INET6] pointer is NULL. The core dump shows a deleted struct ifnet:



`egrep -r 'if_afdata\[AF_INET6\]\s*[!=]=\s*NULL' sys/netinet6` shows that many places already do this NULL check. I think we can do the same in in6_selecthlim() as a workaround.
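To make the proposed workaround concrete, here is a minimal user-space sketch of the check. It is not the kernel code: the `model_*` types, `MODEL_AF_MAX` and `MODEL_DEFAULT_HLIM` are stand-ins I made up for illustration, and the fallback value of 64 is an assumption standing in for whatever default the kernel would use. The only point is the shape of the guard: test the per-family pointer before dereferencing it, and fall back to a default when the interface has already been through domain detach.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structures are far larger. */
#define MODEL_AF_INET6     28   /* FreeBSD's AF_INET6, used only as an index */
#define MODEL_AF_MAX       44   /* matches the 44 slots seen in the dump */
#define MODEL_DEFAULT_HLIM 64   /* assumed fallback hop limit */

struct model_nd_ifinfo {
	unsigned char chlim;            /* per-interface current hop limit */
};

struct model_in6_ifextra {
	struct model_nd_ifinfo *nd_ifinfo;
};

struct model_ifnet {
	void *if_afdata[MODEL_AF_MAX];  /* per-address-family private data */
};

/*
 * Return the interface hop limit, or the default when the IPv6
 * per-interface data has already been torn down (NULL afdata slot).
 */
static int
model_selecthlim(const struct model_ifnet *ifp)
{
	const struct model_in6_ifextra *ext;

	assert(ifp != NULL);
	ext = ifp->if_afdata[MODEL_AF_INET6];
	if (ext == NULL || ext->nd_ifinfo == NULL)
		return (MODEL_DEFAULT_HLIM);
	return (ext->nd_ifinfo->chlim);
}
```

This only hides the symptom in one caller. It does not explain why the lookup hands back a next hop that points at a detached interface in the first place.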
(kgdb) p *(struct ifnet *)0xfffff80203712800
$3 = {
  if_link = {
    cstqe_next = 0x0
  },
  if_clones = {
    le_next = 0x0,
    le_prev = 0x0
  },
  if_groups = {
    cstqh_first = 0x0,
    cstqh_last = 0xfffff80203712818
  },
  if_alloctype = 53 '5',
  if_numa_domain = 255 '\377',
  if_softc = 0xfffff80103447a00,
  if_llsoftc = 0x0,
  if_l2com = 0x0,
  if_dname = 0xffffffff81492f70 "ng",
  if_dunit = 0,
  if_index = 14,
  if_idxgen = 2,
  if_xname = "pppoe0\000\000\000\000\000\000\000\000\000",
  if_description = 0xfffff8003a5f83d0 "WAN",
  if_flags = 2132112,
  if_drv_flags = 0,
  if_capabilities = 0,
  if_capabilities2 = 0,
…
  if_afdata = {0x0 <repeats 44 times>},
…
  if_output = 0xffffffff80e29c60 <ifdead_output>,
  if_input = 0xffffffff80e29c80 <ifdead_input>,
  if_bridge_input = 0x0,
  if_bridge_output = 0x0,
  if_bridge_linkstate = 0x0,
  if_start = 0xffffffff80e29c90 <ifdead_start>,
  if_ioctl = 0xffffffff80e29ca0 <ifdead_ioctl>,
…

My understanding is that the fib table should get updated whenever we change the routing table (such as during interface cleanup in if_detach_internal()). Some quick experimentation with epair and dtrace also shows:

 20  20388           sync_algo_end_cb:entry Stage 1
              kernel`setup_fd_instance+0x41f
              kernel`rebuild_fd_flm+0x99
              kernel`rebuild_fd+0x136
              kernel`rib_notify+0x50
              kernel`rt_delete_conditional+0xf1
              kernel`rib_del_route+0x1fc
              kernel`rib_handle_ifaddr_info+0xd9
              kernel`nd6_prefix_offlink+0x1ce
              kernel`nd6_prefix_del+0x94
              kernel`if_purgeaddrs+0x148
              kernel`if_detach_internal+0x1e8
              kernel`if_detach+0x71
              if_epair.ko`epair_clone_destroy+0x62
              kernel`if_clone_destroyif_flags+0x6a
              kernel`if_clone_destroy+0x100
              kernel`ifioctl+0x8a5
              kernel`kern_ioctl+0x286
              kernel`sys_ioctl+0x152
              kernel`amd64_syscall+0x153
              kernel`0xffffffff8102315b

In other words, when we delete the interface, if_detach_internal() purges the interface addresses, which ends up rebuilding the fib (rebuild_fd()) via rib_del_route().
That ought to ensure that we cannot end up finding this struct ifnet through fib6_lookup(), as the purging of the addresses (and thus the rebuilding of the fib) is done before we if_domdetach() at the end of if_detach_internal(), and the NULL afdata[AF_INET6] demonstrates that we’ve gotten there.



Intuitively, fib6_lookup() should not return an **INVALID** next hop (one whose interface is being detached), unless explicitly requested.
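As an illustration of what I mean, here is a small user-space sketch of such a filter. Everything named `toy_*` or `TOY_*` is hypothetical, including the opt-in flag; it is not the real fib6_lookup() interface. The one grounded detail is the bit value: if_flags = 2132112 in the dump above is 0x208890, which includes 0x200000, the value of IFF_DYING in FreeBSD's net/if.h, so the interface was already marked as going away.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_IFF_DYING       0x200000u /* same bit value as FreeBSD's IFF_DYING */
#define TOY_NHR_ALLOW_DYING 0x1u      /* hypothetical opt-in lookup flag */

struct toy_ifnet {
	unsigned int if_flags;
};

struct toy_nhop {
	struct toy_ifnet *nh_ifp;     /* interface this next hop transmits on */
};

/*
 * Post-lookup filter: hide next hops whose interface is being torn
 * down, unless the caller explicitly asked to see them.
 */
static struct toy_nhop *
toy_filter_nhop(struct toy_nhop *nh, unsigned int flags)
{
	if (nh == NULL || nh->nh_ifp == NULL)
		return (NULL);
	if ((nh->nh_ifp->if_flags & TOY_IFF_DYING) != 0 &&
	    (flags & TOY_NHR_ALLOW_DYING) == 0)
		return (NULL);
	return (nh);
}
```

Whether a check like this belongs in the lookup path or in the callers is a design question for people who know the fib code better than I do.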

We’ve also gone through if_free(), as the ifindex_table no longer contains the struct ifnet pointer for the relevant interface.
We appear to have not yet called if_free_deferred() (and indeed, ifp->if_refcount is 4, so we wouldn’t have called that yet).
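The state described here (unpublished from the index table, memory still allocated because references remain) can be modelled in a few lines. This is a simplified sketch under the assumption that the free step unpublishes the interface and drops one reference, with the final release deferred until the count reaches zero. The `rc_*` names are invented for the example and the real kernel path involves more than a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RC_TABLE_SIZE 16

struct rc_ifnet {
	unsigned int refcount;
	bool freed;             /* stands in for the deferred final release */
};

/* Stand-in for the index table that publishes live interfaces. */
static struct rc_ifnet *rc_index_table[RC_TABLE_SIZE];

/* Drop one reference; the final release runs only at zero. */
static void
rc_rele(struct rc_ifnet *ifp)
{
	assert(ifp->refcount > 0);
	if (--ifp->refcount == 0)
		ifp->freed = true;
}

/* Unpublish the interface, then drop the reference the table held. */
static void
rc_free(struct rc_ifnet *ifp, unsigned int idx)
{
	assert(idx < RC_TABLE_SIZE && rc_index_table[idx] == ifp);
	rc_index_table[idx] = NULL;
	rc_rele(ifp);
}
```

With a starting count of 5, calling rc_free() leaves the slot empty and the count at 4 with the object still allocated, which is the same picture as the core dump: any holder of one of the remaining references can still reach the structure.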

I’m confused as to how this can happen, and would appreciate hints.



I believe Alexander has insight on this.

Thanks,
Kristof



