Date:      Wed, 16 Oct 2019 21:16:22 +0000
From:      "Keller, Jacob E" <jacob.e.keller@intel.com>
To:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Cc:        "shurd@llnw.com" <shurd@llnw.com>, "Joyner, Eric" <eric.joyner@intel.com>,  John Baldwin <jhb@freebsd.org>
Subject:   panic on invalid ifp pointer in iflib drivers
Message-ID:  <02874ECE860811409154E81DA85FBB589692E0D4@ORSMSX121.amr.corp.intel.com>

Hi,

I'm investigating an issue on the iflib ixl driver in 11.3-RELEASE as well
as 12-RELEASE. We found a panic that occurs if SCTP/IPv6 traffic is being
transmitted while the device is detached:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xfffffe0000411e38
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c84700
stack pointer           = 0x28:0xfffffe2f4351b600
frame pointer           = 0x28:0xfffffe2f4351b650
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi4: clock (0))
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe2f4351b2c0
vpanic() at vpanic+0x17e/frame 0xfffffe2f4351b320
panic() at panic+0x43/frame 0xfffffe2f4351b380
trap_fatal() at trap_fatal+0x369/frame 0xfffffe2f4351b3d0
trap_pfault() at trap_pfault+0x62/frame 0xfffffe2f4351b420
trap() at trap+0x2b3/frame 0xfffffe2f4351b530
calltrap() at calltrap+0x8/frame 0xfffffe2f4351b530
--- trap 0xc, rip = 0xffffffff80c84700, rsp = 0xfffffe2f4351b600, rbp = 0xfffffe2f4351b650 ---
in6_selecthlim() at in6_selecthlim+0x20/frame 0xfffffe2f4351b650
sctp_lowlevel_chunk_output() at sctp_lowlevel_chunk_output+0xeb2/frame 0xfffffe2f4351b790
sctp_chunk_output() at sctp_chunk_output+0x68c/frame 0xfffffe2f4351c110
sctp_timeout_handler() at sctp_timeout_handler+0x2d8/frame 0xfffffe2f4351c180
softclock_call_cc() at softclock_call_cc+0x15b/frame 0xfffffe2f4351c230
softclock() at softclock+0x7c/frame 0xfffffe2f4351c260
intr_event_execute_handlers() at intr_event_execute_handlers+0x9a/frame 0xfffffe2f4351c2a0
ithread_loop() at ithread_loop+0xb7/frame 0xfffffe2f4351c2f0
fork_exit() at fork_exit+0x84/frame 0xfffffe2f4351c330
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe2f4351c330
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic


From what I've gathered so far, it appears that the issue is a
use-after-free where the SCTP stack gets an ifp pointer that's no longer
valid. We've reproduced this issue on multiple iflib-based drivers,
including ixl and the recently published ice driver code (available on
Phabricator).

Additionally, we cannot reproduce it on legacy-stack drivers for ixl, or
on a Mellanox 100G board we have. This leads me to believe that it's an
issue in iflib rather than in the specific device drivers.

I am not sure exactly what's going wrong here... anyone have suggestions?
I thought it might be an issue of when ether_ifdetach is called. That
function is supposed to clear all of the pre-existing routes from the
route entry list. I'm thinking maybe somehow a route gets added after
ether_ifdetach is called.
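
To make that suspected sequence concrete, here is a minimal userspace
sketch in C. It is a toy model and not kernel code: every toy_* name is
hypothetical, the table is just an array, and toy_ifdetach only stands in
for the route purge I expect ether_ifdetach to perform. It shows how a
route installed after the purge would still hold the ifp when the driver
later frees it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; none of these names exist in the FreeBSD kernel. */
#define TOY_MAX_ROUTES 8

struct toy_ifnet {
	int detached;		/* set once the purge for this ifp has run */
};

struct toy_route {
	struct toy_ifnet *ifp;	/* cached interface pointer; NULL = free slot */
};

static struct toy_route toy_table[TOY_MAX_ROUTES];

/* Install a route that caches ifp. Returns 0 on success, -1 if full. */
int
toy_route_add(struct toy_ifnet *ifp)
{
	for (size_t i = 0; i < TOY_MAX_ROUTES; i++) {
		if (toy_table[i].ifp == NULL) {
			toy_table[i].ifp = ifp;
			return (0);
		}
	}
	return (-1);
}

/* Count the routes that still point at ifp. */
int
toy_route_refs(const struct toy_ifnet *ifp)
{
	int n = 0;

	for (size_t i = 0; i < TOY_MAX_ROUTES; i++) {
		if (toy_table[i].ifp == ifp)
			n++;
	}
	return (n);
}

/* Assumed stand-in for the purge done at detach: drop routes using ifp. */
void
toy_ifdetach(struct toy_ifnet *ifp)
{
	for (size_t i = 0; i < TOY_MAX_ROUTES; i++) {
		if (toy_table[i].ifp == ifp)
			toy_table[i].ifp = NULL;
	}
	ifp->detached = 1;
}

/* Replay the suspected sequence; returns the number of stale routes. */
int
toy_demo(void)
{
	static struct toy_ifnet ifp;

	/* Start from an empty table and an attached interface. */
	for (size_t i = 0; i < TOY_MAX_ROUTES; i++)
		toy_table[i].ifp = NULL;
	ifp.detached = 0;

	(void)toy_route_add(&ifp);		/* route exists before detach */
	toy_ifdetach(&ifp);			/* purge removes it */
	assert(toy_route_refs(&ifp) == 0);

	(void)toy_route_add(&ifp);		/* late add, after the purge */

	/*
	 * A real driver would free the ifp at this point; any route
	 * counted here would then be left holding a dangling pointer.
	 */
	return (toy_route_refs(&ifp));
}
```

Calling toy_demo() from a small main() reports one leftover reference
after the late add. The model never frees or dereferences the stale
pointer, so it only demonstrates the ordering problem, not the page fault
itself.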

In the iflib_device_deregister function, ether_ifdetach is called just
after iflib_stop (which would call a device's if_stop routine), and then
the task queues are shut down, a driver's ifdi_detach handler is called,
and the ifp is freed at the end. In the ixl legacy driver, ether_ifdetach
is called prior to the stop routine. However, in the mlx5 driver, it's
called after a call to close_locked()...

So I'm really not sure exactly what could cause a stale ifp pointer to get
into the route entry list.

Thanks,
Jake


