Date:      Wed, 08 Jan 2025 20:21:10 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 283938] panic due to race between ether_output() and ng_ether_detach()
Message-ID:  <bug-283938-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=283938

            Bug ID: 283938
           Summary: panic due to race between ether_output() and
                    ng_ether_detach()
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: chs@FreeBSD.org

while running the FreeBSD kyua test suite in a loop, I hit this panic:

Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 04
fault virtual address   = 0x30
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff83a5ca9f
stack pointer           = 0x28:0xfffffe02f90df790
frame pointer           = 0x28:0xfffffe02f90df7b0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (netlink_socket (PID)
rdi: fffff805a7b0c000 rsi: fffffe02f90df7c8 rdx: 9c58ffffffffffff
rcx: fffffe02f90df8c0  r8: 0608504e00fc9c58  r9: fffff811b4eff060
rax: 0000000000000000 rbx: fffff805a7b0c000 rbp: fffffe02f90df7b0
r10: 0000000000000000 r11: 0000000082e34944 r12: 000000000000000e
r13: 0000000000000000 r14: fffffe02f90df8c0 r15: fffff805a7b0c000
trap number             = 12
panic: page fault
cpuid = 4
time = 1733106093
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe02f90df480
vpanic() at vpanic+0x136/frame 0xfffffe02f90df5b0
panic() at panic+0x43/frame 0xfffffe02f90df610
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe02f90df670
trap_pfault() at trap_pfault+0x46/frame 0xfffffe02f90df6c0
calltrap() at calltrap+0x8/frame 0xfffffe02f90df6c0
--- trap 0xc, rip = 0xffffffff83a5ca9f, rsp = 0xfffffe02f90df790, rbp = 0xfffffe02f90df7b0 ---
ng_ether_output() at ng_ether_output+0xf/frame 0xfffffe02f90df7b0
ether_output() at ether_output+0x68d/frame 0xfffffe02f90df840
arprequest_internal() at arprequest_internal+0x394/frame 0xfffffe02f90df950
arp_ifinit() at arp_ifinit+0x6a/frame 0xfffffe02f90df9b0
ether_ioctl() at ether_ioctl+0x169/frame 0xfffffe02f90df9f0
tunifioctl() at tunifioctl+0x275/frame 0xfffffe02f90dfa30
in_control_ioctl() at in_control_ioctl+0xaff/frame 0xfffffe02f90dfad0
rtnl_handle_addr() at rtnl_handle_addr+0x412/frame 0xfffffe02f90dfcd0
rtnl_handle_message() at rtnl_handle_message+0x195/frame 0xfffffe02f90dfd30
nl_taskqueue_handler() at nl_taskqueue_handler+0x48e/frame 0xfffffe02f90dfe40
taskqueue_run_locked() at taskqueue_run_locked+0x182/frame 0xfffffe02f90dfec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xc2/frame 0xfffffe02f90dfef0
fork_exit() at fork_exit+0x7b/frame 0xfffffe02f90dff30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe02f90dff30
--- trap 0, rip = 0, rsp = 0x3fc2a3a098be, rbp = 0x4 ---
KDB: enter: panic
[ thread pid 0 tid 812581 ]
Stopped at      kdb_enter+0x33: movq    $0,0x10510c2(%rip)


kgdb showed:
(kgdb) up
#8  ng_ether_output (ifp=0xfffff805a7b0c000, mp=0xfffffe02f90df7c8)
    at /usr/src/sys/netgraph/ng_ether.c:284
284             const priv_p priv = NG_NODE_PRIVATE(node);
(kgdb) p node
$3 = (const node_p) 0x0
(kgdb) p ifp
$4 = (struct ifnet *) 0xfffff805a7b0c000
(kgdb) p ifp->if_l2com
$7 = (void *) 0x0
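
The fault address of 0x30 combined with node being NULL is consistent with a read of a structure member through a NULL pointer, in which case the faulting address is simply the member's offset. A minimal sketch of that arithmetic follows. It assumes a hypothetical struct fake_node whose layout was chosen so the member lands at offset 0x30; it is not the real struct ng_node, and the helper name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT the real struct ng_node: 0x30 bytes of other
 * fields followed by a private-data pointer.
 */
struct fake_node {
	char	other_fields[0x30];
	void	*private_data;
};

/* The member offset is fixed at compile time. */
static_assert(offsetof(struct fake_node, private_data) == 0x30,
    "private_data is expected at offset 0x30 in this sketch");

/* Address the CPU would try to read when evaluating base->private_data. */
static uintptr_t
private_field_address(uintptr_t base)
{
	return (base + offsetof(struct fake_node, private_data));
}
```

With a base of 0 (a NULL node) the computed address equals the member offset, which is how a small non-zero fault address can still point at a NULL dereference.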


the code in ether_output() is

        /* Handle ng_ether(4) processing, if any */
        if (ifp->if_l2com != NULL) {
                KASSERT(ng_ether_output_p != NULL,
                    ("ng_ether_output_p is NULL"));
                if ((error = (*ng_ether_output_p)(ifp, &m)) != 0) {


and then ng_ether_output() does

#define IFP2NG(ifp)  ((ifp)->if_l2com)

static int
ng_ether_output(struct ifnet *ifp, struct mbuf **mp)
{
        const node_p node = IFP2NG(ifp);
        const priv_p priv = NG_NODE_PRIVATE(node);


so ifp->if_l2com must have been non-NULL a few instructions earlier or we
wouldn't have called ng_ether_output(), but then it was NULL by the time that
ng_ether_output() looked at it.  this looks like a race between ether_output()
and something that cleared ifp->if_l2com, and the only functions that clear
ifp->if_l2com are ng_ether_detach() and ng_ether_shutdown().  and indeed,
ng_ether_detach() has a comment noting this race:

        IFP2NG(ifp) = NULL;
        priv->ifp = NULL;       /* XXX race if interrupted an output packet */
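
The check-then-reread pattern described above can be sketched in userland C. This is only an illustration under stated assumptions: struct fake_ifnet, struct fake_node, fake_detach() and both output functions are hypothetical stand-ins, not kernel code, and the "concurrent" detach is simulated deterministically with a callback rather than a second thread. The second function shows one possible way to avoid the NULL dereference (loading the pointer once); it is not presented as the fix for this bug.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct ifnet or netgraph node. */
struct fake_node { int priv; };
struct fake_ifnet { _Atomic(struct fake_node *) l2com; };

/* Plays the role of a detach that runs between the check and the use. */
static void
fake_detach(struct fake_ifnet *ifp)
{
	atomic_store(&ifp->l2com, NULL);
}

typedef void (*interleave_fn)(struct fake_ifnet *);

/*
 * Double-read pattern: the pointer is checked, then loaded a second time.
 * Returns -1 where the kernel code would have dereferenced NULL.
 */
static int
output_double_read(struct fake_ifnet *ifp, interleave_fn between)
{
	assert(ifp != NULL);
	if (atomic_load(&ifp->l2com) == NULL)
		return (0);		/* nothing attached */
	if (between != NULL)
		between(ifp);		/* simulated concurrent detach */
	struct fake_node *node = atomic_load(&ifp->l2com); /* second read */
	return (node == NULL ? -1 : node->priv);
}

/*
 * Single-snapshot pattern: load once, then use only the local copy.
 * This avoids the NULL dereference but does NOT keep the node alive;
 * real code would also need some form of lifetime protection.
 */
static int
output_snapshot(struct fake_ifnet *ifp, interleave_fn between)
{
	assert(ifp != NULL);
	struct fake_node *node = atomic_load(&ifp->l2com); /* only read */
	if (node == NULL)
		return (0);
	if (between != NULL)
		between(ifp);		/* detach no longer changes 'node' */
	return (node->priv);
}
```

In this sketch the node is never freed, so the snapshot version is safe here; in the kernel, a detach could also free the node, which is why a single read alone would not be a complete solution.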


I can reproduce this panic in less than a day by running the entire kyua test
suite in a loop, but it would probably trigger much more quickly if the
non-network parts of the tests were skipped.  skipping the "sys/netgraph" tests
and just running the rest of the test suite in a loop results in this crash not
being triggered.



