Date:      Sun, 24 May 2020 21:22:10 +0000
From:      bugzilla-noreply@freebsd.org
To:        net@FreeBSD.org
Subject:   [Bug 246706] [netgraph] kernel panic due to corrupted memory
Message-ID:  <bug-246706-7501@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246706

            Bug ID: 246706
           Summary: [netgraph] kernel panic due to corrupted memory
           Product: Base System
           Version: 11.3-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Keywords: panic
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: net@FreeBSD.org
          Reporter: eugen@freebsd.org
                CC: ae@FreeBSD.org, avg@FreeBSD.org, glebius@FreeBSD.org,
                    mav@FreeBSD.org, melifaro@FreeBSD.org

I run multiple routers on FreeBSD 11.3-STABLE/amd64 355108 with the net/mpd5
daemon, which dynamically creates and destroys ngXXX interfaces for multiple
PPPoE clients. The routers have ECC memory.

Since 11.1-RELEASE this setup had been rock stable for over 2 years, until
yesterday one of the routers panicked inside the NETGRAPH code. It produced a
usable crashdump, and I have kernel.debug.

The server sends its logs to a remote syslog collector, and the last line sent
before the panic was "Accepting PPPoE connection", produced by the
PppoeListenEvent() function of mpd5:
https://sourceforge.net/p/mpd/svn/2239/tree/trunk/src/pppoe.c#l1356

mpd5 then continued executing PppoeListenEvent(), but its attempt to create an
ng_tee(4) node and connect it to ng_pppoe(4) by sending an NGM_MKPEER message
resulted in a kernel panic. Note that the stock gdb 6.1.1 shows the backtrace
incorrectly, so I use gdb 9.1:

Reading symbols from /data/crash/PPPOE11/kernel.debug...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x40
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80624dc0
stack pointer           = 0x28:0xfffffe012499f6d0
frame pointer           = 0x28:0xfffffe012499f700
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 2576 (mpd5)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff802fda6b = db_trace_self_wrapper+0x2b/frame 0xfffffe012499f380
vpanic() at 0xffffffff804f5e2e = vpanic+0x17e/frame 0xfffffe012499f3e0
panic() at 0xffffffff804f5ca3 = panic+0x43/frame 0xfffffe012499f440
trap_pfault() at 0xffffffff80778540 = trap_pfault/frame 0xfffffe012499f490
trap_pfault() at 0xffffffff80778589 = trap_pfault+0x49/frame 0xfffffe012499f4f0
trap() at 0xffffffff80777c1d = trap+0x29d/frame 0xfffffe012499f600
calltrap() at 0xffffffff80758983 = calltrap+0x8/frame 0xfffffe012499f600
--- trap 0xc, rip = 0xffffffff80624dc0, rsp = 0xfffffe012499f6d0, rbp = 0xfffffe012499f700 ---
ng_add_hook() at 0xffffffff80624dc0 = ng_add_hook+0x20/frame 0xfffffe012499f700
ng_mkpeer() at 0xffffffff80624a0c = ng_mkpeer+0x6c/frame 0xfffffe012499f750
ng_apply_item() at 0xffffffff80622d7f = ng_apply_item+0x3ef/frame 0xfffffe012499f7d0
ng_snd_item() at 0xffffffff8062278e = ng_snd_item+0x17e/frame 0xfffffe012499f800
ngc_send() at 0xffffffff806329b3 = ngc_send+0x1a3/frame 0xfffffe012499f8a0
sosend_generic() at 0xffffffff805868ea = sosend_generic+0x4fa/frame 0xfffffe012499f950
kern_sendit() at 0xffffffff8058d246 = kern_sendit+0x286/frame 0xfffffe012499fa10
sendit() at 0xffffffff8058d591 = sendit+0x191/frame 0xfffffe012499fa70
sys_sendto() at 0xffffffff8058d3ed = sys_sendto+0x4d/frame 0xfffffe012499fac0
amd64_syscall() at 0xffffffff80778f18 = amd64_syscall+0x378/frame 0xfffffe012499fbf0
fast_syscall_common() at 0xffffffff80759290 = fast_syscall_common+0x101/frame 0xfffffe012499fbf0
--- syscall (133, FreeBSD ELF64, sys_sendto), rip = 0x80279378a, rsp = 0x7fffdfffda08, rbp = 0x7fffdfffda50 ---
Uptime: 64d17h37m40s
Dumping 457 out of 4073 MB:..4%..11%..22%..32%..43%..53%..64%..71%..81%..92%

__curthread () at ./machine/pcpu.h:234
234             __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:234
#1  doadump (textdump=1) at /home/src/sys/kern/kern_shutdown.c:320
#2  0xffffffff804f5a1d in kern_reboot (howto=260)
    at /home/src/sys/kern/kern_shutdown.c:388
#3  0xffffffff804f5e68 in vpanic (fmt=<optimized out>, ap=0xfffffe012499f420)
    at /home/src/sys/kern/kern_shutdown.c:784
#4  0xffffffff804f5ca3 in panic (fmt=<unavailable>)
    at /home/src/sys/kern/kern_shutdown.c:715
#5  0xffffffff80778540 in trap_fatal (frame=0xfffffe012499f610, eva=64)
    at /home/src/sys/amd64/amd64/trap.c:899
#6  0xffffffff80778589 in trap_pfault (frame=0xfffffe012499f610, usermode=0)
    at /home/src/sys/amd64/amd64/trap.c:744
#7  0xffffffff80777c1d in trap (frame=0xfffffe012499f610)
    at /home/src/sys/amd64/amd64/trap.c:438
#8  <signal handler called>
#9  0xffffffff80624dc0 in ng_findhook (node=0xfffff80092840600,
    name=0xfffff800921e9978 "left2right") at /home/src/sys/netgraph/ng_base.c:1128
#10 ng_add_hook (node=0xfffff80092840600, name=0xfffff800921e9978 "left2right",
    hookp=0xfffffe012499f728) at /home/src/sys/netgraph/ng_base.c:1073
#11 0xffffffff80624a0c in ng_mkpeer (node=0xfffff8004f15fe00, name=<optimized out>,
    name2=0xfffff800921e9978 "left2right", type=<optimized out>)
    at /home/src/sys/netgraph/ng_base.c:1555
#12 0xffffffff80622d7f in ng_generic_msg (here=0xfffff8004f15fe00, item=<optimized out>,
    lasthook=<optimized out>) at /home/src/sys/netgraph/ng_base.c:2537
#13 ng_apply_item (node=0xfffff8004f15fe00, item=0xfffff800423b5c00, rw=1)
    at /home/src/sys/netgraph/ng_base.c:2437
#14 0xffffffff8062278e in ng_snd_item (item=0xfffff800423b5c00, flags=0)
    at /home/src/sys/netgraph/ng_base.c:2320
#15 0xffffffff806329b3 in ngc_send (so=<optimized out>, flags=<optimized out>,
    m=0xfffff80006d01000, addr=<optimized out>, control=<optimized out>,
    td=<optimized out>) at /home/src/sys/netgraph/ng_socket.c:338
#16 0xffffffff805868ea in sosend_generic (so=0xfffff80006c0da38, addr=0xfffff8004f6da9f0,
    uio=0xfffffe012499f980, top=0xfffff80006d01000, control=<optimized out>,
    flags=<optimized out>, td=0xfffff8004f560000)
    at /home/src/sys/kern/uipc_socket.c:1360
#17 0xffffffff8058d246 in kern_sendit (td=<optimized out>, s=2, mp=<optimized out>, flags=0,
    control=0x0, segflg=UIO_USERSPACE) at /home/src/sys/kern/uipc_syscalls.c:884
#18 0xffffffff8058d591 in sendit (td=0xfffff8004f560000, s=2, mp=0xfffffe012499fa80, flags=-1)
    at /home/src/sys/kern/uipc_syscalls.c:804
#19 0xffffffff8058d3ed in sys_sendto (td=0xfffff80092840600, uap=<optimized out>)
    at /home/src/sys/kern/uipc_syscalls.c:935
#20 0xffffffff80778f18 in syscallenter (td=0xfffff8004f560000)
    at /home/src/sys/amd64/amd64/../../kern/subr_syscall.c:132
#21 amd64_syscall (td=0xfffff8004f560000, traced=0)
    at /home/src/sys/amd64/amd64/trap.c:1014
#22 <signal handler called>
#23 0x000000080279378a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfffda08

Note that the "node" structure appears to have been corrupted by the time of
the panic:

(kgdb) frame 12
#12 0xffffffff80622d7f in ng_generic_msg (here=0xfffff8004f15fe00, item=<optimized out>,
    lasthook=<optimized out>) at /home/src/sys/netgraph/ng_base.c:2537
2537                    error = ng_mkpeer(here, mkp->ourhook, mkp->peerhook, mkp->type);
(kgdb) p *mkp
$1 = {type = "l858", '\000' <repeats 27 times>,
  ourhook = "Ю-$O\000ЬЪЪ\000\000\000\000\000\000\000\000\000кj\222\000ЬЪЪ\000Ч\025O\000ЬЪЪ",
  peerhook = "\200]\a\222\000ЬЪЪюё\n\222\000ЬЪЪ", '\000' <repeats 15 times>}
(kgdb) frame 10
#10 ng_add_hook (node=0xfffff80092840600, name=0xfffff800921e9978 "left2right",
    hookp=0xfffffe012499f728) at /home/src/sys/netgraph/ng_base.c:1073
1073            if (ng_findhook(node, name) != NULL) {
(kgdb) p *node
$2 = {nd_name = '\000' <repeats 31 times>, nd_type = 0x0, nd_flags = 0, nd_numhooks = 0,
  nd_private = 0xfffff80092840600, nd_ID = 0, nd_hooks = {lh_first = 0x0}, nd_nodes = {
    le_next = 0x0, le_prev = 0x0}, nd_idnodes = {le_next = 0x0, le_prev = 0x0}, nd_input_queue = {
    q_flags = 0, q_flags2 = 0, q_mtx = {lock_object = {lo_name = 0x0, lo_flags = 0, lo_data = 0,
        lo_witness = 0x0}, mtx_lock = 0}, q_work = {stqe_next = 0x0}, queue = {stqh_first = 0x0,
      stqh_last = 0x0}}, nd_refs = 0, nd_vnet = 0x0}
(kgdb) frame 9
#9  0xffffffff80624dc0 in ng_findhook (node=0xfffff80092840600,
    name=0xfffff800921e9978 "left2right") at /home/src/sys/netgraph/ng_base.c:1128
1128            if (node->nd_type->findhook != NULL)
(kgdb) p node->nd_type
$3 = (struct ng_type *) 0x0
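
The fault virtual address 0x40 (eva=64 in frame #5) looks consistent with
reading the findhook member through this NULL nd_type pointer. The sketch
below is only a ctypes mock-up to check that arithmetic, not the kernel's
definition: the NgType class and its field order are an assumption based on
struct ng_type in sys/netgraph/netgraph.h, only the members up to findhook are
modelled, pointers are assumed to be 8 bytes as on amd64, and fault_address()
is a hypothetical helper.

```python
import ctypes


class NgType(ctypes.Structure):
    """Mock of the leading members of struct ng_type (assumed field order)."""

    _fields_ = [
        ("version", ctypes.c_uint32),     # followed by 4 bytes of padding on LP64
        ("name", ctypes.c_void_p),        # const char *
        ("mod_event", ctypes.c_void_p),   # function pointers modelled as void *
        ("constructor", ctypes.c_void_p),
        ("rcvmsg", ctypes.c_void_p),
        ("close", ctypes.c_void_p),
        ("shutdown", ctypes.c_void_p),
        ("newhook", ctypes.c_void_p),
        ("findhook", ctypes.c_void_p),
    ]


def fault_address(base, field):
    """Address touched when `field` is read through a struct pointer `base`."""
    return base + getattr(NgType, field).offset


if __name__ == "__main__":
    # node->nd_type was 0x0, so the read of ->findhook hits the bare offset.
    # With 8-byte pointers this prints 0x40, the reported fault address.
    print(hex(fault_address(0x0, "findhook")))
```

If the assumed layout is right, the page fault is a plain NULL dereference of
nd_type in ng_findhook(), which matches the zeroed node shown above.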

Compressed crashdump and kernel.debug files are available here (101MB in
total):
http://www.grosbein.net/freebsd/crash/20200524/

-- 
You are receiving this mail because:
You are the assignee for the bug.


