Date:      Fri, 31 Aug 2018 19:45:41 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 231064] data abort in in_pcbremlbgrouphash() on ThunderX
Message-ID:  <bug-231064-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231064

            Bug ID: 231064
           Summary: data abort in in_pcbremlbgrouphash() on ThunderX
           Product: Base System
           Version: CURRENT
          Hardware: arm64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: markj@FreeBSD.org

I'm testing -ALPHA3 on a packet.net ThunderX.  When I boot GENERIC-NODEBUG, the
kernel panics right about the time it gets to the login prompt:

(kgdb) bt
#0  doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:366
#1  0xffff00000018f520 in db_dump (dummy=-281474967580032, dummy2=false,
dummy3=-1, dummy4=0xffff00014d3cdb4c "") at /usr/src/sys/ddb/db_command.c:574
#2  0xffff00000018f298 in db_command (last_cmdp=0xffff000001018258
<db_last_command>, cmd_table=0x0, dopager=1) at
/usr/src/sys/ddb/db_command.c:481
#3  0xffff00000018edc8 in db_command_loop () at
/usr/src/sys/ddb/db_command.c:534
#4  0xffff0000001951e0 in db_trap (type=37, code=0) at
/usr/src/sys/ddb/db_main.c:252
#5  0xffff0000007050c0 in kdb_trap (type=37, code=0, tf=0xffff00014d3ce1e0) at
/usr/src/sys/kern/subr_kdb.c:693
#6  0xffff000000c8bec8 in data_abort (td=0xfffffd006112f000,
frame=0xffff00014d3ce1e0, esr=2516582404, far=16777259, lower=0)
    at /usr/src/sys/arm64/arm64/trap.c:261
#7  0xffff000000c8b858 in do_el1h_sync (td=0xfffffd006112f000,
frame=0xffff00014d3ce1e0) at /usr/src/sys/arm64/arm64/trap.c:341
#8  <signal handler called>
#9  0xffff0000008b5280 in in_pcbremlbgrouphash (inp=0xfffffd00e975a9b0) at
/usr/src/sys/netinet/in_pcb.c:414
#10 0xffff0000008b504c in in_pcbdrop (inp=0xfffffd00e975a9b0) at
/usr/src/sys/netinet/in_pcb.c:1687
#11 0xffff0000009d4eb4 in tcp_close (tp=0xfffffd00e975d3d0) at
/usr/src/sys/netinet/tcp_subr.c:1991
#12 0xffff0000009c13c0 in tcp_do_segment (m=0xfffffd0049dfe100,
th=0xfffffd0049e6b0a8, so=0xfffffd007bbfd000, tp=0xfffffd00e975d3d0,
drop_hdrlen=52,
    tlen=31, iptos=0 '\000') at /usr/src/sys/netinet/tcp_input.c:2306
#13 0xffff0000009be02c in tcp_input (mp=0xffff00014d3ceff8,
offp=0xffff00014d3cefd0, proto=6) at /usr/src/sys/netinet/tcp_input.c:1392
#14 0xffff0000008c203c in ip_input (m=0x0) at
/usr/src/sys/netinet/ip_input.c:827
#15 0xffff000000877330 in netisr_dispatch_src (proto=1, source=0,
m=0xfffffd0049dfe100) at /usr/src/sys/net/netisr.c:1122
#16 0xffff000000877ac4 in netisr_dispatch (proto=1, m=0xfffffd0049dfe100) at
/usr/src/sys/net/netisr.c:1213
#17 0xffff0000008468a0 in ether_demux (ifp=0xfffffd0049a02000,
m=0xfffffd0049dfe100) at /usr/src/sys/net/if_ethersubr.c:874
#18 0xffff000000848fbc in ether_input_internal (ifp=0xfffffd0049a02000,
m=0xfffffd0049dfe100) at /usr/src/sys/net/if_ethersubr.c:662
#19 0xffff0000008487e0 in ether_nh_input (m=0xfffffd0049dfe100) at
/usr/src/sys/net/if_ethersubr.c:692
#20 0xffff000000877330 in netisr_dispatch_src (proto=5, source=0,
m=0xfffffd0049dfe100) at /usr/src/sys/net/netisr.c:1122
#21 0xffff000000877ac4 in netisr_dispatch (proto=5, m=0xfffffd0049dfe100) at
/usr/src/sys/net/netisr.c:1213
#22 0xffff000000847100 in ether_input (ifp=0xfffffd00498e4800,
m=0xfffffd0049dfe100) at /usr/src/sys/net/if_ethersubr.c:782
#23 0xffff0000009c5d6c in tcp_lro_flush (lc=0xffff000149546788,
le=0xfffffd000ae25bf0) at /usr/src/sys/netinet/tcp_lro.c:397
#24 0xffff0000009c6c78 in tcp_lro_rx2 (lc=0xffff000149546788,
m=0xfffffd0049dfe000, csum=56586, use_hash=1) at
/usr/src/sys/netinet/tcp_lro.c:785
#25 0xffff0000009c7414 in tcp_lro_rx (lc=0xffff000149546788,
m=0xfffffd0049dfe000, csum=0) at /usr/src/sys/netinet/tcp_lro.c:952
#26 0xffff000000ce1b80 in nicvf_rcv_pkt_handler (nic=0xfffffd00330d1000,
cq=0xffff000149547480, cqe_rx=0xffff00016f402800, cqe_type=2)
    at /usr/src/sys/dev/vnic/nicvf_queues.c:678
#27 0xffff000000ce181c in nicvf_cq_intr_handler (nic=0xfffffd00330d1000,
cq_idx=4 '\004') at /usr/src/sys/dev/vnic/nicvf_queues.c:774
#28 0xffff000000ce1424 in nicvf_cmp_task (arg=0xffff000149547480, pending=1) at
/usr/src/sys/dev/vnic/nicvf_queues.c:887
#29 0xffff00000072817c in taskqueue_run_locked (queue=0xfffffd004b261800) at
/usr/src/sys/kern/subr_taskqueue.c:465
#30 0xffff00000072a304 in taskqueue_thread_loop (arg=0xffff000149547500) at
/usr/src/sys/kern/subr_taskqueue.c:757
#31 0xffff00000061d680 in fork_exit (callout=0xffff00000072a1a4
<taskqueue_thread_loop>, arg=0xffff000149547500, frame=0xffff00014d3cf960)
    at /usr/src/sys/kern/kern_fork.c:1057
#32 <signal handler called>

Interestingly, the panic does not occur under GENERIC.  It does occur if I
recompile GENERIC-NODEBUG with -O0, so I'm able to get a usable kernel dump.
Clearly "grp" is a bogus pointer, but it's not clear where it comes from:

(kgdb) frame 9
#9  0xffff0000008b5280 in in_pcbremlbgrouphash (inp=0xfffffd00e975a9b0) at
/usr/src/sys/netinet/in_pcb.c:414
414                     for (i = 0; i < grp->il_inpcnt; ++i) {
(kgdb) info local
pcbinfo = 0xffff0000e9851820
hdr = 0xffff000148a3bbb0
grp = 0xffffff
i = 0
(kgdb) p *hdr
$1 = {lh_first = 0x0}
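
As a cross-check on the trap arguments in frame #6, here is a minimal sketch
(not part of the original report) that decodes esr and far using the Armv8-A
ESR_ELx field layout.  The helper names esr_ec(), esr_il(), esr_dfsc(),
fault_offset(), check_report_values() and the REPORT_* macros are made up for
illustration; only the numeric values come from the backtrace and the kgdb
locals above.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from frame #6 (data_abort) and the kgdb locals in frame #9. */
#define REPORT_ESR UINT32_C(2516582404) /* 0x96000004 */
#define REPORT_FAR UINT64_C(16777259)   /* 0x0100002b */
#define REPORT_GRP UINT64_C(0xffffff)

/*
 * ESR_ELx layout per the Armv8-A architecture:
 *   EC  = bits [31:26]  exception class
 *   IL  = bit  25       instruction length (1 = 32-bit instruction)
 *   ISS = bits [24:0]   for data aborts, DFSC = ISS[5:0]
 */
static inline uint32_t esr_ec(uint32_t esr)   { return (esr >> 26) & 0x3f; }
static inline uint32_t esr_il(uint32_t esr)   { return (esr >> 25) & 0x1; }
static inline uint32_t esr_dfsc(uint32_t esr) { return esr & 0x3f; }

/* Distance between the faulting address and the pointer being dereferenced. */
static inline uint64_t fault_offset(uint64_t far, uint64_t base)
{
    return far - base;
}

/* Hypothetical self-check: asserts what the reported values decode to. */
void check_report_values(void)
{
    assert(esr_ec(REPORT_ESR) == 0x25);   /* data abort, same exception level */
    assert(esr_il(REPORT_ESR) == 1);
    assert(esr_dfsc(REPORT_ESR) == 0x04); /* translation fault, level 0 */
    assert(fault_offset(REPORT_FAR, REPORT_GRP) == 0x2c);
}
```

With these values, EC is 0x25 (matching type=37 in frames #4 and #5), DFSC is
0x04 (translation fault, level 0), and far lies 0x2c bytes past grp.  That is
what a load of a structure member through grp would look like, though the
member offset has not been checked against the struct layout here.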

-- 
You are receiving this mail because:
You are the assignee for the bug.


