From owner-freebsd-bugs@freebsd.org Sat Jun 17 08:44:42 2017
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 220076] [patch] [panic] [netgraph] repeatable kernel panic due to a race in ng_iface(4)
Date: Sat, 17 Jun 2017 08:44:42 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220076

            Bug ID: 220076
           Summary: [patch] [panic] [netgraph] repeatable kernel panic due
                    to a race in ng_iface(4)
           Product: Base System
           Version: 11.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Keywords: patch
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: eugen@freebsd.org

Created attachment 183566
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=183566&action=edit
protect ng_iface private data

I observe repeatable panics at the netgraph level while stress-testing the
net/mpd5 daemon under stable/11 r317184. The test connects, uses and
disconnects many ngXX interfaces and the corresponding netgraph nodes and
hooks.

The crash dump points to the ng_iface node, whose private data (its set of
hooks) may be modified in response to a userland request while another kernel
thread is sending data over the hook being disconnected.

Here is a scenario:

1. mpd runs its BundNcpsLeave() procedure for an interface, calling
NgSendMsg(csock, path, NGM_GENERIC_COOKIE, NGM_RMHOOK, &rm, sizeof(rm)), which
leads to libnetgraph's NgDeliverMsg() and a sendto() system call for
AF_NETGRAPH. The kernel responds with
ng_findhook->ng_destroy_hook->NG_HOOK_UNREF
(_NG_HOOK_UNREF/ng_unref_hook)->NG_FREE_HOOK: free((hook), M_NETGRAPH_HOOK).

2. In parallel, a userland process such as ftpd sends some data over an IPv4
socket to the corresponding interface, which is up and running.
It may use a hook that is being freed at the same time by another kernel
thread, which leads to:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff8097f249
stack pointer           = 0x28:0xfffffe0239542ec0
frame pointer           = 0x28:0xfffffe0239542f00
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 28999 (ftpd)
trap number             = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2c/frame 0xfffffe0239542840
kdb_backtrace() at kdb_backtrace+0x53/frame 0xfffffe0239542910
vpanic() at vpanic+0x249/frame 0xfffffe02395429e0
kproc_shutdown() at kproc_shutdown/frame 0xfffffe0239542a40
trap_fatal() at trap_fatal+0x60a/frame 0xfffffe0239542b70
trap() at trap+0x97c/frame 0xfffffe0239542dd0
trap_check() at trap_check+0x15/frame 0xfffffe0239542df0
calltrap() at calltrap+0x8/frame 0xfffffe0239542df0
--- trap 0x9, rip = 0xffffffff8097f249, rsp = 0xfffffe0239542ec0, rbp = 0xfffffe0239542f00 ---
ng_address_hook() at ng_address_hook+0x59/frame 0xfffffe0239542f00
ng_iface_send() at ng_iface_send+0x108/frame 0xfffffe0239542f90
ng_iface_output() at ng_iface_output+0x447/frame 0xfffffe0239543060
ip_output() at ip_output+0x1864/frame 0xfffffe0239543300
tcp_output() at tcp_output+0x2602/frame 0xfffffe02395436a0
tcp_disconnect() at tcp_disconnect+0x18e/frame 0xfffffe02395436e0
tcp_usr_disconnect() at tcp_usr_disconnect+0xe6/frame 0xfffffe0239543710
sodisconnect() at sodisconnect+0x62/frame 0xfffffe0239543740
soclose() at soclose+0x95/frame 0xfffffe02395437b0
soo_close() at soo_close+0x4d/frame 0xfffffe02395437e0
fo_close() at fo_close+0x31/frame 0xfffffe0239543810
_fdrop() at _fdrop+0x46/frame 0xfffffe0239543840
closef() at closef+0x2d7/frame 0xfffffe02395438f0
closefp() at closefp+0xde/frame 0xfffffe0239543940
kern_close() at kern_close+0xe7/frame 0xfffffe0239543990
sys_close() at sys_close+0x1f/frame 0xfffffe02395439b0
syscallenter() at syscallenter+0x4ff/frame 0xfffffe0239543a80
amd64_syscall() at amd64_syscall+0x2a/frame 0xfffffe0239543bb0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0239543bb0
--- syscall (6, FreeBSD ELF64, sys_close), rip = 0x801a4033a, rsp = 0x7fffffffd0a8, rbp = 0x7fffffffd0d0 ---
Uptime: 2h11m5s
Dumping 544 out of 8156 MB:..3%..12%..21%..33%..42%..53%..62%..71%..83%..92%

Reading symbols from /boot/modules/geom_mirror.ko...done.
Loaded symbols for /boot/modules/geom_mirror.ko
Reading symbols from /boot/modules/accf_http.ko...done.
Loaded symbols for /boot/modules/accf_http.ko
Reading symbols from /boot/modules/nvidia.ko...done.
Loaded symbols for /boot/modules/nvidia.ko
Reading symbols from /boot/modules/vboxdrv.ko...done.
Loaded symbols for /boot/modules/vboxdrv.ko
Reading symbols from /boot/modules/mmc.ko...done.
Loaded symbols for /boot/modules/mmc.ko
Reading symbols from /boot/modules/mmcsd.ko...done.
Loaded symbols for /boot/modules/mmcsd.ko
Reading symbols from /boot/modules/sdhci.ko...done.
Loaded symbols for /boot/modules/sdhci.ko
Reading symbols from /boot/modules/h_ertt.ko...done.
Loaded symbols for /boot/modules/h_ertt.ko
Reading symbols from /boot/modules/cc_chd.ko...done.
Loaded symbols for /boot/modules/cc_chd.ko
Reading symbols from /boot/modules/geom_sched.ko...done.
Loaded symbols for /boot/modules/geom_sched.ko
Reading symbols from /boot/modules/gsched_rr.ko...done.
Loaded symbols for /boot/modules/gsched_rr.ko
Reading symbols from /boot/modules/vboxnetflt.ko...done.
Loaded symbols for /boot/modules/vboxnetflt.ko
Reading symbols from /boot/modules/vboxnetadp.ko...done.
Loaded symbols for /boot/modules/vboxnetadp.ko
Reading symbols from /boot/modules/nullfs.ko...done.
Loaded symbols for /boot/modules/nullfs.ko
Reading symbols from /usr/local/modules/rtc.ko...done.
Loaded symbols for /usr/local/modules/rtc.ko
#0  doadump (textdump=1) at /data2/src/sys/kern/kern_shutdown.c:298
298             dumptid = curthread->td_tid;
(kgdb) bt
#0  doadump (textdump=1) at /data2/src/sys/kern/kern_shutdown.c:298
#1  0xffffffff807a0828 in kern_reboot (howto=260) at /data2/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff807a125f in vpanic (fmt=0xffffffff80cf5311 "%s", ap=0xfffffe0239542a20)
    at /data2/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff807a12d0 in panic (fmt=0xffffffff80cf5311 "%s") at /data2/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff80c06a0a in trap_fatal (frame=0xfffffe0239542e00, eva=0) at /data2/src/sys/amd64/amd64/trap.c:801
#5  0xffffffff80c0604c in trap (frame=0xfffffe0239542e00) at /data2/src/sys/amd64/amd64/trap.c:549
#6  0xffffffff80c07085 in trap_check (frame=0xfffffe0239542e00) at /data2/src/sys/amd64/amd64/trap.c:602
#7  0xffffffff80bdeba3 in calltrap () at /data2/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff8097f249 in ng_address_hook (here=0x0, item=0xfffff8011ddd1f00, hook=0xfffff801f8232300, retaddr=0)
    at /data2/src/sys/netgraph/ng_base.c:3586
#9  0xffffffff80986548 in ng_iface_send (ifp=0xfffff801f8ef2800, m=0xfffff801f8526100, sa=2 '\002')
    at /data2/src/sys/netgraph/ng_iface.c:451
#10 0xffffffff80985c97 in ng_iface_output (ifp=0xfffff801f8ef2800, m=0xfffff801f8526100, dst=0xfffff8011db98720,
    ro=0xfffff8011db98700) at /data2/src/sys/netgraph/ng_iface.c:386
#11 0xffffffff809cae14 in ip_output (m=0xfffff801f8526100, opt=0x0, ro=0xfffff8011db98700, flags=0, imo=0x0,
    inp=0xfffff8011db98570) at /data2/src/sys/netinet/ip_output.c:655
#12 0xffffffff809e08d2 in tcp_output (tp=0xfffff801f8258410) at /data2/src/sys/netinet/tcp_output.c:1446
#13 0xffffffff809f5f4e in tcp_disconnect (tp=0xfffff801f8258410) at /data2/src/sys/netinet/tcp_usrreq.c:1946
#14 0xffffffff809f29a6 in tcp_usr_disconnect (so=0xfffff8011d8de360) at /data2/src/sys/netinet/tcp_usrreq.c:674
#15 0xffffffff80884072 in sodisconnect (so=0xfffff8011d8de360) at /data2/src/sys/kern/uipc_socket.c:1051
#16 0xffffffff808839b5 in soclose (so=0xfffff8011d8de360) at /data2/src/sys/kern/uipc_socket.c:869
#17 0xffffffff8084d67d in soo_close (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000)
    at /data2/src/sys/kern/sys_socket.c:334
#18 0xffffffff8072bee1 in fo_close (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000) at file.h:346
#19 0xffffffff80726b86 in _fdrop (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000)
    at /data2/src/sys/kern/kern_descrip.c:2849
#20 0xffffffff8072b1f7 in closef (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000)
    at /data2/src/sys/kern/kern_descrip.c:2430
#21 0xffffffff8072768e in closefp (fdp=0xfffff80007104000, fd=6, fp=0xfffff8011df23b40, td=0xfffff801f8c5d000,
    holdleaders=0) at /data2/src/sys/kern/kern_descrip.c:1191
#22 0xffffffff80728417 in kern_close (td=0xfffff801f8c5d000, fd=6) at /data2/src/sys/kern/kern_descrip.c:1239
#23 0xffffffff8072831f in sys_close (td=0xfffff801f8c5d000, uap=0xfffffe0239543b58)
    at /data2/src/sys/kern/kern_descrip.c:1218
#24 0xffffffff80c07b7f in syscallenter (td=0xfffff801f8c5d000, sa=0xfffffe0239543b48) at subr_syscall.c:135
#25 0xffffffff80c0741a in amd64_syscall (td=0xfffff801f8c5d000, traced=0) at /data2/src/sys/amd64/amd64/trap.c:902
#26 0xffffffff80bdee8b in Xfast_syscall () at /data2/src/sys/amd64/amd64/exception.S:396
#27 0x0000000801a4033a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb) frame 8
#8  0xffffffff8097f249 in ng_address_hook (here=0x0, item=0xfffff8011ddd1f00, hook=0xfffff801f8232300, retaddr=0)
    at /data2/src/sys/netgraph/ng_base.c:3586
3586                NG_HOOK_NOT_VALID(peer = NG_HOOK_PEER(hook)) ||
(kgdb) l
3581             * that the peer is still connected (even if invalid,) we know
3582             * that the peer node is present, though maybe invalid.
3583             */
3584            TOPOLOGY_RLOCK();
3585            if ((hook == NULL) || NG_HOOK_NOT_VALID(hook) ||
3586                NG_HOOK_NOT_VALID(peer = NG_HOOK_PEER(hook)) ||
3587                NG_NODE_NOT_VALID(peernode = NG_PEER_NODE(hook))) {
3588                    NG_FREE_ITEM(item);
3589                    TRAP_ERROR();
3590                    TOPOLOGY_RUNLOCK();
(kgdb) p *hook
$1 = {
  hk_name = 0xfffff801f8232300 "\336\300\255\336\336\300\255\336...P\257\003\201\377\377\377\377\336\300\255\336\336\300\255\336"...,
  hk_private = 0xdeadc0dedeadc0de, hk_flags = -559038242, hk_type = -559038242, hk_peer = 0xdeadc0dedeadc0de,
  hk_node = 0xdeadc0dedeadc0de, hk_hooks = {le_next = 0xdeadc0dedeadc0de, le_prev = 0xdeadc0dedeadc0de},
  hk_rcvmsg = 0xdeadc0dedeadc0de, hk_rcvdata = 0xdeadc0dedeadc0de, hk_refs = -559038242}
(kgdb)

The attached patch introduces a per-node rwlock for ng_iface that protects its
private data from being used while it is being modified. Without the patch, my
stress test for mpd produces this panic within a short time. With the patch
applied, the test ran for over 11 hours non-stop without a panic.

-- 
You are receiving this mail because:
You are the assignee for the bug.