From owner-freebsd-bugs@freebsd.org Sun Jun 10 05:45:54 2018
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 228858] panic when delivering knote to a process who has opened a kqueue() is dying
Date: Sun, 10 Jun 2018 05:45:52 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228858

            Bug ID: 228858
           Summary: panic when delivering knote to a process who has opened
                    a kqueue() is dying
           Product: Base System
           Version: 11.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: siddharthtuli@gmail.com

The following race can occur on multi-processor systems resulting in this panic.

-- Race --

cpu x                                       cpu y
process X dies                              Process Y is sending knote() to
                                            kqueue() opened by X
kqueue_close                                knote (due to exec, exit, fork
                                            etc. of any process)
    kqueue_drain
        KQ_LOCK(kq) << Acquired the lock    KQ_LOCK(kq) << sleep and loop in
                                            __mtx_lock_sleep
        ....
        KQ_UNLOCK(kq)
kqueue_destroy
    mtx_destroy(&kq->kq_lock);
        set MTX_UNOWNED|MTX_CONTESTED
                                            __mtx_lock_sleep()
                                            Panic because no owner and
                                            (MTX_UNOWNED|MTX_CONTESTED) are set
free(kq, M_KQUEUE)

Process X is listening to NOTE_EXIT|NOTE_EXEC|NOTE_TRACK|NOTE_TRACKERR on all
the running processes. When process X dies, the kernel closes its kqueue
descriptor. The kq_lock is therefore destroyed, which leaves the lock word set
to MTX_UNOWNED|MTX_CONTESTED.

Due to the above race, the other thread that is trying to deliver a knote() to
process X can panic in __mtx_lock_sleep(): it finds that the lock word is not
exactly MTX_UNOWNED, so it tries to dereference the owner of the lock, which is
NULL. Other APIs such as knote_fork() could also run into this problem.

<2>fault virtual address   = 0x3b0
<2>fault code              = supervisor read data, page not present
<2>instruction pointer     = 0x20:0xffffffff804169f6
<2>stack pointer           = 0x28:0xfffffe011f1db9a0
<2>frame pointer           = 0x28:0xfffffe011f1dba10
<2>code segment            = base 0x0, limit 0xfffff, type 0x1b
<2>                        = DPL 0, pres 1, long 1, def32 0, gran 1
<2>processor eflags        = interrupt enabled, resume, IOPL = 0
<2>current process         = 9199 (rcp)
<2>trap number             = 12
<2>panic: page fault

(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:221
#1  doadump (textdump=1) at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_shutdown.c:313
#2  0xffffffff8042b93f in kern_reboot (howto=260)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_shutdown.c:381
#3  0xffffffff8042be9f in vpanic (fmt=, ap=)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_shutdown.c:792
#4  0xffffffff8042bee3 in panic (fmt=)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_shutdown.c:705
#5  0xffffffff80572b51 in trap_fatal (frame=, eva=<optimized out>)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/amd64/trap.c:841
#6  0xffffffff80572d44 in trap_pfault (frame=0xfffffe011f1db8f0, usermode=)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/amd64/trap.c:691
#7  0xffffffff805724dc in trap (frame=0xfffffe011f1db8f0)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/amd64/trap.c:442
#8  0xffffffff8055a661 in calltrap ()
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/amd64/exception.S:236
#9  0xffffffff804169f6 in __mtx_lock_sleep (c=0xfffff8006ec00d18, tid=18446735282413282656, opts=0, file=0x0, line=96)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_mutex.c:435
#10 0xffffffff803f5175 in knote (list=0xfffff80089f6c5c0, hint=2147483648, lockflags=)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_event.c:2047
#11 0xffffffff803fa46a in exit1 (td=0xfffff8011de8a560, rval=, signo=0)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_exit.c:515
#12 0xffffffff803f9b7d in sys_sys_exit (td=0xfffff8006ec00d00, uap=)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_exit.c:178
#13 0xffffffff8058297e in syscallenter (td=0xfffff8011de8a560, sa=)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/ia32/../../kern/subr_syscall.c:146
#14 ia32_syscall (frame=0xfffffe011f1dbc00)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/ia32/ia32_syscall.c:187
#15 0xffffffff8055ac45 in Xint0x80_syscall ()
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/ia32/ia32_exception.S:73
#16 0x00000000c83791bb in ?? ()
(kgdb) fr 9
#9  0xffffffff804169f6 in __mtx_lock_sleep (c=0xfffff8006ec00d18, tid=18446735282413282656, opts=0, file=0x0, line=96)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_mutex.c:435
 433                 v = m->mtx_lock;
 434                 if (v != MTX_UNOWNED) {
 435                         owner = (struct thread *)(v & ~MTX_FLAGMASK);
 436                         if (TD_IS_RUNNING(owner)) { <==== Owner is NULL. Trying to access offset 0x3b0 resulting in page fault

#define TD_IS_RUNNING(td)       ((td)->td_state == TDS_RUNNING)

(kgdb) p &(*(struct thread*)0)->td_state
$7 = (enum {...} *) 0x3b0
(kgdb) p c
$2 = (volatile uintptr_t *) 0xfffff8006ec00d18
(kgdb) pt struct mtx
type = struct mtx {
    struct lock_object lock_object;
    volatile uintptr_t mtx_lock;
}
(kgdb) p &(*(struct mtx*)0)->mtx_lock
$4 = (volatile uintptr_t *) 0x18
(kgdb) p *(struct mtx*)(0xfffff8006ec00d18-0x18)
$6 = {
  lock_object = {
    lo_name = 0xffffffff805fdf06 "kqueue",
    lo_flags = 21102592,
    lo_data = 0,
    lo_witness = 0x0
  },
  mtx_lock = 6 <=== MTX_UNOWNED|MTX_CONTESTED
}

===> from “info threads”
  40   Thread 100237 (PID=5863: pmond) sched_switch (td=0xfffff8006e2a5560, newtd=, flags=)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/sched_ule.c:1979
(kgdb) thread 40
[Switching to thread 40 (Thread 100237)]
#0  sched_switch (td=0xfffff8006e2a5560, newtd=, flags=)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/sched_ule.c:1979
1979    /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/sched_ule.c: No such file or directory.
(kgdb) bt
#0  sched_switch (td=0xfffff8006e2a5560, newtd=, flags=)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/sched_ule.c:1979
#1  0x0000000000000000 in ?? ()
(kgdb) p td->td_proc->p_satete
There is no member named p_satete.
(kgdb) p td->td_proc->p_state
$13 = PRS_ZOMBIE
(kgdb)

-- 
You are receiving this mail because:
You are the assignee for the bug.