Date: Sun, 10 Jun 2018 05:45:52 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 228858] panic when delivering knote to a process who has opened a kqueue() is dying
Message-ID: <bug-228858-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228858

Bug ID: 228858
Summary: panic when delivering knote to a process who has opened a kqueue() is dying
Product: Base System
Version: 11.0-STABLE
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: bugs@FreeBSD.org
Reporter: siddharthtuli@gmail.com

The following race can occur on multi-processor systems, resulting in this panic.

-- Race --

cpu x                                       cpu y
process X dies                              Process Y is sending knote() to kqueue() opened by X
kqueue_close                                knote (due to exec, exit, fork etc of any process)
    kqueue_drain
        KQ_LOCK(kq) << Acquired the lock    KQ_LOCK(kq) << sleep and loop in __mtx_lock_sleep
        ….
        KQ_UNLOCK(kq)
kqueue_destroy
    mtx_destroy(&kq->kq_lock);
        set MTX_UNOWNED|MTX_CONTESTED
                                            __mtx_lock_sleep()
                                            Panic because no owner and (MTX_UNOWNED|MTX_CONTESTED) are set
free(kq, M_KQUEUE)

Process X is listening for NOTE_EXIT|NOTE_EXEC|NOTE_TRACK|NOTE_TRACKERR on all running processes. When process X dies, the kernel closes its kqueue descriptor, and the kq_lock is destroyed, which leaves its lock word set to MTX_UNOWNED|MTX_CONTESTED.

Due to the race above, another thread that is trying to deliver a knote() to process X can panic in __mtx_lock_sleep(): it finds that the lock word is not exactly MTX_UNOWNED, so it tries to dereference the owner of the lock, which is NULL. Other APIs such as knote_fork() could also run into this problem.
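The arithmetic behind the NULL owner can be reproduced in user space. The following is only a minimal sketch, not kernel code: the flag values are assumed to match sys/mutex.h on stable/11 (the report itself only confirms that a lock word of 6 is MTX_UNOWNED|MTX_CONTESTED), the 0x3b0 offset of td_state is taken from the kgdb session in this report, and the three helper functions are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Assumed flag values, modeled on sys/mutex.h in stable/11. The report only
 * confirms that mtx_lock == 6 corresponds to MTX_UNOWNED|MTX_CONTESTED. */
#define MTX_RECURSED    ((uintptr_t)0x1)
#define MTX_CONTESTED   ((uintptr_t)0x2)
#define MTX_UNOWNED     ((uintptr_t)0x4)
#define MTX_FLAGMASK    (MTX_RECURSED | MTX_CONTESTED | MTX_UNOWNED)

/* Offset of td_state inside struct thread, as printed by kgdb in this report. */
#define TD_STATE_OFFSET ((uintptr_t)0x3b0)

/* Hypothetical helper: mirrors the "if (v != MTX_UNOWNED)" test at
 * kern_mutex.c:434. Returns 1 when the owner would be inspected. */
int
mtx_would_inspect_owner(uintptr_t v)
{
    return (v != MTX_UNOWNED);
}

/* Hypothetical helper: mirrors "owner = (struct thread *)(v & ~MTX_FLAGMASK)"
 * at kern_mutex.c:435, returning the pointer bits as an integer. */
uintptr_t
mtx_owner_bits(uintptr_t v)
{
    return (v & ~MTX_FLAGMASK);
}

/* Hypothetical helper: address that TD_IS_RUNNING(owner) would read,
 * i.e. the owner pointer plus the offset of td_state. */
uintptr_t
td_state_fault_address(uintptr_t v)
{
    return (mtx_owner_bits(v) + TD_STATE_OFFSET);
}
```

With the lock word from the core dump (6), mtx_would_inspect_owner() returns 1, mtx_owner_bits() returns 0, and td_state_fault_address() returns 0x3b0, which matches the fault virtual address in the panic message. A lock word of exactly MTX_UNOWNED would skip the owner check altogether.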
<2>fault virtual address   = 0x3b0
<2>fault code              = supervisor read data, page not present
<2>instruction pointer     = 0x20:0xffffffff804169f6
<2>stack pointer           = 0x28:0xfffffe011f1db9a0
<2>frame pointer           = 0x28:0xfffffe011f1dba10
<2>code segment            = base 0x0, limit 0xfffff, type 0x1b
<2>                        = DPL 0, pres 1, long 1, def32 0, gran 1
<2>processor eflags        = interrupt enabled, resume, IOPL = 0
<2>current process         = 9199 (rcp)
<2>trap number             = 12
<2>panic: page fault

(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:221
#1  doadump (textdump=1) at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_shutdown.c:313
#2  0xffffffff8042b93f in kern_reboot (howto=260)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_shutdown.c:381
#3  0xffffffff8042be9f in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_shutdown.c:792
#4  0xffffffff8042bee3 in panic (fmt=<unavailable>)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_shutdown.c:705
#5  0xffffffff80572b51 in trap_fatal (frame=<optimized out>, eva=<optimized out>)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/amd64/trap.c:841
#6  0xffffffff80572d44 in trap_pfault (frame=0xfffffe011f1db8f0, usermode=<optimized out>)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/amd64/trap.c:691
#7  0xffffffff805724dc in trap (frame=0xfffffe011f1db8f0)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/amd64/trap.c:442
#8  0xffffffff8055a661 in calltrap ()
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/amd64/exception.S:236
#9  0xffffffff804169f6 in __mtx_lock_sleep (c=0xfffff8006ec00d18, tid=18446735282413282656, opts=0, file=0x0, line=96)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_mutex.c:435
#10 0xffffffff803f5175 in knote (list=0xfffff80089f6c5c0, hint=2147483648, lockflags=<optimized out>)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_event.c:2047
#11 0xffffffff803fa46a in exit1 (td=0xfffff8011de8a560, rval=<optimized out>, signo=0)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_exit.c:515
#12 0xffffffff803f9b7d in sys_sys_exit (td=0xfffff8006ec00d00, uap=<optimized out>)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_exit.c:178
#13 0xffffffff8058297e in syscallenter (td=0xfffff8011de8a560, sa=<optimized out>)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/ia32/../../kern/subr_syscall.c:146
#14 ia32_syscall (frame=0xfffffe011f1dbc00)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/ia32/ia32_syscall.c:187
#15 0xffffffff8055ac45 in Xint0x80_syscall ()
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/amd64/ia32/ia32_exception.S:73
#16 0x00000000c83791bb in ?? ()

(kgdb) fr 9
#9  0xffffffff804169f6 in __mtx_lock_sleep (c=0xfffff8006ec00d18, tid=18446735282413282656, opts=0, file=0x0, line=96)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/kern_mutex.c:435
 433                v = m->mtx_lock;
 434                if (v != MTX_UNOWNED) {
 435                        owner = (struct thread *)(v & ~MTX_FLAGMASK);
 436                        if (TD_IS_RUNNING(owner)) { <==== Owner is NULL. Trying to access offset 0x3b0 resulting in page fault

#define TD_IS_RUNNING(td)       ((td)->td_state == TDS_RUNNING)

(kgdb)  p &(*(struct thread*)0)->td_state
$7 = (enum {...} *) 0x3b0
(kgdb) p c
$2 = (volatile uintptr_t *) 0xfffff8006ec00d18
(kgdb) pt struct mtx
type = struct mtx {
    struct lock_object lock_object;
    volatile uintptr_t mtx_lock;
}
(kgdb) p &(*(struct mtx*)0)->mtx_lock
$4 = (volatile uintptr_t *) 0x18
(kgdb) p *(struct mtx*)(0xfffff8006ec00d18-0x18)
$6 = {
  lock_object = {
    lo_name = 0xffffffff805fdf06 "kqueue",
    lo_flags = 21102592,
    lo_data = 0,
    lo_witness = 0x0
  },
  mtx_lock = 6 <=== MTX_UNOWNED|MTX_CONTESTED
}

===> from “info threads”
  40   Thread 100237 (PID=5863: pmond) sched_switch (td=0xfffff8006e2a5560, newtd=<optimized out>, flags=<optimized out>)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/sched_ule.c:1979

(kgdb)   thread 40
[Switching to thread 40 (Thread 100237)]
#0  sched_switch (td=0xfffff8006e2a5560, newtd=<optimized out>, flags=<optimized out>)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/sched_ule.c:1979
1979    /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/sched_ule.c: No such file or directory.
(kgdb) bt
#0  sched_switch (td=0xfffff8006e2a5560, newtd=<optimized out>, flags=<optimized out>)
    at /.amd/svl-engdata1vs1/occamdev/build/freebsd/stable_11/20180413.165755_fbsd-builder_stable_11.0.dc8ec62/src/sys/kern/sched_ule.c:1979
#1  0x0000000000000000 in ?? ()
(kgdb) p td->td_proc->p_satete
There is no member named p_satete.
(kgdb) p td->td_proc->p_state
$13 = PRS_ZOMBIE
(kgdb)

-- 
You are receiving this mail because:
You are the assignee for the bug.