Date: Mon, 18 Jun 2018 11:36:38 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 229106] intr_event_handle is unsafe with respect to interrupt handler list
Message-ID: <bug-229106-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229106

            Bug ID: 229106
           Summary: intr_event_handle is unsafe with respect to interrupt
                    handler list
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: avg@FreeBSD.org

Created attachment 194354
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=194354&action=edit
code changes to greatly increase likelihood of the race

I must state upfront that I discovered the issue through code review and I had
to make special arrangements to provoke the problem.

The core of the issue is that intr_event_handle iterates the list of handlers,
ie_handlers, without any protection whatsoever.  Also, removal and installation
of a filter-only handler does not make any attempt to synchronize with
intr_event_handle.

As such, it is possible (although very improbable) that intr_event_handle may
iterate into an element just before it is removed and dereference its pointer
to the next element after the former element is freed and the pointer is
overwritten.

This problem affects only shared interrupts.  When an interrupt is not shared,
it should be disabled before its handler is torn down.
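
To make the window described above concrete, here is a minimal user-space
sketch that replays the interleaving deterministically in a single thread.  It
is not kernel code: struct handler, struct event, event_add, event_remove,
race_replay and POISON are all hypothetical names invented for illustration,
and the list is a simplified singly linked list rather than the real
ie_handlers.  The poison value of -1 is an assumption, chosen because it
matches the fault address in the panic.  The removed element is poisoned in
place instead of being freed, so the sketch itself stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a handler entry; not the kernel's structure. */
struct handler {
	struct handler *next;
	int (*filter)(void *arg);
	void *arg;
};

/* Hypothetical stand-in for an interrupt event with its handler list. */
struct event {
	struct handler *head;
};

/* Assumed poison value, mimicking what a list-trashing debug option stores. */
#define POISON ((struct handler *)(uintptr_t)-1)

static int
dummy_filter(void *arg)
{
	(void)arg;
	return (0);
}

/* Insert a handler at the head of the list. */
static void
event_add(struct event *ev, struct handler *h)
{
	h->next = ev->head;
	ev->head = h;
}

/* Unlink a handler and poison its link, with no regard for any walker. */
static void
event_remove(struct event *ev, struct handler *victim)
{
	struct handler **pp = &ev->head;

	while (*pp != NULL && *pp != victim)
		pp = &(*pp)->next;
	if (*pp != NULL) {
		*pp = victim->next;
		victim->next = POISON;
	}
}

/*
 * Replay the race: the walker stands on element b, the remover unlinks b,
 * then the walker loads b's stale link.  Returns 1 if the walker would
 * follow the poisoned pointer on its next step.
 */
int
race_replay(void)
{
	struct handler a = { NULL, dummy_filter, NULL };
	struct handler b = { NULL, dummy_filter, NULL };
	struct handler c = { NULL, dummy_filter, NULL };
	struct event ev = { NULL };
	struct handler *cur, *next;

	event_add(&ev, &c);
	event_add(&ev, &b);
	event_add(&ev, &a);		/* list is now a -> b -> c */

	cur = ev.head->next;		/* walker has advanced to b */
	assert(cur == &b);
	(void)cur->filter(cur->arg);	/* walker runs b's filter */

	event_remove(&ev, &b);		/* "concurrent" detach of b */
	assert(a.next == &c);		/* the list itself is consistent */

	next = cur->next;		/* walker reads the stale link */
	return (next == POISON);
}
```

The list is perfectly consistent after the removal; the problem is only the
walker's private pointer into the element that has gone away.  In the kernel
the walker would go on to dereference that pointer, which is where the page
fault comes from.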
Here is a stack trace of the crash:

fault virtual address   = 0xffffffffffffffff
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80b64ff0
stack pointer           = 0x28:0xfffffe0000434970
frame pointer           = 0x28:0xfffffe00004349b0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu2)
trap number             = 12
panic: page fault
cpuid = 2
time = 1529319165
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0000434630
vpanic() at vpanic+0x1a3/frame 0xfffffe0000434690
panic() at panic+0x43/frame 0xfffffe00004346f0
trap_fatal() at trap_fatal+0x35f/frame 0xfffffe0000434740
trap_pfault() at trap_pfault+0x62/frame 0xfffffe0000434790
trap() at trap+0x2ba/frame 0xfffffe00004348a0
calltrap() at calltrap+0x8/frame 0xfffffe00004348a0
--- trap 0xc, rip = 0xffffffff80b64ff0, rsp = 0xfffffe0000434970, rbp = 0xfffffe00004349b0 ---
intr_event_handle() at intr_event_handle+0xa0/frame 0xfffffe00004349b0
intr_execute_handlers() at intr_execute_handlers+0x58/frame 0xfffffe00004349e0
lapic_handle_intr() at lapic_handle_intr+0x6d/frame 0xfffffe0000434a20
Xapic_isr1() at Xapic_isr1+0xd0/frame 0xfffffe0000434a20
--- interrupt, rip = 0xffffffff80bd3b49, rsp = 0xfffffe0000434af0, rbp = 0xfffffe0000434bb0 ---
sched_idletd() at sched_idletd+0x4a9/frame 0xfffffe0000434bb0
fork_exit() at fork_exit+0x84/frame 0xfffffe0000434bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000434bf0

This is what I did to get the crash.

1. Found hardware with shared interrupts.  Specifically, I had three USB OHCI
controllers sharing PCI IRQ 18.
2. Modified the ohci driver, so that it installed a dummy filter instead of its
ithread handler.  This made the driver non-functional, of course.
3. Modified IO-APIC code, so that it kept re-raising the interrupt thus
increasing the chances of getting the race within a reasonable time frame.
4. Re-compiled kern_intr.c with QUEUE_MACRO_DEBUG_TRASH to make the race more
probable by immediately corrupting a removed handler.
5. Triggered the interrupt storm for IRQ 18.
6. Ran a continuous loop of devctl detach followed by devctl attach for ohci
driver instances sharing the interrupt.

All the code modifications are in the attachment.

The devctl command line was:
    while true ; do devctl detach ohci3 && devctl attach pci0:0:19:1 ; devctl detach ohci4 && devctl attach pci0:0:20:5 ; done

The rate of interrupts was about 570K per second:
569k ohci2 ohci

The stack trace in kgdb:
(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:231
#1  doadump (textdump=1) at /usr/devel/svn/head/sys/kern/kern_shutdown.c:366
#2  0xffffffff80ba33e2 in kern_reboot (howto=260)
    at /usr/devel/svn/head/sys/kern/kern_shutdown.c:446
#3  0xffffffff80ba39c3 in vpanic (fmt=<optimized out>, ap=0xfffffe00004346d0)
    at /usr/devel/svn/head/sys/kern/kern_shutdown.c:863
#4  0xffffffff80ba3a13 in panic (fmt=<unavailable>)
    at /usr/devel/svn/head/sys/kern/kern_shutdown.c:790
#5  0xffffffff8107c6ff in trap_fatal (frame=0xfffffe00004348b0, eva=18446744073709551615)
    at /usr/devel/svn/head/sys/amd64/amd64/trap.c:892
#6  0xffffffff8107c772 in trap_pfault (frame=0xfffffe00004348b0, usermode=<optimized out>)
    at /usr/devel/svn/head/sys/amd64/amd64/trap.c:728
#7  0xffffffff8107bd7a in trap (frame=0xfffffe00004348b0)
    at /usr/devel/svn/head/sys/amd64/amd64/trap.c:427
#8  <signal handler called>
#9  intr_event_handle (ie=0xfffff80003349300, frame=0xfffffe0000434a30)
    at /usr/devel/svn/head/sys/kern/kern_intr.c:1180
#10 0xffffffff811f2118 in intr_execute_handlers (isrc=0xfffff800033845b0, frame=0xfffffe0000434a30)
    at /usr/devel/svn/head/sys/x86/x86/intr_machdep.c:285
#11 0xffffffff811f841d in lapic_handle_intr (vector=49, frame=0xfffffe0000434a30)
    at /usr/devel/svn/head/sys/x86/x86/local_apic.c:1270
#12 <signal handler called>
#13 sched_idletd (dummy=<optimized out>)
    at /usr/devel/svn/head/sys/kern/sched_ule.c:2803
#14 0xffffffff80b62204 in fork_exit (callout=0xffffffff80bd36a0 <sched_idletd>, arg=0x0, frame=0xfffffe0000434c00)
    at /usr/devel/svn/head/sys/kern/kern_fork.c:1039

-- 
You are receiving this mail because:
You are the assignee for the bug.