From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 229106] intr_event_handle is unsafe with respect to interrupt handler list
Date: Mon, 18 Jun 2018 11:36:38 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229106

            Bug ID: 229106
           Summary: intr_event_handle is unsafe with respect to interrupt
                    handler list
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: avg@FreeBSD.org

Created attachment 194354
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=194354&action=edit
code changes to greatly increase likelihood of the race

I must state upfront that I discovered the issue through code review and I had to make special arrangements to provoke
the problem.

The core of the issue is that intr_event_handle iterates the list of handlers, ie_handlers, without any protection whatsoever. Also, removal and installation of a filter-only handler do not make any attempt to synchronize with intr_event_handle.
As such, it is possible (although very improbable) that intr_event_handle may reach an element just before it is removed and then dereference its pointer to the next element after the removed element has been freed and the pointer overwritten.
This problem affects only shared interrupts. When an interrupt is not shared, it should be disabled before its handler is torn down.

Here is a stack trace of the crash:
fault virtual address   = 0xffffffffffffffff
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80b64ff0
stack pointer           = 0x28:0xfffffe0000434970
frame pointer           = 0x28:0xfffffe00004349b0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu2)
trap number             = 12
panic: page fault
cpuid = 2
time = 1529319165
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0000434630
vpanic() at vpanic+0x1a3/frame 0xfffffe0000434690
panic() at panic+0x43/frame 0xfffffe00004346f0
trap_fatal() at trap_fatal+0x35f/frame 0xfffffe0000434740
trap_pfault() at trap_pfault+0x62/frame 0xfffffe0000434790
trap() at trap+0x2ba/frame 0xfffffe00004348a0
calltrap() at calltrap+0x8/frame 0xfffffe00004348a0
--- trap 0xc, rip = 0xffffffff80b64ff0, rsp = 0xfffffe0000434970, rbp = 0xfffffe00004349b0 ---
intr_event_handle() at intr_event_handle+0xa0/frame 0xfffffe00004349b0
intr_execute_handlers() at intr_execute_handlers+0x58/frame 0xfffffe00004349e0
lapic_handle_intr() at lapic_handle_intr+0x6d/frame 0xfffffe0000434a20
Xapic_isr1() at Xapic_isr1+0xd0/frame 0xfffffe0000434a20
--- interrupt, rip = 0xffffffff80bd3b49,
rsp = 0xfffffe0000434af0, rbp = 0xfffffe0000434bb0 ---
sched_idletd() at sched_idletd+0x4a9/frame 0xfffffe0000434bb0
fork_exit() at fork_exit+0x84/frame 0xfffffe0000434bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000434bf0

This is what I did to get the crash.
1. Found hardware with shared interrupts. Specifically, I had three USB OHCI controllers sharing PCI IRQ 18.
2. Modified the ohci driver, so that it installed a dummy filter instead of its ithread handler. This made the driver non-functional, of course.
3. Modified IO-APIC code, so that it kept re-raising the interrupt thus increasing the chances of getting the race within a reasonable time frame.
4. Re-compiled kern_intr.c with QUEUE_MACRO_DEBUG_TRASH to make the race more probable by immediately corrupting a removed handler.
5. Triggered the interrupt storm for IRQ 18.
6. Ran a continuous loop of devctl detach followed by devctl attach for ohci driver instances sharing the interrupt.

All the code modifications are in the attachment.
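
As a rough illustration of the race (this is not the attached patch), the sequence can be modelled deterministically in user space. The sketch below assumes a simple singly linked list and an all-ones poison value similar to what QUEUE_MACRO_DEBUG_TRASH is understood to store. The names struct handler, TRASHED, remove_and_trash and simulate_unsafe_walk are hypothetical and do not appear in kern_intr.c. The "concurrent" removal is simulated inline at the moment the walker sits on the victim element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a handler list entry; not struct intr_handler. */
struct handler {
	int id;
	struct handler *next;
};

/* Assumed poison value: an all-ones pointer, as a debug-trash build might store. */
#define TRASHED ((struct handler *)(uintptr_t)-1)

/* Unlink 'victim' from the list and poison its link, as a remover would. */
static void
remove_and_trash(struct handler **head, struct handler *victim)
{
	struct handler **pp = head;

	while (*pp != victim) {
		assert(*pp != NULL);	/* victim must be on the list */
		pp = &(*pp)->next;
	}
	*pp = victim->next;
	victim->next = TRASHED;
}

/*
 * Walk the list with no protection.  If remove_during_walk is non-zero,
 * the element the walker is standing on is removed before the walker
 * reads its next pointer.  Returns 1 if the walker then sees a poisoned
 * next pointer (which unprotected code would go on to dereference),
 * or 0 if the walk completes cleanly.
 */
int
simulate_unsafe_walk(int remove_during_walk)
{
	struct handler c = { 3, NULL };
	struct handler b = { 2, &c };
	struct handler a = { 1, &b };
	struct handler *head = &a;
	struct handler *ih;

	for (ih = head; ih != NULL; ih = ih->next) {
		if (remove_during_walk && ih == &b)
			remove_and_trash(&head, &b);	/* "other CPU" detaches b */
		if (ih->next == TRASHED)
			return (1);	/* next step would fault on the poison address */
	}
	return (0);
}
```

In this model the walker ends up holding a pointer whose next field is all ones, which would be consistent with the fault virtual address of 0xffffffffffffffff in the trace above, under the stated assumption about the poison value.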
The devctl command line was:
while true ; do devctl detach ohci3 && devctl attach pci0:0:19:1 ; devctl detach ohci4 && devctl attach pci0:0:20:5 ; done

The rate of interrupts was about 570K per second:
569k ohci2 ohci

The stack trace in kgdb:
(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:231
#1  doadump (textdump=1) at /usr/devel/svn/head/sys/kern/kern_shutdown.c:366
#2  0xffffffff80ba33e2 in kern_reboot (howto=260) at /usr/devel/svn/head/sys/kern/kern_shutdown.c:446
#3  0xffffffff80ba39c3 in vpanic (fmt=, ap=0xfffffe00004346d0) at /usr/devel/svn/head/sys/kern/kern_shutdown.c:863
#4  0xffffffff80ba3a13 in panic (fmt=) at /usr/devel/svn/head/sys/kern/kern_shutdown.c:790
#5  0xffffffff8107c6ff in trap_fatal (frame=0xfffffe00004348b0, eva=18446744073709551615) at /usr/devel/svn/head/sys/amd64/amd64/trap.c:892
#6  0xffffffff8107c772 in trap_pfault (frame=0xfffffe00004348b0, usermode=) at /usr/devel/svn/head/sys/amd64/amd64/trap.c:728
#7  0xffffffff8107bd7a in trap (frame=0xfffffe00004348b0) at /usr/devel/svn/head/sys/amd64/amd64/trap.c:427
#8
#9  intr_event_handle (ie=0xfffff80003349300, frame=0xfffffe0000434a30) at /usr/devel/svn/head/sys/kern/kern_intr.c:1180
#10 0xffffffff811f2118 in intr_execute_handlers (isrc=0xfffff800033845b0, frame=0xfffffe0000434a30) at /usr/devel/svn/head/sys/x86/x86/intr_machdep.c:285
#11 0xffffffff811f841d in lapic_handle_intr (vector=49, frame=0xfffffe0000434a30) at /usr/devel/svn/head/sys/x86/x86/local_apic.c:1270
#12
#13 sched_idletd (dummy=) at /usr/devel/svn/head/sys/kern/sched_ule.c:2803
#14 0xffffffff80b62204 in fork_exit (callout=0xffffffff80bd36a0 , arg=0x0, frame=0xfffffe0000434c00) at /usr/devel/svn/head/sys/kern/kern_fork.c:1039