Date:      Mon, 18 Jun 2018 11:36:38 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 229106] intr_event_handle is unsafe with respect to interrupt handler list
Message-ID:  <bug-229106-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229106

            Bug ID: 229106
           Summary: intr_event_handle is unsafe with respect to interrupt
                    handler list
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: avg@FreeBSD.org

Created attachment 194354
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=194354&action=edit
code changes to greatly increase likelihood of the race

I must state upfront that I discovered the issue through code review and I had
to make special arrangements to provoke the problem.
The core of the issue is that intr_event_handle iterates the list of handlers,
ie_handlers, without any protection whatsoever.  Also, removal and installation
of a filter-only handler do not make any attempt to synchronize with
intr_event_handle.

As such, it is possible (although very improbable) that intr_event_handle may
iterate into an element just before it is removed and dereference its pointer to
the next element after the former element is freed and the pointer is
overwritten.
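
The interleaving described above can be modelled in userland with a minimal
sketch. Everything in it is hypothetical and for illustration only: struct
handler, struct handler_list and the helper functions are stand-ins, not the
kernel's struct intr_handler or the queue(3) macros, and the all-ones poison
value is an assumption chosen to match the fault address 0xffffffffffffffff
reported in the crash. The removed element is deliberately not freed, so the
sketch stays well defined while still showing what the iterator reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a handler entry (not struct intr_handler). */
struct handler {
	struct handler *next;
	int (*filter)(void *arg);
	void *arg;
};

/* Hypothetical stand-in for the ie_handlers list head. */
struct handler_list {
	struct handler *first;
};

/* Assumed poison value written into the link of a removed element. */
#define TRASH_PTR ((struct handler *)(uintptr_t)-1)

static void
list_insert_head(struct handler_list *l, struct handler *h)
{
	h->next = l->first;
	l->first = h;
}

/* Unlink h and poison its link, as a debug-trashing remove would. */
static void
list_remove_trash(struct handler_list *l, struct handler *h)
{
	struct handler **pp = &l->first;

	while (*pp != NULL && *pp != h)
		pp = &(*pp)->next;
	assert(*pp == h);	/* h must be on the list */
	*pp = h->next;
	h->next = TRASH_PTR;
}

/*
 * Model one interleaving: the interrupt side has already loaded cur,
 * the detach side removes cur, and then the interrupt side fetches the
 * next element.  Returns the pointer the iterator would follow next.
 */
static struct handler *
racy_next_after_remove(struct handler_list *l, struct handler *cur)
{
	list_remove_trash(l, cur);	/* detach path runs here */
	return (cur->next);		/* iterator resumes on a stale element */
}
```

With a list a -> b -> c and the iterator positioned on b, calling
racy_next_after_remove() on b leaves the list itself consistent (a -> c) but
hands the iterator the poison value instead of c. Following that pointer is
the faulting access in this model.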

This problem exists only for shared interrupts. When an interrupt is not shared,
it should be disabled before its handler is torn down.

Here is a stack trace of the crash:
fault virtual address   = 0xffffffffffffffff
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80b64ff0
stack pointer           = 0x28:0xfffffe0000434970
frame pointer           = 0x28:0xfffffe00004349b0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu2)
trap number             = 12
panic: page fault
cpuid = 2
time = 1529319165
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0000434630
vpanic() at vpanic+0x1a3/frame 0xfffffe0000434690
panic() at panic+0x43/frame 0xfffffe00004346f0
trap_fatal() at trap_fatal+0x35f/frame 0xfffffe0000434740
trap_pfault() at trap_pfault+0x62/frame 0xfffffe0000434790
trap() at trap+0x2ba/frame 0xfffffe00004348a0
calltrap() at calltrap+0x8/frame 0xfffffe00004348a0
--- trap 0xc, rip = 0xffffffff80b64ff0, rsp = 0xfffffe0000434970, rbp =
0xfffffe00004349b0 ---
intr_event_handle() at intr_event_handle+0xa0/frame 0xfffffe00004349b0
intr_execute_handlers() at intr_execute_handlers+0x58/frame 0xfffffe00004349e0
lapic_handle_intr() at lapic_handle_intr+0x6d/frame 0xfffffe0000434a20
Xapic_isr1() at Xapic_isr1+0xd0/frame 0xfffffe0000434a20
--- interrupt, rip = 0xffffffff80bd3b49, rsp = 0xfffffe0000434af0, rbp =
0xfffffe0000434bb0 ---
sched_idletd() at sched_idletd+0x4a9/frame 0xfffffe0000434bb0
fork_exit() at fork_exit+0x84/frame 0xfffffe0000434bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000434bf0

This is what I did to get the crash.
1. Found hardware with shared interrupts. Specifically, I had three USB OHCI
controllers sharing PCI IRQ 18.
2. Modified the ohci driver, so that it installed a dummy filter instead of its
ithread handler.  This made the driver non-functional, of course.
3. Modified IO-APIC code, so that it kept re-raising the interrupt thus
increasing the chances of getting the race within a reasonable time frame.
4. Re-compiled kern_intr.c with QUEUE_MACRO_DEBUG_TRASH to make the race more
probable by immediately corrupting a removed handler.
5. Triggered the interrupt storm for IRQ 18.
6. Ran a continuous loop of devctl detach followed by devctl attach for ohci
driver instances sharing the interrupt.

All the code modifications are in the attachment.
The devctl command line was:
    while true ; do devctl detach ohci3 && devctl attach pci0:0:19:1 ; devctl detach ohci4 && devctl attach pci0:0:20:5 ; done

The rate of interrupts was about 570K per second:
    569k ohci2 ohci

The stack trace in kgdb:
(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:231
#1  doadump (textdump=1) at /usr/devel/svn/head/sys/kern/kern_shutdown.c:366
#2  0xffffffff80ba33e2 in kern_reboot (howto=260) at
/usr/devel/svn/head/sys/kern/kern_shutdown.c:446
#3  0xffffffff80ba39c3 in vpanic (fmt=<optimized out>, ap=0xfffffe00004346d0)
at /usr/devel/svn/head/sys/kern/kern_shutdown.c:863
#4  0xffffffff80ba3a13 in panic (fmt=<unavailable>) at
/usr/devel/svn/head/sys/kern/kern_shutdown.c:790
#5  0xffffffff8107c6ff in trap_fatal (frame=0xfffffe00004348b0,
eva=18446744073709551615) at /usr/devel/svn/head/sys/amd64/amd64/trap.c:892
#6  0xffffffff8107c772 in trap_pfault (frame=0xfffffe00004348b0,
usermode=<optimized out>) at /usr/devel/svn/head/sys/amd64/amd64/trap.c:728
#7  0xffffffff8107bd7a in trap (frame=0xfffffe00004348b0) at
/usr/devel/svn/head/sys/amd64/amd64/trap.c:427
#8  <signal handler called>
#9  intr_event_handle (ie=0xfffff80003349300, frame=0xfffffe0000434a30) at
/usr/devel/svn/head/sys/kern/kern_intr.c:1180
#10 0xffffffff811f2118 in intr_execute_handlers (isrc=0xfffff800033845b0,
frame=0xfffffe0000434a30) at /usr/devel/svn/head/sys/x86/x86/intr_machdep.c:285
#11 0xffffffff811f841d in lapic_handle_intr (vector=49,
frame=0xfffffe0000434a30) at /usr/devel/svn/head/sys/x86/x86/local_apic.c:1270
#12 <signal handler called>
#13 sched_idletd (dummy=<optimized out>) at
/usr/devel/svn/head/sys/kern/sched_ule.c:2803
#14 0xffffffff80b62204 in fork_exit (callout=0xffffffff80bd36a0 <sched_idletd>,
arg=0x0, frame=0xfffffe0000434c00) at
/usr/devel/svn/head/sys/kern/kern_fork.c:1039

-- 
You are receiving this mail because:
You are the assignee for the bug.