Date:      Sun, 15 Sep 2019 05:04:03 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 240589] r352110 would make graphics/drm-current-kmod trigger assertion for i915 devices
Message-ID:  <bug-240589-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240589

            Bug ID: 240589
           Summary: r352110 would make graphics/drm-current-kmod trigger
                    assertion for i915 devices
           Product: Base System
           Version: CURRENT
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: delphij@FreeBSD.org

(I don't think I have fully understood the problem yet; this bug mainly
serves as a memo documenting what we have done so far and what we already
know. Thanks to markj@ for hints on how to get useful debugging
information.)

===
Set up kernel crash dump for DRM

Set the following sysctls:
debug.debugger_on_panic=0
dev.drm.skip_ddb=1

and configure a dump device.

===

After setting this up, I was able to get a kernel crash dump, with the
following backtrace:

[drm:gen8_init_common_ring] Execlists enabled for rcs0
[drm:init_workarounds_ring] rcs0: Number of context specific w/a: 15
[drm:gen8_init_common_ring] Execlists enabled for bcs0
[drm:gen8_init_common_ring] Execlists enabled for vcs0
[drm:gen8_init_common_ring] Execlists enabled for vecs0
panic: vm_page_wire: page 0xfffffe000c2da0a8 does not belong to an object
cpuid = 7
time = 1568513321
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfffffe00e4b12700
vpanic() at vpanic+0x19d/frame 0xfffffe00e4b12750
panic() at panic+0x43/frame 0xfffffe00e4b127b0
vm_page_wire() at vm_page_wire+0x9a/frame 0xfffffe00e4b127d0
gen8_ppgtt_cleanup() at gen8_ppgtt_cleanup+0xaf/frame 0xfffffe00e4b12810
i915_ppgtt_release() at i915_ppgtt_release+0x52/frame 0xfffffe00e4b12830
i915_gem_context_free() at i915_gem_context_free+0x1e0/frame
0xfffffe00e4b12850
contexts_free_worker() at contexts_free_worker+0x8d/frame 0xfffffe00e4b12880
linux_work_fn() at linux_work_fn+0xe7/frame 0xfffffe00e4b128e0
taskqueue_run_locked() at taskqueue_run_locked+0x10c/frame
0xfffffe00e4b12940
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame
0xfffffe00e4b12970
fork_exit() at fork_exit+0x84/frame 0xfffffe00e4b129b0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e4b129b0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 2m58s

And the assertion was triggered here:

(kgdb) up
#1  0xffffffff80bd2830 in kern_reboot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:479
479                     doadump(TRUE);
Current language:  auto; currently minimal
(kgdb) up
#2  0xffffffff80bd2ca9 in vpanic (fmt=<value optimized out>, ap=<value
optimized out>) at /usr/src/sys/kern/kern_shutdown.c:908
908             kern_reboot(bootopt);
(kgdb) up
#3  0xffffffff80bd29e3 in panic (fmt=<value optimized out>) at
/usr/src/sys/kern/kern_shutdown.c:835
835             vpanic(fmt, ap);
(kgdb) up
#4  0xffffffff80f25c5a in vm_page_wire (m=<value optimized out>) at
src/sys/amd64/include/counter.h:85
85              __asm __volatile("addq\t%1,%%gs:(%0)"
(kgdb) up
#5  0xffffffff84db8d2f in gen8_ppgtt_cleanup (vm=0xfffffe015261a000) at
src/sys/compat/linuxkpi/common/include/linux/mm.h:230
230             vm_page_wire(page);
(kgdb) up
#6  0xffffffff84db4812 in i915_ppgtt_release (kref=<value optimized
out>) at
/usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/i915/i915_gem_gtt.c:2271
warning: Source file is more recent than executable.

2271            ppgtt->base.cleanup(&ppgtt->base);
(kgdb) list
2266            /* vmas should already be unbound and destroyed */
2267            WARN_ON(!list_empty(&ppgtt->base.active_list));
2268            WARN_ON(!list_empty(&ppgtt->base.inactive_list));
2269            WARN_ON(!list_empty(&ppgtt->base.unbound_list));
2270
2271            ppgtt->base.cleanup(&ppgtt->base);
2272            i915_address_space_fini(&ppgtt->base);
2273            kfree(ppgtt);
2274    }
2275

===

So basically, in r352110, vm_page_wire() was modified to require that the
page belong to a VM object, and the requirement is enforced with an
assertion.

The Linux get_page() API basically does the same thing, wiring the page,
but it is not yet clear to me whether we can always assume that the page is
already mapped (in FreeBSD's terms).

A quick hack is to replace the vm_page_wire(page) call in
sys/compat/linuxkpi/common/include/linux/mm.h with a call to
vm_page_wire_mapped(page) plus an assertion that the call succeeded. With
that change, my laptop works again on CURRENT.
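
To make the idea behind the quick hack concrete, here is a small userland
toy model, not kernel code and not the actual patch. All toy_* names, the
struct layout, and the "mapped" flag are made up for illustration. The real
vm_page_wire() and vm_page_wire_mapped() have more involved semantics that
are not reproduced here. The toy only captures the two points from this
report: the plain wire path now insists on an owning object, and the
mapped variant can fail, so its result has to be checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct vm_page; NOT the real kernel layout. */
struct toy_page {
	void *object;		/* owning VM object, NULL if none */
	unsigned int wirings;	/* how many times the page was wired */
	bool mapped;		/* hypothetical flag: page is currently mapped */
};

/*
 * Models the rule from r352110: wiring this way requires an owning object.
 * The real kernel asserts and panics; the toy just reports failure.
 */
static bool
toy_page_wire(struct toy_page *m)
{
	if (m->object == NULL)
		return (false);
	m->wirings++;
	return (true);
}

/*
 * Models the wire_mapped variant: no object check, but it may fail.
 * Assumption: failure is reduced here to "page is not mapped".
 */
static bool
toy_page_wire_mapped(struct toy_page *m)
{
	if (!m->mapped)
		return (false);
	m->wirings++;
	return (true);
}

/* The quick hack: get_page() asserts that the mapped wiring succeeded. */
static void
toy_get_page(struct toy_page *page)
{
	bool wired = toy_page_wire_mapped(page);

	assert(wired);
	(void)wired;	/* silence unused warning when NDEBUG is defined */
}
```

For a toy page with object == NULL and mapped == true, toy_page_wire()
refuses, while toy_get_page() succeeds and bumps the wiring count. Whether
the assertion in the real get_page() can ever fail is exactly the open
question noted above.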

-- 
You are receiving this mail because:
You are the assignee for the bug.


