Date:      Tue, 11 Apr 2023 16:17:13 +0200
From:      Ulrich Spörlein <uqs@freebsd.org>
To:        Mathias Picker <Mathias.Picker@virtual-earth.de>
Cc:        Cy Schubert <Cy.Schubert@cschubert.com>, Shane Ambler <FreeBSD@shaneware.biz>,  FreeBSD-STABLE <freebsd-stable@freebsd.org>, stable@freebsd.org
Subject:   Re: -stable from today dumps core with drm-510-kmod and some graphical clients
Message-ID:  <CAJ9axoSeFLHcXoUir%2BYmnGtCuo8hy6E_D9Fm8gAPOFBAV_zsDg@mail.gmail.com>
In-Reply-To: <86o7oa1i6t.fsf@virtual-earth.de>
References:  <86o7og27eh.fsf@virtual-earth.de> <8b47d0a4-a8f1-1841-ee59-3949fe69cbd7@ShaneWare.Biz> <20230327210535.9ED5A1D7@slippy.cwsent.com> <044587F7-4BA9-4585-A789-F4B53E8D02A2@virtual-earth.de> <20230327145629.3b55eed8@slippy> <86o7oa1i6t.fsf@virtual-earth.de>

[-- Attachment #1 --]
On Thu, Mar 30, 2023 at 3:29 PM Mathias Picker <
Mathias.Picker@virtual-earth.de> wrote:

>
> Cy Schubert <Cy.Schubert@cschubert.com> writes:
>
> > On Mon, 27 Mar 2023 23:43:35 +0200
> > Mathias Picker <Mathias.Picker@virtual-earth.de> wrote:
> >
> >> On 27 March 2023 at 23:05:35 CEST, Cy Schubert
> >> <Cy.Schubert@cschubert.com> wrote:
> >> >In message
> >> ><8b47d0a4-a8f1-1841-ee59-3949fe69cbd7@ShaneWare.Biz>, Shane
> >> >Ambler writes:
> >> >> On 26/3/23 01:37, Mathias Picker wrote:
> >> >> >
> >> >> > Starting sddm works fine, starting my normal session
> >> >> > crashes or freezes
> >> >> > FreeBSD.
> >> >> >
> >> >> > I can find no error messages after a reboot.
> >> >> >
> >> >> > I found out, that I can start xterm or emacs (exwm)
> >> >> > without problems,
> >> >> > xrandr works with external screen, but once I start
> >> >> > anything more
> >> >> > demanding (I guess demanding of the GPU) everything
> >> >> > freezes or FreeBSD
> >> >> > even reboots.
> >> >> >
> >> >> > “Demanding” means even simple things like
> >> >> > qterminal. I tried firefox and
> >> >> > blender and then I had it with the reboots and
> >> >> > didn’t try anything else.
> >> >> > xedit works fine :)
> >> >> >
> >> >> > I have nothing in the logs, I have no idea where to look
> >> >> > or how to debug
> >> >> > this.
> >> >> >
> >> >> > Any ideas, tips, help greatly appreciated.
> >> >>
> >> >>
> >> >> FreeBSD Developers Handbook Chapter 10: Kernel Debugging
> >> >>
> >> >> https://docs.freebsd.org/en/books/developers-handbook/kerneldebug/
> >> >>
> >> >> Running stable, kernel dumps may already be enabled, look in
> >> >> /var/crash
> >> >>
> >> >> By enabling a kernel dump when it panics (dumpdev="AUTO" in
> >> >> rc.conf) the
> >> >> kernel core is saved to swap space, then on reboot gets
> >> >> copied to
> >> >> dumpdir (/var/crash) where you can then use kgdb (from
> >> >> devel/gdb) to get
> >> >> a stack trace to find where the panic happened.
> >> >
> >> >drm-*-kmod probably needs a rebuild. Likely a data structure
> >> >changed. In my
> >> >experience a simple rebuild of the port solves 90% of
> >> >drm-*-kmod crash
> >> >problems.
> >> >
> >> Hi Cy,
> >>
> >> sorry I didn't mention that, but I did rebuild drm-kmod, I
> >> actually do it after every new kernel build, just to be on the
> >> safe side.
> >>
> >> I switched my swap to non-encrypted and will look if I can get
> >> any information from the kernel dump tomorrow.
> >>
> >> Oh, and it's on a Thinkpad X1 Yoga 3rd gen, I just noticed I
> >> didn't mention this.
> >
> > It may be worth trying drm-515-kmod as some MFC that works with
> > 515 and
> > not 510 may have been committed. Linux-KPI commits are the usual
> > suspects.
> >
> > I use drm-515 with 14-CURRENT.
>
> Finally I found the time for a kernel crash dump.
> This is what kgdb says
>
> mathiasp:amd64.amd64/sys/GENERIC% sudo kgdb kernel
> /var/crash/vmcore.2
> GNU gdb (GDB) 13.1 [GDB v13.1 for FreeBSD]
> Copyright (C) 2023 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64-portbld-freebsd13.1".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <https://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
>     <http://www.gnu.org/software/gdb/documentation/>.
>
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from kernel...
> Reading symbols from
> /usr/obj/usr/src/amd64.amd64/sys/GENERIC/kernel.debug...
>
> Unread portion of the kernel message buffer:
>
>
> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> 55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
> (offsetof(struct pcpu,
> (kgdb) backtrace
> #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> #1  doadump (textdump=<optimized out>) at
>  /usr/src/sys/kern/kern_shutdown.c:396
> #2  0xffffffff80c07c2a in kern_reboot (howto=260) at
>  /usr/src/sys/kern/kern_shutdown.c:484
> #3  0xffffffff80c080ce in vpanic (fmt=<optimized out>,
>  ap=ap@entry=0xfffffe01341fab50) at
>  /usr/src/sys/kern/kern_shutdown.c:923
> #4  0xffffffff80c07f03 in panic (fmt=<unavailable>) at
>  /usr/src/sys/kern/kern_shutdown.c:847
> #5  0xffffffff810c1fa7 in trap_fatal (frame=0xfffffe01341fac40,
>  eva=0) at /usr/src/sys/amd64/amd64/trap.c:942
> #6  0xffffffff810c1fff in trap_pfault (frame=0xfffffe01341fac40,
>  usermode=false, signo=<optimized out>, ucode=<optimized out>)
>     at /usr/src/sys/amd64/amd64/trap.c:761
> #7  <signal handler called>
> #8  0xffffffff84a07067 in shmem_get_pages () from
>  /boot/modules/i915kms.ko
> #9  0x0000000300000015 in ?? ()
> #10 0x0000000000000060 in ?? ()
> #11 0x0000000000000060 in ?? ()
> #12 0x0000000000060000 in ?? ()
> #13 0xfffffe00dc365a80 in ?? ()
> #14 0xfffff00100000060 in ?? ()
> #15 0xfffff8003e270c00 in ?? ()
> #16 0x00000000fffff000 in ?? ()
> #17 0xfffff8002138fc20 in ?? ()
> #18 0xfffffe00dc365a80 in ?? ()
> #19 0x0000000000000060 in ?? ()
> #20 0xfffff8003e270c00 in ?? ()
> #21 0x0000000000000060 in ?? ()
> #22 0xfffffe0131e0fc80 in ?? ()
> #23 0xfffffe01341fade0 in ?? ()
> #24 0xffffffff84a07596 in shmem_pwrite () from
>  /boot/modules/i915kms.ko
> #25 0x0000000000000000 in ?? ()
> (kgdb)
>
>
> Anything else I can do to help?
>
> I’m now building drm-515-kmod, let’s see how that works in
> -stable.
>
> /Mathias
>
>
Any updates here? I just ran into this myself and am very close to just
installing Linux on my laptop, tbh.

I've rebuilt stable/13 today, then rebuilt the 510-kmod (because the
515-kmod doesn't even build) and pretty much anything that's not an XTerm
will panic/reboot the machine (a Thinkpad T490 with Intel GPU).

dmesg has this to say:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff84430626
stack pointer           = 0x28:0xfffffe0140c83cf0
frame pointer           = 0x28:0xfffffe0140c83d70
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (i915-userptr-acquir)
trap number             = 12
panic: page fault
cpuid = 1
time = 1681221523
KDB: stack backtrace:
#0 0xffffffff80c5fc15 at kdb_backtrace+0x65
#1 0xffffffff80c12e02 at vpanic+0x152
#2 0xffffffff80c12ca3 at panic+0x43
#3 0xffffffff810d1577 at trap_fatal+0x387
#4 0xffffffff810d15cf at trap_pfault+0x4f
#5 0xffffffff810a8568 at calltrap+0x8
#6 0xffffffff84430c02 at __i915_gem_userptr_get_pages_worker+0x1f2
#7 0xffffffff80e80883 at linux_work_fn+0xe3
#8 0xffffffff80c746f1 at taskqueue_run_locked+0x181
#9 0xffffffff80c759b3 at taskqueue_thread_loop+0xc3
#10 0xffffffff80bcf55d at fork_exit+0x7d
#11 0xffffffff810a95de at fork_trampoline+0xe
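
One way to read the numbers above, as a rough sketch rather than a diagnosis: a
supervisor read at fault virtual address 0x0 looks like a NULL pointer
dereference, and each KDB frame is printed as "address at symbol+offset", so
address minus offset gives the start of that symbol. The Python below only
redoes that arithmetic on values copied from the trace. The helper names
(parse_frames, symbol_start) are made up for illustration, and the conclusion
assumes the KDB trace lists the caller of the faulting code, not the faulting
function itself.

```python
import re

# Values copied verbatim from the dmesg output above.
FAULT_IP = 0xffffffff84430626
BACKTRACE = """\
#5 0xffffffff810a8568 at calltrap+0x8
#6 0xffffffff84430c02 at __i915_gem_userptr_get_pages_worker+0x1f2
#7 0xffffffff80e80883 at linux_work_fn+0xe3
"""

# One KDB frame line: "#N 0xADDRESS at symbol+0xOFFSET"
FRAME_RE = re.compile(r"#(\d+) (0x[0-9a-f]+) at (\w+)\+(0x[0-9a-f]+)")


def parse_frames(text):
    """Return (index, address, symbol, offset) tuples for each frame line."""
    return [
        (int(m.group(1)), int(m.group(2), 16), m.group(3), int(m.group(4), 16))
        for m in FRAME_RE.finditer(text)
    ]


def symbol_start(frame):
    """Start address of the frame's symbol: address minus offset."""
    _, address, _, offset = frame
    return address - offset


frames = parse_frames(BACKTRACE)
worker = next(f for f in frames
              if f[2] == "__i915_gem_userptr_get_pages_worker")
start = symbol_start(worker)

print("worker starts at ", hex(start))
print("faulting IP      ", hex(FAULT_IP))
print("IP below worker? ", FAULT_IP < start)
print("distance         ", hex(start - FAULT_IP))
```

If that reading is right, the instruction pointer sits below the start of
__i915_gem_userptr_get_pages_worker, so the faulting instruction would be in
some other code in the same module that the worker called, not in the worker
itself. Resolving the address against the module's symbols in kgdb would be
needed to confirm which function it is.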

It apparently dumps core, will have to reacquaint myself with how to poke
at this some more...
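
As a starting point for poking at the dump, here is a small sketch that picks
the highest-numbered vmcore.N in the dump directory and assembles the same kind
of kgdb command line that was used earlier in this thread. The /var/crash
default and the vmcore.N naming come from the messages above; the function
names and the kernel path passed in are assumptions for illustration and need
to match the local build.

```python
import re
from pathlib import Path


def newest_vmcore(dumpdir):
    """Return the vmcore.N path with the highest N in dumpdir, or None."""
    directory = Path(dumpdir)
    if not directory.is_dir():
        return None
    best = None
    for entry in directory.iterdir():
        # Only numbered dumps count; info.N, vmcore.last etc. are skipped.
        match = re.fullmatch(r"vmcore\.(\d+)", entry.name)
        if match and (best is None or int(match.group(1)) > best[0]):
            best = (int(match.group(1)), entry)
    return best[1] if best else None


def kgdb_command(kernel, dumpdir="/var/crash"):
    """Build the argument list 'kgdb <kernel> <newest vmcore>', or None."""
    core = newest_vmcore(dumpdir)
    if core is None:
        return None
    return ["kgdb", str(kernel), str(core)]


# Example (the kernel path is an assumption, adjust to the local obj tree):
#   kgdb_command("/usr/obj/usr/src/amd64.amd64/sys/GENERIC/kernel")
```

The function only builds the command and does not run it, so it is safe to try
on a machine without a dump; it returns None when no numbered vmcore is found.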
