Date: Wed, 26 Apr 2023 23:24:29 +0000
From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 267028] kernel panics when booting with both (zfs,ko or vboxnetflt,ko or acpi_wmi.ko) and amdgpu.ko
Message-ID: <bug-267028-3630-O90aROuen1@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-267028-3630@https.bugs.freebsd.org/bugzilla/>
References: <bug-267028-3630@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028

--- Comment #172 from Mark Millard <marklmi26-fbsd@yahoo.com> ---
One of the things that makes this hard to analyze is that the first
failure quickly leads to other failures, and most of the evidence is
for the later failure. For example, in the following, note that the
original trap number is 12 but the backtrace is for/after a later
trap, of type-number 22 instead. There is very little information
directly about the original trap, type-number 12:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80bf3727
stack pointer           = 0x28:0xfffffe000e1a7ba0
frame pointer           = 0x28:0xfffffe000e1a7bd0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1 (init)
trap number             = 12
WARNING !drm_modeset_is_locked(&crtc->mutex) failed at
/usr/ports/graphics/drm-510-kmod/work/drm-kmod-drm_v5.10.163_4/drivers/gpu/drm/drm_atomic_helper.c:619
. . .
WARNING !drm_modeset_is_locked(&plane->mutex) failed at
/usr/ports/graphics/drm-510-kmod/work/drm-kmod-drm_v5.10.163_4/drivers/gpu/drm/drm_atomic_helper.c:894
kernel trap 22 with interrupts disabled
kernel trap 22 with interrupts disabled
panic: page fault
cpuid = 0
time = 1682435560
KDB: stack backtrace:
#0 0xffffffff80c66ee5 at kdb_backtrace+0x65
#1 0xffffffff80c1bbef at vpanic+0x17f
#2 0xffffffff80c1ba63 at panic+0x43
#3 0xffffffff810addf5 at trap_fatal+0x385
#4 0xffffffff810ade4f at trap_pfault+0x4f
#5 0xffffffff81084fd8 at calltrap+0x8
#6 0xffffffff8261d251 at spl_nvlist_free+0x61
#7 0xffffffff826dd740 at fm_nvlist_destroy+0x20
#8 0xffffffff827b6e95 at zfs_zevent_post_cb+0x15
#9 0xffffffff826dcd02 at zfs_zevent_drain+0x62
#10 0xffffffff826dcbf8 at zfs_zevent_drain_all+0x58
#11 0xffffffff826dede9 at fm_fini+0x19
#12 0xffffffff82713b94 at spa_fini+0x54
#13 0xffffffff827be303 at zfs_kmod_fini+0x33
#14 0xffffffff8262fb3b at zfs_shutdown+0x2b
#15 0xffffffff80c1b76c at kern_reboot+0x3dc
#16 0xffffffff80c1b381 at sys_reboot+0x411
#17 0xffffffff810ae6ec at amd64_syscall+0x10c
. . .

The primary hint about what code execution context led to the
original instance of trap type 12 above is basically:

instruction pointer     = 0x20:0xffffffff80bf3727

amdgpu does not leave in place a clean context for debugging kernel
crashes. Trying to keep the video context operational for a kernel
that has crashed, while not messing up the analysis context for the
original problem, is problematic.

My guess would be that normal analysis of such problems tries to have
the problem occur in a virtual machine sort of context, where another
(outer) context is available that is independent and can look at the
details from outside the failing context. But even that would require
the failing context in the VM to stop before amdgpu or the like
messed up the evidence in the VM. (Not that I've ever done that type
of evidence gathering.)
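Since the console output above mixes the original trap with the later
ones, a small script can at least separate the two. The following is
only a rough sketch in Python, assuming the console text has been
saved to a string. The function name first_trap_summary and the
regular expressions are illustrative guesses based on the output
quoted in this comment, not part of any FreeBSD tool. It does not
resolve the address to a symbol, which would need the matching kernel
and its debug symbols.

```python
import re

# Hypothetical patterns, written against the console text quoted above.
FATAL_RE = re.compile(r"Fatal trap (\d+): (.+?) while in kernel mode")
IP_RE = re.compile(
    r"instruction pointer\s*=\s*0x[0-9a-fA-F]+:(0x[0-9a-fA-F]+)")
LATER_RE = re.compile(r"kernel trap (\d+) with interrupts disabled")


def first_trap_summary(log):
    """Summarize the first 'Fatal trap' in a console log.

    Returns None if no 'Fatal trap' line is found. Otherwise returns
    a dict with the original trap number, its description, the
    instruction pointer offset reported after it (or None), and the
    numbers of any later 'kernel trap N' messages.
    """
    fatal = FATAL_RE.search(log)
    if fatal is None:
        return None
    # Only look after the first fatal trap, so that a later trap's
    # register dump is not mistaken for the original one.
    ip = IP_RE.search(log, fatal.end())
    later = [int(n) for n in LATER_RE.findall(log[fatal.end():])]
    return {
        "trap": int(fatal.group(1)),
        "description": fatal.group(2),
        "instruction_pointer": int(ip.group(1), 16) if ip else None,
        "later_traps": later,
    }
```

For the output quoted in this comment, such a summary would report
trap 12 with the 0xffffffff80bf3727 address as the original failure
and list the two trap 22 messages separately.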
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.