Date: Tue, 21 Mar 2023 01:07:55 +0000
From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 267028] kernel panics when booting with both (zfs,ko or vboxnetflt,ko or acpi_wmi.ko) and amdgpu.ko
Message-ID: <bug-267028-3630-wqWLfyNGcI@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-267028-3630@https.bugs.freebsd.org/bugzilla/>
References: <bug-267028-3630@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028

--- Comment #145 from Mark Millard <marklmi26-fbsd@yahoo.com> ---
(In reply to George Mitchell from comment #144)

Looking at your full list of attachments, it appears that . . .

All the shutdown time crashes have:
fault virtual address = 0x0
(And we might now have a known type of context for getting this type of failure: late amdgpu but no XFCE.)

All the dbuf_evict_thread related crashes have:
fault virtual address = 0x7
(Late amdgpu but having used XFCE.)

All the kldload related crashes have:
Fatal trap 9: general protection fault while in kernel mode
(but no explicit fault address listed)
(Early amdgpu loading.)

My guess is that something is trashing memory in a way that involves writing zeros over some pointer values that it should not be touching. Later code extracts such zeros, applies any offset, and then tries to dereference the result, resulting in a crash.

That you got "fault virtual address = 0x0" for shutdown without having involved XFCE suggests that a problem is already in place before XFCE is potentially involved: XFCE is not required. (XFCE use might lead to more trashed memory than otherwise, leading to the 0x7 fault address cases.)

But I do not see how to get solid evidence for or against such a hypothesis (or related ones).

The only thing I can identify that is likely unique to your context, yet is involved with amdgpu, is the use of the amdgpu_raven_gpu_*.ko modules. Unfortunately, testing in either direction is non-trivial: moving your context to a different system that avoids such module use, or finding someone with a separate system that does use such modules (and who is willing to set up experiments).

Beyond possibly some checking on the degree/ease of repeatability, I do not see how to gather better information, much less get anywhere near directly actionable information for fixing the crashes.
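The zeroed-pointer guess above can be illustrated with a small sketch. It only shows the arithmetic: if a pointer to an object has been overwritten with zero, a later access to a field of that object touches the address equal to the field's offset. The structure layout here is purely hypothetical (it is not any real FreeBSD kernel structure, and which field or offset is actually involved in the 0x7 crashes is not known); it was chosen only so that one field sits at offset 0 and another at offset 7.

```python
import ctypes


class HypotheticalObject(ctypes.Structure):
    # Hypothetical layout for illustration only; NOT a real kernel structure.
    _fields_ = [
        ("header", ctypes.c_uint8 * 7),  # occupies bytes 0..6
        ("flag", ctypes.c_uint8),        # occupies byte 7
    ]


def fault_address(base_pointer, field):
    """Return the address an access to `field` would touch for this base pointer."""
    return base_pointer + field.offset


# Assumption: the pointer to the object was overwritten with zeros.
zeroed_pointer = 0

print(hex(fault_address(zeroed_pointer, HypotheticalObject.header)))  # 0x0
print(hex(fault_address(zeroed_pointer, HypotheticalObject.flag)))    # 0x7
```

Under this reading, the 0x0 and 0x7 fault addresses would both be consistent with a zeroed base pointer, differing only in which field offset the faulting code applied. That is compatible with the guess but is not evidence for it.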
The one thing we have not looked at is the crash dumps themselves, examining what memory looks like and such. But I do not know what to do for that either, relative to known-useful information. Such a direction would be very exploratory and likely very time consuming.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.