From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 267028] kernel panics when booting with both (zfs,ko or vboxnetflt,ko or acpi_wmi.ko) and amdgpu.ko
Date: Tue, 21 Mar 2023 01:07:55 +0000
X-Bugzilla-Reason: AssignedTo CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 13.1-RELEASE
X-Bugzilla-Keywords: crash, needs-qa
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: marklmi26-fbsd@yahoo.com
X-Bugzilla-Status: Open
X-Bugzilla-Resolution:
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: fs@FreeBSD.org
X-Bugzilla-Flags: maintainer-feedback? maintainer-feedback?
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
List-Id: Filesystems
List-Archive: https://lists.freebsd.org/archives/freebsd-fs
Sender: owner-freebsd-fs@freebsd.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028

--- Comment #145 from Mark Millard ---
(In reply to George Mitchell from comment #144)

Looking at your full list of attachments, it appears that . . .

All the shutdown-time crashes have:

fault virtual address = 0x0

(And we might now have a known type of context for getting
this type of failure: late amdgpu but no XFCE.)

All the dbuf_evict_thread related crashes have:

fault virtual address = 0x7

(Late amdgpu but having used XFCE.)

All the kldload related crashes have:

Fatal trap 9: general protection fault while in kernel mode
(but no explicit fault address listed)

(Early amdgpu loading.)

My guess is that something is trashing memory in a way that involves
writing zeros over some pointer values that it should not be
touching. Later code extracts such zeros, applies any offset, and
then tries to dereference the result, resulting in a crash.

That you got "fault virtual address = 0x0" for shutdown without
having involved XFCE suggests that a problem is already in place
before XFCE is potentially involved: XFCE is not required. (XFCE use
might lead to more trashed memory than otherwise, leading to the 0x7
fault address cases.)

But I do not see how to get solid evidence for or against such a
hypothesis (or related ones).

The only thing I can identify that is likely unique to your context,
yet still involved with amdgpu, is the use of the
amdgpu_raven_gpu_*.ko modules.
Unfortunately, both directions of testing are non-trivial: moving
your context to a different system that avoids such module use, or
finding someone with a separate system that does use such modules
(and who is willing to set up experiments).

Beyond possibly some checking on the degree/ease of repeatability,
I do not see how to gather better information, much less get
anywhere near directly actionable information for fixing the
crashes.

The one thing we have not looked at is the crash dumps themselves,
examining what memory looks like and such. But I do not know what
to do for that either, relative to known-useful information. Such
a direction would be very exploratory and likely very time
consuming.
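
For illustration only, here is a rough sketch of the arithmetic behind
the zeroed-pointer guess above. The structure, its field names, and its
layout are hypothetical; nothing here is taken from the actual kernel
sources or crash dumps. It only shows how a pointer that has been
overwritten with zero, plus a small field offset, would yield fault
addresses such as 0x0 and 0x7.

```python
import ctypes


class Victim(ctypes.Structure):
    """Hypothetical structure; layout chosen only to produce offsets 0 and 7."""
    _fields_ = [
        ("header", ctypes.c_uint8 * 7),  # occupies bytes 0..6
        ("state", ctypes.c_uint8),       # therefore sits at offset 7
    ]


def fault_address(pointer_value, field_offset):
    """Address a 64-bit CPU would try to access for pointer->field."""
    return (pointer_value + field_offset) & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    zeroed_pointer = 0  # assumption: the pointer was overwritten with zeros
    for name in ("header", "state"):
        offset = getattr(Victim, name).offset
        print(name, hex(fault_address(zeroed_pointer, offset)))
```

If the guess is right, the particular fault address would depend only on
which field the later code happened to access through the zeroed
pointer, not on what did the overwriting.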