From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 267028] kernel panics when booting with both (zfs,ko or vboxnetflt,ko or acpi_wmi.ko) and amdgpu.ko
Date: Sun, 15 Dec 2024 22:01:19 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 13.1-RELEASE
X-Bugzilla-Keywords: crash, needs-qa
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: marklmi26-fbsd@yahoo.com
X-Bugzilla-Status: Open
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028

--- Comment #235 from Mark Millard ---
For the 3-node sequence (ending with the last partially-good node and then a
just-junk node):

$208 = {link = {tqe_next = 0xfffff80004607a00, tqe_prev = 0xfffff8000465bc80},
  container = 0xfffff80003868c00,
  name = 0xffffffff82e1e000 "amdgpu_raven_mec_bin_fw", version = 1}
$209 = {link = {tqe_next = 0xfffff80000000007, tqe_prev = 0xfffff8000465bbc0},
  container = 0xfffff80004b29600,
  name = 0xffffffff82e62026 "amdgpu_raven_mec2_bin_fw", version = 1}
$210 = {link = {tqe_next = 0xeef3f000e2c3f0, tqe_prev = 0xff54f000eef3f0},
  container = 0x322ff0003287f0, name = 0xe987f000fea5f0 ,
  version = 15660016}

It looks like the tqe_next value in:

$209 = {link = {tqe_next = 0xfffff80000000007,

is the earliest evidence of corruption. The address is below the kernel
start:

Local exec file:
        `/usr/home/root/failing-kernel-files/boot/kernel/kernel', file type elf64-x86-64-freebsd.
        Entry point: 0xffffffff8038e000
        0xffffffff802002a8 - 0xffffffff802002b5 is .interp

Having 0000000007 also looks odd. However, the rest of that node:

tqe_prev = 0xfffff8000465bbc0}, container = 0xfffff80004b29600,
  name = 0xffffffff82e62026 "amdgpu_raven_mec2_bin_fw", version = 1}

does not appear to have any obvious problems with its content.
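The "looks odd" observation can be made somewhat more mechanical. The following is only a minimal Python sketch, not something taken from kgdb or the kernel sources: the helper name is hypothetical, and it assumes that list nodes are at least 8-byte aligned and that the machine uses 48-bit (4-level paging) amd64 virtual addresses, where bits 63..47 of a valid address must all be equal. Passing both checks does not prove a pointer is good; failing either one shows it cannot be a node address.

```python
# Hypothetical helper: cheap plausibility checks for a list-link pointer value.
# Assumptions (not from the bug report): nodes are 8-byte aligned and the
# system uses 48-bit canonical amd64 virtual addresses.
POINTER_ALIGN = 8
CANONICAL_SHIFT = 47          # bits 63..47 must be all zeros or all ones
CANONICAL_ONES = 0x1FFFF      # 17 one-bits


def link_problems(addr: int) -> list:
    """Return the reasons why addr cannot be a valid node address."""
    problems = []
    if addr == 0:
        return problems       # NULL is a legitimate end-of-list marker
    if addr % POINTER_ALIGN != 0:
        problems.append("misaligned")
    if (addr >> CANONICAL_SHIFT) not in (0, CANONICAL_ONES):
        problems.append("non-canonical")
    return problems


# tqe_next values copied from the kgdb output for $208, $209 and $210.
tqe_next_values = {
    "$208": 0xfffff80004607a00,
    "$209": 0xfffff80000000007,
    "$210": 0xeef3f000e2c3f0,
}

for name, addr in tqe_next_values.items():
    print(name, hex(addr), link_problems(addr) or "no problem found")
```

Under those assumptions the $208 value passes, the $209 value is rejected for alignment alone (matching the trailing 7), and the $210 value is rejected as non-canonical.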
The contents of the container are shown as:

$214 = {ops = 0xfffff80003164000, refs = 1, userrefs = 0, flags = 1,
  link = {tqe_next = 0xfffff8000469ed80, tqe_prev = 0xfffff80003868c18},
  filename = 0xfffff80004b22120 "amdgpu_raven_mec2_bin.ko",
  pathname = 0xfffff80004607a40 "/boot/modules/amdgpu_raven_mec2_bin.ko", id = 20,
  address = 0xffffffff82e61000 "\203\376\001tL\270\026", size = 276456, ctors_addr = 0x0,
  ctors_size = 0, dtors_addr = 0x0, dtors_size = 0, ndeps = 3, deps = 0xfffff80004b220e0,
  common = {stqh_first = 0x0, stqh_last = 0xfffff80004b29680},
  modules = {tqh_first = 0xfffff80004b1ff00, tqh_last = 0xfffff80004b1ff10},
  loaded = {tqe_next = 0x0, tqe_prev = 0x0}, loadcnt = 20, nenabled = 0, fbt_nentries = 0}

which also does not seem to have obvious problems.

This type of vmcore.* does not provide threads, stack content, or
backtrace information. Nor is there any indication of when tqe_next
became 0xfffff80000000007. It is also not obvious whether the list was
longer before that happened.

There does not seem to be a way to tell whether the corrupted value is
caused by "raven"-specific code or by more general code. It would be
interesting to know whether an alternate card type has the problem.

As for the raven context, getting vmcore.* captures that fail at a
different stage, such as the failure that mentioned acpi_wmi but did not
get a vmcore.*, would help indicate whether the place where the
corruption happens in the list moves around (relative to other content).
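To compare where the corruption sits in the list across several vmcore.* captures, one option is to record, for each capture, the position of the first node whose forward link is inconsistent. The sketch below is a hypothetical Python illustration of that idea. It relies on the sys/queue.h TAILQ invariant that a successor's tqe_prev holds the address of its predecessor's tqe_next field, and it assumes the TAILQ_ENTRY is the first member of the node, so that address equals the node's own address. The addresses in the example are made up, not read from any vmcore.

```python
from typing import Dict, Optional, Tuple

# Each node: address -> (tqe_next, tqe_prev). Assumption: the TAILQ_ENTRY is
# the first member of the node, so &node.link.tqe_next == node address.
Nodes = Dict[int, Tuple[int, int]]


def first_broken_link(nodes: Nodes, first: int) -> Optional[Tuple[int, int]]:
    """Walk from `first`; return (position, address) of the first node whose
    tqe_next is inconsistent, or None if the walk reaches NULL cleanly."""
    position, addr = 0, first
    # Bounding the walk by len(nodes) guarantees termination even on a cycle.
    while addr != 0 and position < len(nodes):
        nxt, _prev = nodes[addr]
        if nxt != 0:
            # The successor must be a known node and must point back here.
            if nxt not in nodes or nodes[nxt][1] != addr:
                return position, addr
        addr = nxt
        position += 1
    return None


# Made-up example: three nodes, the last one with a trashed tqe_next.
HEAD_FIELD = 0x0F00  # hypothetical address of the list head's tqh_first
example = {
    0x1000: (0x2000, HEAD_FIELD),
    0x2000: (0x3000, 0x1000),
    0x3000: (0x0007, 0x2000),
}
print(first_broken_link(example, 0x1000))
```

If the reported position differs between captures, that would suggest the corruption is not tied to one particular list entry.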