From nobody Sat Jun 6 00:16:43 2026
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Andrew Gallatin
Subject: git: 16e5abf415ba - main - APEI: Provide more info on fatal hardware errors
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@FreeBSD.org
Precedence: list
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: gallatin
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 16e5abf415baf801c6d7c7948a742aeda75e2237
Auto-Submitted: auto-generated
Date: Sat, 06 Jun 2026 00:16:43 +0000
Message-Id:
<6a2366eb.3262e.24229ff0@gitrepo.freebsd.org>

The branch main has been updated by gallatin:

URL: https://cgit.FreeBSD.org/src/commit/?id=16e5abf415baf801c6d7c7948a742aeda75e2237

commit 16e5abf415baf801c6d7c7948a742aeda75e2237
Author:     Andrew Gallatin
AuthorDate: 2026-06-06 00:07:03 +0000
Commit:     Andrew Gallatin
CommitDate: 2026-06-06 00:12:21 +0000

    APEI: Provide more info on fatal hardware errors

    This change refactors fatal error delivery via APEI and prints
    more info:

    - Makes the NMI handler call into the ge handler to establish a
      common code flow, no matter how the error is delivered
    - Adds the FRU to the panic string so as to provide more information
      than just "APEI Fatal Hardware Error!" such as
      "APEI Fatal Hardware Error: PcieError"
    - Prints more details about fatal pcie errors.  Note that we skip
      acquiring Giant on fatal errors
    - Hexdumps the full GED data on fatal errors, so as to facilitate
      offline data analysis

    Reviewed by: imp
    Sponsored by: Netflix
    Differential Revision: https://reviews.freebsd.org/D57417
---
 sys/dev/acpica/acpi_apei.c | 53 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 37 insertions(+), 16 deletions(-)

diff --git a/sys/dev/acpica/acpi_apei.c b/sys/dev/acpica/acpi_apei.c
index e85b3910e46d..925558d585bf 100644
--- a/sys/dev/acpica/acpi_apei.c
+++ b/sys/dev/acpica/acpi_apei.c
@@ -237,7 +237,7 @@ apei_mem_handler(ACPI_HEST_GENERIC_DATA *ged)
 }
 
 static int
-apei_pcie_handler(ACPI_HEST_GENERIC_DATA *ged)
+apei_pcie_handler(ACPI_HEST_GENERIC_DATA *ged, bool fatal)
 {
 	struct apei_pcie_error *p = (struct apei_pcie_error *)GED_DATA(ged);
 	int off;
@@ -246,7 +246,8 @@ apei_pcie_handler(ACPI_HEST_GENERIC_DATA *ged)
 	int h = 0, sev;
 
 	if ((p->ValidationBits & 0x8) == 0x8) {
-		mtx_lock(&Giant);
+		if (!fatal)
+			mtx_lock(&Giant);
 		dev = pci_find_dbsf((uint32_t)p->DeviceID[10] << 8 |
 		    p->DeviceID[9], p->DeviceID[11], p->DeviceID[8],
 		    p->DeviceID[7]);
@@ -264,9 +265,11 @@ apei_pcie_handler(ACPI_HEST_GENERIC_DATA *ged)
 			}
 			pcie_apei_error(dev, sev,
 			    (p->ValidationBits & 0x80) ? p->AERInfo : NULL);
-			h = 1;
+			if (!fatal)
+				h = 1;
 		}
-		mtx_unlock(&Giant);
+		if (!fatal)
+			mtx_unlock(&Giant);
 	}
 	if (h)
 		return (h);
@@ -322,8 +325,8 @@ apei_pcie_handler(ACPI_HEST_GENERIC_DATA *ged)
 	return (0);
 }
 
-static void
-apei_ged_handler(ACPI_HEST_GENERIC_DATA *ged)
+static const char *
+apei_ged_handler(ACPI_HEST_GENERIC_DATA *ged, bool fatal)
 {
 	ACPI_HEST_GENERIC_DATA_V300 *ged3 = (ACPI_HEST_GENERIC_DATA_V300 *)ged;
 	/* A5BC1114-6F64-4EDE-B863-3E83ED7C83B1 */
@@ -342,12 +345,12 @@ apei_ged_handler(ACPI_HEST_GENERIC_DATA *ged)
 	if (memcmp(mem_uuid, ged->SectionType, ACPI_UUID_LENGTH) == 0) {
 		h = apei_mem_handler(ged);
 	} else if (memcmp(pcie_uuid, ged->SectionType, ACPI_UUID_LENGTH) == 0) {
-		h = apei_pcie_handler(ged);
+		h = apei_pcie_handler(ged, fatal);
 	} else {
 		if (!log_corrected &&
 		    (ged->ErrorSeverity == ACPI_HEST_GEN_ERROR_CORRECTED ||
 		    ged->ErrorSeverity == ACPI_HEST_GEN_ERROR_NONE))
-			return;
+			return (NULL);
 
 		t = ged->SectionType;
 		printf("APEI %s Error %02x%02x%02x%02x-%02x%02x-"
@@ -364,7 +367,7 @@ apei_ged_handler(ACPI_HEST_GENERIC_DATA *ged)
 		}
 	}
 	if (h)
-		return;
+		return (NULL);
 
 	printf(" Flags: 0x%x\n", ged->Flags);
 	if (ged->ValidationBits & ACPI_HEST_GEN_VALID_FRU_ID) {
@@ -379,6 +382,19 @@ apei_ged_handler(ACPI_HEST_GENERIC_DATA *ged)
 	if (ged->Revision >= 0x300 &&
 	    ged->ValidationBits & ACPI_HEST_GEN_VALID_TIMESTAMP)
 		printf(" Timestamp: %016jx\n", ged3->TimeStamp);
+	if (fatal) {
+		printf(" Error Data:\n");
+		t = (uint8_t *)GED_DATA(ged);
+		for (off = 0; off < ged->ErrorDataLength; off++) {
+			printf(" %02x", t[off]);
+			if ((off % 16) == 15 ||
+			    off + 1 == ged->ErrorDataLength)
+				printf("\n");
+		}
+	}
+	if (ged->ValidationBits & ACPI_HEST_GEN_VALID_FRU_STRING)
+		return ((const char *)ged->FruText);
+	return (NULL);
 }
 
 static int
@@ -387,23 +403,27 @@ apei_ge_handler(struct apei_ge *ge, bool copy)
 	uint8_t *buf = copy ? ge->copybuf : ge->buf;
 	ACPI_HEST_GENERIC_STATUS *ges = (ACPI_HEST_GENERIC_STATUS *)buf;
 	ACPI_HEST_GENERIC_DATA *ged;
+	const char *fru, *f;
 	size_t off, len;
-	uint32_t sev;
 	int i, c;
+	bool fatal;
 
 	if (ges == NULL || ges->BlockStatus == 0)
 		return (0);
 
 	c = (ges->BlockStatus >> 4) & 0x3ff;
-	sev = ges->ErrorSeverity;
+	fatal = (ges->ErrorSeverity == ACPI_HEST_GEN_ERROR_FATAL);
 
 	/* Process error entries. */
+	fru = NULL;
 	len = MIN(ge->v1.ErrorBlockLength - sizeof(*ges), ges->DataLength);
 	for (off = i = 0; i < c && off + sizeof(*ged) <= len; i++) {
 		ged = (ACPI_HEST_GENERIC_DATA *)&buf[sizeof(*ges) + off];
 		if ((uint64_t)GED_SIZE(ged) + ged->ErrorDataLength > len - off)
 			break;
-		apei_ged_handler(ged);
+		f = apei_ged_handler(ged, fatal);
+		if (f != NULL && fru == NULL)
+			fru = f;
 		off += GED_SIZE(ged) + ged->ErrorDataLength;
 	}
 
@@ -418,8 +438,9 @@ apei_ge_handler(struct apei_ge *ge, bool copy)
 	}
 
 	/* If ACPI told the error is fatal -- make it so. */
-	if (sev == ACPI_HEST_GEN_ERROR_FATAL)
-		panic("APEI Fatal Hardware Error!");
+	if (fatal)
+		panic("APEI Fatal Hardware Error: %.20s",
+		    fru != NULL ? fru : "unknown");
 
 	return (1);
 }
@@ -450,9 +471,9 @@ apei_nmi_handler(void)
 		if (ges == NULL || ges->BlockStatus == 0)
 			continue;
 
-		/* If ACPI told the error is fatal -- make it so. */
+		/* Log and panic via apei_ge_handler(); does not return. */
 		if (ges->ErrorSeverity == ACPI_HEST_GEN_ERROR_FATAL)
-			panic("APEI Fatal Hardware Error!");
+			apei_ge_handler(ge, false);
 
 		/* Copy the buffer for later processing. */
 		gesc = (ACPI_HEST_GENERIC_STATUS *)ge->copybuf;
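
For readers who want to check the new output format outside the kernel, the following is a minimal userland sketch of the two formatting steps this change adds: the 16-bytes-per-row hexdump of the GED payload and the "first FRU wins" panic string. It is not the committed code. The helper names `hexdump_rows`, `first_fru` and `format_panic` are hypothetical, `snprintf` into a caller-supplied buffer stands in for the kernel's `printf` and `panic`, and the FRU strings are assumed to be NUL-terminated here, whereas the `%.20s` precision in the commit suggests the kernel field need not be.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format `len` bytes as " %02x" cells, 16 per row,
 * ending every row (including a short final row) with '\n'. This mirrors
 * the loop condition used in the diff. Returns the number of characters
 * written, or -1 if `dst` is too small.
 */
static int
hexdump_rows(char *dst, size_t dstlen, const uint8_t *data, size_t len)
{
	size_t off, used = 0;
	int n;

	if (dstlen == 0)
		return (-1);
	dst[0] = '\0';
	for (off = 0; off < len; off++) {
		n = snprintf(dst + used, dstlen - used, " %02x", data[off]);
		if (n < 0 || (size_t)n >= dstlen - used)
			return (-1);
		used += (size_t)n;
		if ((off % 16) == 15 || off + 1 == len) {
			if (used + 1 >= dstlen)
				return (-1);
			dst[used++] = '\n';
			dst[used] = '\0';
		}
	}
	return ((int)used);
}

/*
 * Hypothetical helper: return the first non-NULL entry, the same
 * "keep the first FRU seen" rule the diff applies across error entries.
 */
static const char *
first_fru(const char *const *frus, size_t nfrus)
{
	const char *fru = NULL;
	size_t i;

	for (i = 0; i < nfrus; i++) {
		if (frus[i] != NULL && fru == NULL)
			fru = frus[i];
	}
	return (fru);
}

/*
 * Hypothetical helper: build the panic message text. "%.20s" caps the
 * FRU at 20 characters; "unknown" is used when no entry supplied one.
 */
static int
format_panic(char *dst, size_t dstlen, const char *const *frus, size_t nfrus)
{
	const char *fru = first_fru(frus, nfrus);

	return (snprintf(dst, dstlen, "APEI Fatal Hardware Error: %.20s",
	    fru != NULL ? fru : "unknown"));
}
```

Calling `format_panic` with the entries `{ NULL, "PcieError", "Other" }` yields the example string quoted in the commit message, and an 18-byte payload passed to `hexdump_rows` produces one full row followed by a two-byte row.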