From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 294280] mrsas: crash dump collection fails on RAID1 VD behind controller; firmware reports invalidSgl=1 on dump-time WRITE(10) (64KB)
Date: Mon, 06 Apr 2026 10:03:43 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=294280

            Bug ID: 294280
           Summary: mrsas: crash dump collection fails on RAID1 VD behind
                    controller; firmware reports invalidSgl=1 on dump-time
                    WRITE(10) (64KB)
           Product: Base System
           Version: 15.0-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: chandrakanth.patil@broadcom.com

Crash dump collection fails on a system where the OS is installed on a RAID1
virtual disk exposed by the mrsas controller. During a manual panic, dump I/O
fails after repeated write errors. Controller / firmware-side logs report
invalidSgl=1 for the failing WRITE(10) request.

The failure is reproducible only during panic dump collection so far. Normal
runtime I/O to the same VD has not shown a visible failure in our current
testing.
When a manual crash is triggered, the kernel starts writing the dump and then
aborts with an I/O error:

panic: vm_fault_lookup: fault on nofault entry, addr: 0xffffffff8216d000
cpuid = 8
time = 1771836706
KDB: stack backtrace:
#0 0xffffffff80bbe1ed at kdb_backtrace+0x5d
#1 0xffffffff80b71576 at vpanic+0x136
#2 0xffffffff80b71433 at panic+0x43
#3 0xffffffff80f07a9b at vm_fault+0x17cb
#4 0xffffffff80f061e1 at vm_fault_trap+0x81
#5 0xffffffff81079d99 at trap_pfault+0x1f9
#6 0xffffffff8104ff18 at calltrap+0x8
#7 0xffffffff80bbf59e at kobj_init+0xe
#8 0xffffffff80babe63 at device_set_driver+0xa3
#9 0xffffffff80babb24 at device_probe_child+0xc4
#10 0xffffffff80baccd1 at device_probe+0x71
#11 0xffffffff80bace8e at device_probe_and_attach+0xe
#12 0xffffffff8083d362 at pci_driver_added+0xf2
#13 0xffffffff80baa8c9 at devclass_driver_added+0x29
#14 0xffffffff80baa85e at devclass_add_driver+0x11e
#15 0xffffffff80b4b575 at module_register_init+0x85
#16 0xffffffff80b3c0df at linker_load_module+0xc0f
#17 0xffffffff80b3dcd5 at kern_kldload+0x165
Uptime: 6m33s
Dumping 4071 out of 130287 MB: mrsas0: FW cmd complete status 3c
(da2:mrsas0:0:3:0): WRITE(10). CDB: 2a 00 2b 49 49 d7 00 00 80 00
(da2:mrsas0:0:3:0): CAM status: CCB request completed with an error
(da2:mrsas0:0:3:0): Error 5, Retries exhausted
Aborting dump due to I/O error.

** DUMP FAILED (ERROR 5) **

Controller / firmware log

Controller / firmware-side analysis reports the following for the same failing
command:

12/23/25 9:08:48.929: C0:LdCmdValidateLdIo: ld:0 Data length 10000 invalidSgl 1 for Read/write IO with CDB 2a

The failing CDB is:

2a 00 2b 49 49 d7 00 00 80 00

This is a WRITE(10) request with transfer length 0x0080 blocks. For a 512-byte
block device, that is 0x80 * 512 = 0x10000 bytes, i.e. 64KB. So the transfer
size itself appears consistent with the CDB.
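
The CDB arithmetic can be double-checked with a small standalone sketch. It is
not driver code: the helper name decode_write10 is made up for illustration, the
field layout is the standard WRITE(10) layout, and the 512-byte block size is
the same assumption used in this report.

```python
import struct


def decode_write10(cdb: bytes, block_size: int = 512):
    """Decode a 10-byte SCSI WRITE(10) CDB into (lba, blocks, byte_count).

    Hypothetical helper for illustration only; block_size is an assumption.
    """
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    # Byte 0: opcode, byte 1: flags, bytes 2-5: LBA (big-endian),
    # byte 6: group number, bytes 7-8: transfer length in blocks (big-endian),
    # byte 9: control.
    _opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return lba, blocks, blocks * block_size


# The failing CDB from the console output in this report.
cdb = bytes.fromhex("2a 00 2b 49 49 d7 00 00 80 00")
lba, blocks, nbytes = decode_write10(cdb)
print(hex(lba), hex(blocks), hex(nbytes))  # 0x2b4949d7 0x80 0x10000
```

If the "Data length 10000" field in the firmware log is printed in hexadecimal
(an assumption; the log does not say), it matches the 0x10000 bytes computed
here.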
The failure reported by firmware is specifically that the SGL attached to this
host command is invalid, not that the byte count itself is unexpected.

In upstream FreeBSD mrsas, CAM I/O requests are submitted through the SIM action
path, and the SIM is registered with mrsas_cam_poll as the CAM poll callback.
Polled I/O for this driver therefore goes through the CAM poll path. Upstream
also sets ccb->cpi.maxio based on sc->max_sectors_per_req * 512, so the driver
advertises its maximum transfer size in bytes.

Request for review

Since this issue is observed specifically during manual panic dump collection,
we would like help reviewing whether the crashed-kernel / panic-dump / polled-I/O
path can result in a malformed or incomplete SGL being attached to a host write
request.

In particular, could the panic-time environment, nofault state, or polled CAM
path cause the host command to carry an SGL that does not fully or correctly
describe the requested 64KB transfer, even though normal runtime I/O may not
visibly fail?

The firmware team’s position is that the write failure is due to a malformed
host SGL, and their validation log points to the host-issued command itself.

Steps to reproduce:

1. Configure a RAID1 virtual disk on an mrsas controller.
2. Install FreeBSD on the RAID1 VD.
3. Configure crash dumps.
4. Trigger a manual panic.
5. Observe that dump collection starts and then fails with WRITE(10) I/O errors.
6. Check the firmware logs, which report invalidSgl=1 for the failing WRITE(10).
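
To make the question under "Request for review" concrete, here is a minimal
sketch of what "an SGL that fully describes the transfer" could mean. Everything
in it is hypothetical: the (address, length) tuple model, the function name
sgl_problems, the 4KB segment size, and the example addresses are illustrative
assumptions, and the checks are not taken from the mrsas driver or from the
firmware's actual invalidSgl validation, which we have not seen.

```python
from typing import List, Tuple

# Hypothetical model: one scatter-gather entry is (bus_address, length_in_bytes).
Sge = Tuple[int, int]


def sgl_problems(sgl: List[Sge], expected_bytes: int) -> List[str]:
    """Return human-readable reasons why sgl does not describe expected_bytes."""
    problems = []
    if not sgl:
        problems.append("SGL is empty")
    for i, (addr, length) in enumerate(sgl):
        if length == 0:
            problems.append(f"entry {i} has zero length")
        if addr == 0:
            problems.append(f"entry {i} has a null address")
    total = sum(length for _, length in sgl)
    if total != expected_bytes:
        problems.append(f"SGL covers {total:#x} bytes, expected {expected_bytes:#x}")
    return problems


PAGE = 4096  # assumed segment size for the example
# Sixteen made-up 4KB segments cover the 64KB (0x10000 byte) dump write.
good = [(0x1000_0000 + i * PAGE, PAGE) for i in range(16)]
# Dropping one segment leaves the request only partly described.
short = good[:15]

print(sgl_problems(good, 0x10000))   # []
print(sgl_problems(short, 0x10000))  # ['SGL covers 0xf000 bytes, expected 0x10000']
```

A check of this kind, applied to the SGL the driver builds for the dump-time
WRITE(10), might help narrow down whether the entry count, the lengths, or the
addresses differ from what the runtime path produces for the same transfer size.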