From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 248906] LSI SAS2008 (mps) gets stuck in a reset loop when writing on AMD Epyc 3000
Date: Tue, 25 Aug 2020 17:58:53 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248906

            Bug ID: 248906
           Summary: LSI SAS2008 (mps) gets stuck in a reset loop when writing
                    on AMD Epyc 3000
           Product: Base System
           Version: Unspecified
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: paxswill@paxswill.com

Overview:

I'm trying to use an LSI SAS2008 based PCIe card with an AMD Epyc 3151 system.
Once I try to write anything to a drive connected to the card, the mps driver
appears to get stuck in a reset loop, repeating messages like this:

mps0: IOC Fault 0x40002622, Resetting
mps0: Reinitializing controller,
mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fbsd
mps0: IOCCapabilities: 1285c
mps0: mps_reinit finished sc 0xfffffe00014a9000 post 4 free 3
mps0: SAS Address for SATA device = 2a04546ea96c8bac
mps0: SAS Address from SATA device = 2a04546ea96c8bac
mps0: SAS Address for SATA device = d9413b15bbcdcc78
mps0: SAS Address from SATA device = d9413b15bbcdcc78

There's then a pause for a few seconds, and these messages are printed again
(none of the values change).

Reproduction Steps:

1. Set up hardware with a SuperMicro M11SDV-4C-LN4F and an LSI SAS2008 HBA PCIe
   card that's been reflashed to the IT firmware. Connect a SATA disk to the
   HBA.
2. Boot FreeBSD (off of install media, another disk, etc).
3. Once booted, check dmesg to see the name of the SATA disk (ex: da0).
4. Run `dd if=/dev/zero of=/dev/da0`

Expected:
Zeros are successfully written to the disk.

Actual:
The mps driver gets stuck in a reset loop.

Comments:

* I've tested two different cards (one reflashed by me, another bought off of
  eBay pre-flashed), and they both exhibit this issue.
* Ubuntu is able to use both cards.
* I've tested both an SSD and an HDD, with no difference.
* This machine is specifically running FreeNAS 11.3-U4.1 (FreeBSD 11.3p11
  equivalent). I encountered the same issue with FreeBSD 12.1-RELEASE as well.
* I haven't had a chance to try the cards in another (Intel) system yet, but
  will update this issue once I have.
* Reads work fine (tested with `dd if=/dev/da0 of=/temp/read_test`). The data
  is as expected.
* smartctl results:
  * Reading SMART values off of the drives works.
  * Running a background test succeeds.
  * Running a foreground test fails. After waiting 1 minute, smartctl exits.
    Checking the SMART test log shows that the test was "Interrupted (host
    reset)" without completing, and these messages are logged by the system:

(pass1:mps0:0:5:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 0c 00 d4 00 00 00 81 00 4f 00 c2 00 b0 00 length 0 SMID 700 Aborting command 0xfffffe00015246c0
mps0: Sending reset from mpssas_send_abort for target ID 5
mps0: Unfreezing devq for target ID The
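
The report notes that the reset messages repeat with identical values. One way to confirm that from a saved console log is to tally the "IOC Fault" lines per controller and fault code. The following is only a minimal sketch: `count_ioc_faults` is a hypothetical helper written for illustration, not part of the mps driver or of FreeBSD, and the sample text is abridged from the messages quoted above, repeated once as the report describes.

```python
import re
from collections import Counter

# Matches lines such as "mps0: IOC Fault 0x40002622, Resetting".
# The pattern is an assumption based on the messages quoted in this report.
FAULT_RE = re.compile(r"(mps\d+): IOC Fault (0x[0-9a-fA-F]+), Resetting")


def count_ioc_faults(log_text):
    """Return a Counter keyed by (device, fault_code) for each fault line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # Abridged sample: the same block of messages printed twice.
    sample = """\
mps0: IOC Fault 0x40002622, Resetting
mps0: Reinitializing controller,
mps0: mps_reinit finished sc 0xfffffe00014a9000 post 4 free 3
mps0: IOC Fault 0x40002622, Resetting
mps0: Reinitializing controller,
"""
    for (device, code), seen in count_ioc_faults(sample).items():
        print(f"{device}: fault {code} seen {seen} time(s)")
```

A single (device, fault code) pair with a steadily growing count would be consistent with the loop described here; several different codes would point to a different failure pattern.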