Date: Sun, 15 Mar 2026 10:31:50 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 293830] ahci: AMD SB7x0/SB8x0/SB9x0 unstable with MSI enabled (0x43911002)
Message-ID: <bug-293830-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=293830

            Bug ID: 293830
           Summary: ahci: AMD SB7x0/SB8x0/SB9x0 unstable with MSI enabled (0x43911002)
           Product: Base System
           Version: 14.4-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: vadlerg@freemail.hu

Created attachment 268818
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=268818&action=edit
patch for ahci: disable MSI for AMD SB7x0/SB8x0/SB9x0 (0x43911002) without disabling PMP

Hardware: AMD SB7x0/SB8x0/SB9x0 AHCI controller
PCI ID: 0x43911002
Problem: Disk drops offline under load when MSI interrupt mode is used.
Observation: Switching quirk from AHCI_Q_1MSI to AHCI_Q_NOMSI fixes the problem.
Test result: System stable after kernel rebuild and heavy disk load.
Patch attached.

I have an HP N40L server that has been running FreeBSD for ages. I had a
problem with AHCI devices dropping out, but solved it some time ago by adding
hw.pci.enable_msi="0" to loader.conf, which disables MSI for every PCI device.
Over the years I forgot about the problem. Recently I updated to 14.4,
reviewed the system configuration files, and removed the ominous PCI MSI
disable line.

My system began to produce pool dropouts like:

Mar 11 18:58:05 ZFSguru kernel: ada4 at ahcich4 bus 0 scbus5 target 0 lun 0
Mar 11 18:58:05 ZFSguru kernel: ada4: <ST16000VE000-2L2103 EV02> s/n ZL29XB7L detached
Mar 11 18:58:22 ZFSguru kernel: Solaris: WARNING: Pool 'DOWN' has encountered an uncorrectable I/O failure and has been suspended.
Mar 11 18:58:22 ZFSguru kernel:
Mar 11 18:58:22 ZFSguru ZFS[16228]: pool I/O failure, zpool=DOWN error=6
Mar 11 18:58:22 ZFSguru ZFS[16232]: catastrophic pool I/O failure, zpool=DOWN

At first I had forgotten about the removed line and did not find the culprit.
The SMART values and everything else were OK, but the pools failed under
stress despite adding

hint.ahci.0.msi="0"
hint.ahci.0.ccc="0"
hint.ahcich.4.sata_rev="2"

to device hints.

Finally I remembered and put the hw.pci.enable_msi="0" line back into
loader.conf, and the problem was solved again.

I investigated further and found a patch for the ahci driver from 2018 which
has not yet made it into the main codebase. It disables the MSI and PMP (port
multiplier) functions for the chipset. Since I do not have any problem with
port multipliers, I tested a kernel that disables only MSI, and voilà, the
pools work without dropouts and without additional loader.conf or
device.hints lines.

The chipset I am talking about:

pciconf -lvbc | egrep -A4 -B2 'class=0x010601|AHCI|SATA'
    ecap 000b[100] = Vendor [1] ID 0001 Rev 1 Length 16
    ecap 0002[110] = VC 1 max VC0
ahci0@pci0:0:17:0: class=0x010601 rev=0x40 hdr=0x00 vendor=0x1002 device=0x4391 subvendor=0x103c subdevice=0x1609
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]'
    class      = mass storage
    subclass   = SATA
    bar   [10] = type I/O Port, range 32, base 0xc000, size 8, enabled
    bar   [14] = type I/O Port, range 32, base 0xb000, size 4, enabled
    bar   [18] = type I/O Port, range 32, base 0xa000, size 8, enabled
    bar   [1c] = type I/O Port, range 32, base 0x9000, size 4, enabled
--
    bar   [24] = type Memory, range 32, base 0xfe4ffc00, size 1024, enabled
    cap 05[50] = MSI supports 8 messages, 64 bit
    cap 12[70] = SATA Index-Data Pair
    cap 13[a4] = PCI Advanced Features: FLR TP
ohci0@pci0:0:18:0: class=0x0c0310 rev=0x00 hdr=0x00 vendor=0x1002 device=0x4397 subvendor=0x103c subdevice=0x1609
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0/SB8x0/SB9x0 USB OHCI0 Controller'

The patch inserts NOMSI instead of 1MSI:

sed -i '' -e '/{0x43911002, 0x00, "AMD SB7x0\/SB8x0\/SB9x0",/{
n
s/AHCI_Q_ATI_PMP_BUG | AHCI_Q_1MSI/AHCI_Q_NOMSI | AHCI_Q_ATI_PMP_BUG/
}' sys/dev/ahci/ahci_pci.c

diff --git a/sys/dev/ahci/ahci_pci.c b/sys/dev/ahci/ahci_pci.c
@@
 	{0x43911002, 0x00, "AMD SB7x0/SB8x0/SB9x0",
-	    AHCI_Q_ATI_PMP_BUG | AHCI_Q_1MSI},
+	    AHCI_Q_NOMSI | AHCI_Q_ATI_PMP_BUG},

-- 
You are receiving this mail because:
You are the assignee for the bug.
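
To illustrate what the one-line quirk change in the report is meant to do, here is a minimal, self-contained sketch in C. It is not the driver's code: the flag names are taken from the patch, but the bit values, the `quirk_entry` layout, and the helpers `lookup_quirks()` and `msi_vectors_for()` are assumptions made for illustration only. It assumes AHCI_Q_1MSI caps the controller at a single MSI message and AHCI_Q_NOMSI falls back to legacy interrupts.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values (assumption); the real ones are in sys/dev/ahci/ahci.h. */
#define AHCI_Q_NOMSI       0x1u
#define AHCI_Q_1MSI        0x2u
#define AHCI_Q_ATI_PMP_BUG 0x4u

/* Hypothetical table entry, loosely modeled on the ID table touched by the patch. */
struct quirk_entry {
	uint32_t    id;     /* device ID in the high 16 bits, vendor ID in the low 16 */
	uint8_t     rev;
	const char *name;
	uint32_t    quirks;
};

static const struct quirk_entry quirk_table[] = {
	/* Entry as it would read after the proposed patch. */
	{0x43911002, 0x00, "AMD SB7x0/SB8x0/SB9x0",
	    AHCI_Q_NOMSI | AHCI_Q_ATI_PMP_BUG},
	{0, 0, NULL, 0}     /* terminator */
};

/* Hypothetical helper: return the quirk flags for a PCI ID, or 0 if unknown. */
static uint32_t
lookup_quirks(uint32_t id)
{
	for (size_t i = 0; quirk_table[i].name != NULL; i++) {
		if (quirk_table[i].id == id)
			return (quirk_table[i].quirks);
	}
	return (0);
}

/*
 * Hypothetical helper: how many MSI messages to request, given the quirk
 * flags and the number the device advertises. 0 means "use legacy INTx".
 */
static int
msi_vectors_for(uint32_t quirks, int msi_supported)
{
	if (quirks & AHCI_Q_NOMSI)
		return (0);
	if (quirks & AHCI_Q_1MSI)
		return (msi_supported > 0 ? 1 : 0);
	return (msi_supported);
}
```

Under these assumptions, the controller from the report (which advertises 8 MSI messages) would get one message with the old flags, `msi_vectors_for(AHCI_Q_ATI_PMP_BUG | AHCI_Q_1MSI, 8)`, and none with the patched entry, `msi_vectors_for(lookup_quirks(0x43911002), 8)`, while the PMP workaround flag stays set in both cases.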
