FreeBSD Mail Archives

Date:      Mon, 26 Oct 2020 22:50:18 +0100
From:      Juraj Lutter <juraj@lutter.sk>
To:        freebsd-stable@freebsd.org
Subject:   Interrupt problems(?) on Dell R740xd
Message-ID:  <9FD07762-5744-480C-A289-DDB09730A74D@lutter.sk>


Hi,

on a Dell R740xd with:
- 22x nvm0: Dell Express Flash PM1725b 1.6TB SFF
- 2x ATA SSDSC2KG240G8R
- 2 package(s) x 8 core(s) x 2 hardware threads
- 256GB RAM

running 12.2-STABLE r367058, I've run into a problem where, after some
time, the machine locks up in certain operations (mkdir, for example,
but not always the same one). In top output, entries like these can be
seen:

   12 root        -80    -     0B  7936K WAIT     0   0:05   0.00% intr{irq48: pcib12+++}
   12 root        -88    -     0B  7936K WAIT     6   0:05   0.00% intr{irq16: ahci0 xhci0*}
   12 root        -80    -     0B  7936K WAIT     8   0:05   0.00% intr{irq53: pcib16++}
   12 root        -80    -     0B  7936K WAIT    12   0:05   0.00% intr{irq54: pcib17++}

For example, running poudriere:
4124  1  I+      0:00.21 /usr/local/libexec/poudriere/sh -e /usr/local/share/poudriere/bulk.sh
4217  1  D+      0:00.00 cap_mkdb /poudriere/build/data/.m/12sgx64-default/ref/etc/login.conf

And then even the root pool gets checksum errors, with a subsequent
scrub needed:
Oct 26 11:55:42 bnts-nvs-n1 ZFS[4117]: pool I/O failure, zpool=$zroot error=$97
Oct 26 11:55:42 bnts-nvs-n1 ZFS[4118]: checksum mismatch, zpool=$zroot path=$/dev/da0p3 offset=$30089228288 size=$53248
Oct 26 11:55:42 bnts-nvs-n1 ZFS[4119]: checksum mismatch, zpool=$zroot path=$/dev/da1p3 offset=$30089228288 size=$53248
Oct 26 11:55:49 bnts-nvs-n1 ZFS[4121]: pool I/O failure, zpool=$zroot error=$97
Oct 26 11:56:26 bnts-nvs-n1 ZFS[4239]: pool I/O failure, zpool=$zroot error=$97
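
Both checksum mismatches above report the same offset and size, once on
da0p3 and once on da1p3. Grouping the log lines makes that kind of
pattern easier to spot. The following is only a minimal sketch, assuming
the syslog lines have the decoded form `checksum mismatch, zpool=$...
path=$... offset=$... size=$...`; the function name `group_mismatches`
is hypothetical, and the sample lines are the two mismatch lines quoted
above.

```python
import re
from collections import defaultdict

# Assumed line format (taken from the log excerpt in this message):
#   ... ZFS[4118]: checksum mismatch, zpool=$zroot path=$/dev/da0p3 offset=$30089228288 size=$53248
MISMATCH_RE = re.compile(
    r"checksum mismatch, zpool=\$(?P<pool>\S+) path=\$(?P<path>\S+) "
    r"offset=\$(?P<offset>\d+) size=\$(?P<size>\d+)"
)


def group_mismatches(lines):
    """Map (pool, offset, size) to the list of device paths reporting it."""
    groups = defaultdict(list)
    for line in lines:
        m = MISMATCH_RE.search(line)
        if m:
            key = (m["pool"], int(m["offset"]), int(m["size"]))
            groups[key].append(m["path"])
    return dict(groups)


SAMPLE = [
    "Oct 26 11:55:42 bnts-nvs-n1 ZFS[4118]: checksum mismatch, "
    "zpool=$zroot path=$/dev/da0p3 offset=$30089228288 size=$53248",
    "Oct 26 11:55:42 bnts-nvs-n1 ZFS[4119]: checksum mismatch, "
    "zpool=$zroot path=$/dev/da1p3 offset=$30089228288 size=$53248",
]

if __name__ == "__main__":
    for (pool, offset, size), paths in group_mismatches(SAMPLE).items():
        print(pool, offset, size, ", ".join(paths))
```

An offset that shows up on more than one device at the same timestamp
may point at something the devices have in common rather than at a
single failing disk, though the sketch itself cannot confirm that.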

This all happens when "increased" I/O is going via mrsas-attached disks:
AVAGO MegaRAID SAS FreeBSD mrsas driver version: 07.709.04.00-fbsd
mrsas0: <AVAGO Invader SAS Controller> port 0x4000-0x40ff mem 0x9db00000-0x9db0ffff,0x9da00000-0x9dafffff irq 32 at device 0.0 numa-domain 0 on pci4
mrsas0: FW now in Ready state
mrsas0: Using MSI-X with 32 number of vectors
mrsas0: FW supports <96> MSIX vector,Online CPU 32 Current MSIX <32>
mrsas0: max sge: 0x46, max chain frame size: 0x400, max fw cmd: 0x39f
mrsas0: Issuing IOC INIT command to FW.
mrsas0: IOC INIT response received from FW.
mrsas0: System PD created target ID: 0x0
mrsas0: System PD created target ID: 0x1
mrsas0: FW supports: UnevenSpanSupport=1
mrsas0: max_fw_cmds: 927  max_scsi_cmds: 911
mrsas0: MSI-x interrupts setup success
mrsas0: mrsas_ocr_thread

Internal disks are:
<ATA SSDSC2KG240G8R DL67>          at scbus17 target 0 lun 0 (pass2,da0)
<ATA SSDSC2KG240G8R DL67>          at scbus17 target 1 lun 0 (pass3,da1)

Example:
da0 at mrsas0 bus 1 scbus17 target 0 lun 0
da0: <ATA SSDSC2KG240G8R DL67> Fixed Direct Access SPC-4 SCSI device
da0: Serial Number BTYG01730DP5240AGN
da0: 150.000MB/s transfers
da0: 228936MB (468862128 512 byte sectors)

Internal AHCI is:
pci0: <ACPI PCI bus> numa-domain 0 on pcib0
pci0: <dasp, performance counters> at device 8.1 (no driver attached)
pci0: <unknown> at device 17.0 (no driver attached)
ahci0: <Intel Lewisburg AHCI SATA controller>
ahci0: AHCI v1.31 with 6 6Gbps ports, Port Multiplier not supported
ahci1: <Intel Lewisburg AHCI SATA controller>
ahci1: AHCI v1.31 with 8 6Gbps ports, Port Multiplier not supported

sesutil map excerpt:
ses0:
        Enclosure Name: AHCI SGPIO Enclosure 2.00
        Enclosure ID: 3061686369656d30
        Element 0, Type: Array Device Slot
                Status: Unsupported (0x00 0x00 0x00 0x00)
                Description: Drive Slots


NVME disks are:
nda0 at nvme0 bus 0 scbus19 target 0 lun 1
nda0: <Dell Express Flash PM1725b 1.6TB SFF 1.1.0 S5CUNA0N201038>
nda0: Serial Number S5CUNA0N201038
nda0: nvme version 1.2 x4 (max x4) lanes PCIe Gen3 (max Gen3) link
nda0: 1526185MB (3125627568 512 byte sectors)

The machine also has 4x bge and 4x bnxt.

With hw.pci.enable_msi="0" set, it's slightly better; with
hw.pci.enable_msi="1", it happens more often and under even lower load.
enable_msix is set to 1 in both cases.

Once the machine locks up, one or more of the following also appears:
bge2: Interface stopped DISTRIBUTING, possible flapping - this might be
caused by a stuck interrupt(?)
nvme0: Missing interrupt

The only way out is to reboot.
So I wonder: what steps could I take to narrow down the source of the
problem?
The machine is not yet in production, so I can even try -CURRENT on it
as a last resort.
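
One possible way to narrow this down is to take two `vmstat -i`
snapshots a few seconds apart and list the interrupt sources whose
counters did not advance. The following is only a minimal sketch,
assuming the usual `vmstat -i` layout of name, total and rate per line.
The function names are hypothetical, and the two sample snapshots are
made-up numbers for illustration, not output from this machine.

```python
def parse_vmstat_i(text):
    """Parse `vmstat -i` style text into {interrupt name: total count}."""
    counts = {}
    for line in text.splitlines():
        # The name may contain spaces, so split the two numeric
        # columns off from the right-hand side.
        parts = line.rsplit(None, 2)
        if len(parts) != 3 or not parts[1].isdigit():
            continue  # header line, blank line, or unexpected format
        name, total, _rate = parts
        name = name.strip()
        if name == "Total":
            continue  # summary row, not an interrupt source
        counts[name] = int(total)
    return counts


def stalled_sources(before, after):
    """Return names present in both snapshots whose total did not change."""
    a = parse_vmstat_i(before)
    b = parse_vmstat_i(after)
    return sorted(name for name in a if name in b and b[name] == a[name])


# Hypothetical snapshots (illustrative values only).
BEFORE = """\
interrupt                          total       rate
irq16: ahci0 xhci0*                 1200         10
irq48: pcib12+++                     500          4
cpu0:timer                         90000        750
Total                              91700        764
"""

AFTER = """\
interrupt                          total       rate
irq16: ahci0 xhci0*                 1260         10
irq48: pcib12+++                     500          4
cpu0:timer                         94500        750
Total                              96260        764
"""

if __name__ == "__main__":
    for name in stalled_sources(BEFORE, AFTER):
        print("no new interrupts:", name)
```

A counter that does not move is not proof of a stuck interrupt, since an
idle device raises none either; the comparison is only meaningful for
devices that should be busy at the time of the lockup.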

One thing I'm also considering is disabling USB, so that it does not
share interrupt(s) with ahci.
The weird thing is that the machine can survive a full buildworld with
1 make job, but not with 32 or even 16.

Has anyone come across something like this?
Any hints are welcome.

Thanks.

--
Juraj Lutter
XMPP: juraj (at) lutter.sk
GSM: +421907986576

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?9FD07762-5744-480C-A289-DDB09730A74D>
