Date:      Wed, 21 Jun 2023 18:59:02 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 272135] hot-swap NVMe drive not consistently detected
Message-ID:  <bug-272135-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272135

            Bug ID: 272135
           Summary: hot-swap NVMe drive not consistently detected
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: emaste@freebsd.org

Test system has 24 2.5" slots. src tree is
194e059bb80334e6f4f791a186015b20d7f6f4b8 + some unrelated local changes. Test
NVMe drive is WD Blue SN570 500GB 234110WD.

The NVMe drive is not detected when installed at boot and not consistently
detected when inserted after boot. It is sometimes detected if removed from and
inserted into the same slot, and sometimes after moving to a different slot.
Additional detail to be added after further investigation.

Example insert/removal event timeline showing verbose kernel messages:

** boot with NVMe installed in slot 0

<no NVMe-related kernel messages>

** detach NVMe in slot 0

<no kernel messages>

** insert NVMe in slot 0

Jun  2 20:40:49 xxx kernel: pcib7: HotPlug interrupt: 0x48
Jun  2 20:40:49 xxx kernel: pcib7: Presence Detect Changed to card present
Jun  2 20:40:49 xxx kernel: pci24: <ACPI PCI bus> numa-domain 0 on pcib7
Jun  2 20:40:49 xxx kernel: pcib7: allocated bus range (65-65) for rid 0 of pci24
Jun  2 20:40:49 xxx kernel: pci24: domain=0, physical bus=65
Jun  2 20:40:49 xxx kernel: pcib7: HotPlug interrupt: 0x140
Jun  2 20:40:49 xxx kernel: pcib7: Data Link Layer State Changed to active
Jun  2 20:40:49 xxx kernel: pcib7: HotPlug interrupt: 0x140
Jun  2 20:40:49 xxx kernel: pcib7: Data Link Layer State Changed to active

** detach NVMe in slot 0

Jun  2 20:41:32 qrb16 kernel: pcib7: HotPlug interrupt: 0x8
Jun  2 20:41:32 qrb16 kernel: pcib7: Presence Detect Changed to empty
Jun  2 20:41:32 qrb16 kernel: pci24: detached

** insert NVMe in slot 0

<no kernel messages>

** detach NVMe in slot 0

<no kernel messages>

** insert NVMe in slot 1

Jun  2 20:43:35 xxx kernel: pcib8: HotPlug interrupt: 0x48
Jun  2 20:43:35 xxx kernel: pcib8: Presence Detect Changed to card present
Jun  2 20:43:35 xxx kernel: pcib8: Missed HotPlug interrupt waiting for DLL Active
Jun  2 20:43:35 xxx kernel: pcib8: HotPlug interrupt: 0x140
Jun  2 20:43:35 xxx kernel: pcib8: Data Link Layer State Changed to active
Jun  2 20:43:35 xxx kernel: pci24: <ACPI PCI bus> numa-domain 0 on pcib8
Jun  2 20:43:35 xxx kernel: pcib8: allocated bus range (66-66) for rid 0 of pci24
Jun  2 20:43:35 xxx kernel: pci24: domain=0, physical bus=66
Jun  2 20:43:35 xxx kernel: found->     vendor=0x15b7, dev=0x501a, revid=0x00
Jun  2 20:43:35 xxx kernel:     domain=0, bus=66, slot=0, func=0
Jun  2 20:43:35 xxx kernel:     class=01-08-02, hdrtype=0x00, mfdev=0
Jun  2 20:43:35 xxx kernel:     cmdreg=0x0000, statreg=0x0010, cachelnsz=0 (dwords)
Jun  2 20:43:35 xxx kernel:     lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
Jun  2 20:43:35 xxx kernel:     intpin=a, irq=255
Jun  2 20:43:35 xxx kernel:     powerspec 3  supports D0 D3  current D0
Jun  2 20:43:35 xxx kernel:     MSI supports 32 messages, 64 bit
Jun  2 20:43:35 xxx kernel:     MSI-X supports 17 messages in maps 0x10 and 0x20
Jun  2 20:43:35 xxx kernel:     map[10]: type Memory, range 64, base 0, size 14, memory disabled
Jun  2 20:43:35 xxx kernel:     map[20]: type Memory, range 64, base 0, size 8, memory disabled
Jun  2 20:43:35 xxx kernel: nvme1: <Generic NVMe Device> at device 0.0 numa-domain 0 on pci24
Jun  2 20:43:35 xxx kernel: pcib6: allocated type 3 (0xce000000-0xce0fffff) for rid 20 of pcib8
Jun  2 20:43:35 xxx kernel: pcib8: allocated initial memory window of 0xce000000-0xce0fffff
Jun  2 20:43:35 xxx kernel: pcib8: allocated memory range (0xce000000-0xce003fff) for rid 10 of nvme1
Jun  2 20:43:35 xxx kernel: nvme1: Lazy allocation of 0x4000 bytes rid 0x10 type 3 at 0xce000000
Jun  2 20:43:35 xxx kernel: pcib8: allocated memory range (0xce004000-0xce0040ff) for rid 20 of nvme1
Jun  2 20:43:35 xxx kernel: nvme1: Lazy allocation of 0x100 bytes rid 0x20 type 3 at 0xce004000
Jun  2 20:43:35 xxx kernel: nvme1: attempting to allocate 17 MSI-X vectors (17 supported)
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 340 to local APIC 52 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 341 to local APIC 54 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 342 to local APIC 56 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 343 to local APIC 58 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 344 to local APIC 60 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 345 to local APIC 62 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 346 to local APIC 64 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 347 to local APIC 66 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 348 to local APIC 68 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 349 to local APIC 70 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 350 to local APIC 72 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 351 to local APIC 74 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 352 to local APIC 76 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 353 to local APIC 78 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 354 to local APIC 80 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 355 to local APIC 82 vector 48
Jun  2 20:43:35 xxx kernel: msi: routing MSI-X IRQ 356 to local APIC 84 vector 48
Jun  2 20:43:35 xxx kernel: nvme1: using IRQs 340
Jun  2 20:43:35 xxx kernel: -356
Jun  2 20:43:35 xxx kernel:  for MSI-X
Jun  2 20:43:35 xxx kernel: nvme1: CapLo: 0x140103ff: MQES 1023, CQR, TO 20
Jun  2 20:43:35 xxx kernel: nvme1: CapHi: 0x00000030: DSTRD 0, NSSRS, CSS 1, MPSMIN 0, MPSMAX 0
Jun  2 20:43:35 xxx kernel: nvme1: Version: 0x00010400: 1.4
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 341 to local APIC 17 vector 48
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 342 to local APIC 49 vector 48
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 343 to local APIC 81 vector 48
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 344 to local APIC 113 vector 48
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 345 to local APIC 145 vector 48
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 346 to local APIC 177 vector 48
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 347 to local APIC 209 vector 48
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 348 to local APIC 241 vector 48
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 349 to local APIC 17 vector 49
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 350 to local APIC 49 vector 49
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 351 to local APIC 81 vector 49
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 352 to local APIC 113 vector 49
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 353 to local APIC 145 vector 49
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 354 to local APIC 177 vector 49
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 355 to local APIC 209 vector 49
Jun  2 20:43:35 xxx kernel: msi: Assigning MSI-X IRQ 356 to local APIC 241 vector 49
Jun  2 20:43:35 xxx kernel: nvme1: Allocated 200MB host memory buffer
Jun  2 20:43:35 xxx kernel: nda0 at nvme1 bus 0 scbus16 target 0 lun 1
Jun  2 20:43:35 xxx kernel: nda0:
Jun  2 20:43:35 xxx kernel: <WD Blue SN570 500GB 234110WD xxx>
Jun  2 20:43:35 xxx kernel: nda0: Serial Number xxx
Jun  2 20:43:35 xxx kernel: nda0: nvme version 1.4 x4 (max x4) lanes PCIe Gen3 (max Gen3) link
Jun  2 20:43:35 xxx kernel: nda0: 476940MB (976773168 512 byte sectors)
Jun  2 20:43:35 xxx kernel: GEOM: new disk nda0
Jun  2 20:43:35 xxx kernel: pass1 at nvme1 bus 0 scbus16 target 0 lun 1
Jun  2 20:43:35 xxx kernel: pass1: <WD Blue SN570 500GB 234110WD 22455X805770>
Jun  2 20:43:35 xxx kernel: pass1: Serial Number xxx
Jun  2 20:43:35 xxx kernel: pass1: nvme version 1.4 x4 (max x4) lanes PCIe Gen3 (max Gen3) link
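
The "HotPlug interrupt" values in the timeline (0x48, 0x140, 0x8) look like raw
PCI Express Slot Status register contents. A minimal Python sketch for decoding
them follows, assuming that is what pcib prints and using the Slot Status bit
layout from the PCI Express specification. The names SLOT_STATUS_BITS and
decode_slot_status are illustrative only and are not FreeBSD identifiers.

```python
# Assumption: the logged value is the 16-bit PCIe Slot Status register.
# Bit positions follow the PCI Express Slot Status register layout.
SLOT_STATUS_BITS = {
    0x001: "Attention Button Pressed",
    0x002: "Power Fault Detected",
    0x004: "MRL Sensor Changed",
    0x008: "Presence Detect Changed",
    0x010: "Command Completed",
    0x020: "MRL Sensor State",
    0x040: "Presence Detect State",
    0x080: "Electromechanical Interlock Status",
    0x100: "Data Link Layer State Changed",
}


def decode_slot_status(value):
    """Return the names of the Slot Status bits set in value, lowest bit first."""
    return [name for mask, name in SLOT_STATUS_BITS.items() if value & mask]


if __name__ == "__main__":
    # Values taken from the kernel messages in this report.
    for raw in (0x48, 0x140, 0x8):
        print(f"{raw:#x}: {', '.join(decode_slot_status(raw))}")
```

Under that assumption, 0x48 is Presence Detect Changed with Presence Detect
State set, 0x140 is Data Link Layer State Changed with Presence Detect State
set, and 0x8 is Presence Detect Changed with Presence Detect State clear. That
agrees with the "card present", "active", and "empty" messages that follow each
interrupt in the log.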

-- 
You are receiving this mail because:
You are the assignee for the bug.


