Date: Thu, 12 Oct 2023 20:40:57 -0700
From: Pete Wright <pete@nomadlogic.org>
To: freebsd-current@freebsd.org
Subject: nvme timeout issues with hardware and bhyve vm's
Message-ID: <90d3e532-8ea7-4eea-8e31-8c363285a156@nomadlogic.org>

hey there - i was curious if anyone has had issues with nvme devices recently. i'm chasing down similar issues on my workstation, which has a physical NVMe zroot, and on a bhyve VM, which has a large pool exposed as an NVMe device (and is backed by a zvol).

on the most recent bhyve issue the VM reported this:

Oct 13 02:52:52 emby kernel: nvme1: RECOVERY_START 13737432416007567 vs 13737432371683671
Oct 13 02:52:52 emby kernel: nvme1: RECOVERY_START 13737432718499597 vs 13737432371683671
Oct 13 02:52:52 emby kernel: nvme1: timeout with nothing complete, resetting
Oct 13 02:52:52 emby kernel: nvme1: Resetting controller due to a timeout.
Oct 13 02:52:52 emby kernel: nvme1: RECOVERY_WAITING
Oct 13 02:52:52 emby kernel: nvme1: resetting controller
Oct 13 02:52:53 emby kernel: nvme1: waiting
Oct 13 02:53:23 emby syslogd: last message repeated 114 times
Oct 13 02:53:23 emby kernel: nvme1: controller ready did not become 1 within 30500 ms
Oct 13 02:53:23 emby kernel: nvme1: failing outstanding i/o
Oct 13 02:53:23 emby kernel: nvme1: WRITE sqid:1 cid:119 nsid:1 lba:4968850592 len:256
Oct 13 02:53:23 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:1 sqid:1 cid:119 cdw0:0
Oct 13 02:53:23 emby kernel: nvme1: failing outstanding i/o
Oct 13 02:53:23 emby kernel: nvme1: WRITE sqid:6 cid:0 nsid:1 lba:5241952432 len:32
Oct 13 02:53:23 emby kernel: nvme1: WRITE sqid:3 cid:123 nsid:1 lba:4968850336 len:256
Oct 13 02:53:23 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:1 sqid:3 cid:123 cdw0:0
Oct 13 02:53:23 emby kernel: nvme1: WRITE sqid:3 cid:0 nsid:1 lba:5242495888 len:256
Oct 13 02:53:23 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:3 cid:0 cdw0:0
Oct 13 02:53:23 emby kernel: nvme1: READ sqid:3 cid:0 nsid:1 lba:528 len:16
Oct 13 02:53:23 emby kernel: nvme1: WRITE sqid:5 cid:0 nsid:1 lba:4934226784 len:96
Oct 13 02:53:23 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:3 cid:0 cdw0:0
Oct 13 02:53:23 emby kernel: nvme1: READ sqid:3 cid:0 nsid:1 lba:6442449936 len:16
Oct 13 02:53:25 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:3 cid:0 cdw0:0
Oct 13 02:53:25 emby kernel: nvme1: READ sqid:3 cid:0 nsid:1 lba:6442450448 len:16
Oct 13 02:53:25 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:3 cid:0 cdw0:0
Oct 13 02:53:25 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:5 cid:0 cdw0:0
Oct 13 02:53:25 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:6 cid:0 cdw0:0
Oct 13 02:53:25 emby kernel: nvd1: detached

I had similar issues on my workstation as well. Scrubbing the NVMe device on my real-hardware workstation hasn't turned up any issues, but the system has locked up a handful of times.

Just curious if others have seen the same, or if someone could point me in the right direction...

thanks!
-pete

-- 
Pete Wright
pete@nomadlogic.org
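
When comparing a log like the one above against other reports, it can help to tally the aborted commands per submission queue. The following is a minimal Python sketch, assuming syslog lines in the format shown in the log; the function name `count_aborts` and the regular expression are illustrative and not part of any FreeBSD tool.

```python
import re
import sys
from collections import Counter

# Matches completion lines such as:
#   nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:1 sqid:1 cid:119 cdw0:0
# The pattern is an assumption based on the log format in this message.
ABORT_RE = re.compile(
    r"(?P<dev>nvme\d+): ABORTED - BY REQUEST \([0-9a-fA-F]{2}/[0-9a-fA-F]{2}\)"
    r".*?sqid:(?P<sqid>\d+) cid:(?P<cid>\d+)"
)


def count_aborts(lines):
    """Return a Counter keyed by (device, submission queue id)."""
    counts = Counter()
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            counts[(match.group("dev"), int(match.group("sqid")))] += 1
    return counts


if __name__ == "__main__":
    # Read syslog text from stdin and print one summary line per queue.
    for (dev, sqid), n in sorted(count_aborts(sys.stdin).items()):
        print(f"{dev} sqid {sqid}: {n} aborted")
```

Saved under a hypothetical name such as `nvme_aborts.py`, it could be run as `python3 nvme_aborts.py < /var/log/messages`, assuming kernel messages are logged to that file.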