Date: Mon, 04 Nov 2024 17:28:42 +0000
From: "Dave Cottlehuber" <dch@FreeBSD.org>
To: freebsd-fs <freebsd-fs@freebsd.org>
Subject: nvme device errors & zfs
Message-ID: <3293802b-3785-4715-8a6b-0802afb6f908@app.fastmail.com>

What's the best way to see error counters or states on an nvme device?

I have a typical mirrored nvme zpool that reported enough errors in a
burst last week that one drive dropped off the bus [1].

After a reboot, it resilvered, I cleared the errors, and it seems fine
according to repeated scrubs and a few days of use.

I was unable to see any errors from the nvme drive itself, but as it's
(just) in warranty for 2 more weeks I'd like to know if I should return it.

I installed `sysutils/nvme-cli` from ports and didn't see anything
of note there either:

    $ doas nvme smart-log /dev/nvme1
    0xc0484e41: opc: 0x2 fuse: 0 cid 0 nsid:0xffffffff cmd2: 0 cmd3: 0
              : cdw10: 0x7f0002 cdw11: 0 cdw12: 0 cdw13: 0
              : cdw14: 0 cdw15: 0 len: 0x200 is_read: 0
    <--- 0 cid: 0 status 0
    Smart Log for NVME device:nvme1 namespace-id:ffffffff
    critical_warning                    : 0
    temperature                         : 39 C
    available_spare                     : 100%
    available_spare_threshold           : 10%
    percentage_used                     : 3%
    data_units_read                     : 121681067
    data_units_written                  : 86619659
    host_read_commands                  : 695211450
    host_write_commands                 : 2187823697
    controller_busy_time                : 2554
    power_cycles                        : 48
    power_on_hours                      : 6342
    unsafe_shutdowns                    : 38
    media_errors                        : 0
    num_err_log_entries                 : 0
    Warning Temperature Time            : 0
    Critical Composite Temperature Time : 0
    Temperature Sensor 1                : 39 C
    Temperature Sensor 2                : 43 C
    Thermal Management T1 Trans Count   : 0
    Thermal Management T2 Trans Count   : 0
    Thermal Management T1 Total Time    : 0
    Thermal Management T2 Total Time    : 0

[1]: zpool status

    status: One or more devices are faulted in response to persistent errors.
            Sufficient replicas exist for the pool to continue functioning in a
            degraded state.
    action: Replace the faulted device, or use 'zpool clear' to mark the device
            repaired.
      scan: scrub repaired 0B in 00:17:59 with 0 errors on Thu Oct 31 16:24:36 2024
    config:

            NAME          STATE     READ WRITE CKSUM
            zroot         DEGRADED     0     0     0
              mirror-0    DEGRADED     0     0     0
                gpt/zfs0  ONLINE       0     0     0
                gpt/zfs1  FAULTED      0     0     0  too many errors

A+
Dave
———
O for a muse of fire, that would ascend the brightest heaven of invention!
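
To make "nothing of note" in the smart-log output above a little more concrete, here is a minimal Python sketch that parses the `name : value` lines of that text output and lists any counters that look suspicious. The function names (`parse_smart_log`, `health_warnings`) are hypothetical, and the choice of which counters are expected to be zero is an assumption for illustration, not something taken from the nvme-cli documentation. The sample text is copied from the output quoted in the message. As the message itself shows, an empty result here does not explain why ZFS faulted the device.

```python
import re

# Sample copied from the smart-log output quoted in the message (shortened).
SAMPLE = """\
Smart Log for NVME device:nvme1 namespace-id:ffffffff
critical_warning : 0
temperature : 39 C
available_spare : 100%
available_spare_threshold : 10%
percentage_used : 3%
power_cycles : 48
unsafe_shutdowns : 38
media_errors : 0
num_err_log_entries : 0
"""

# Assumption: these counters should stay at zero on a healthy drive.
ZERO_EXPECTED = ("critical_warning", "media_errors", "num_err_log_entries")

# A field line is "<name> : <integer>[unit]"; the name must start with a letter,
# which skips the hex command dump and the "device:nvme1" banner line.
FIELD_RE = re.compile(r"^\s*([A-Za-z][\w ]*?)\s*:\s*(\d+)")


def parse_smart_log(text):
    """Return a dict mapping each field name to its leading integer value."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def health_warnings(fields):
    """Return human-readable warnings for counters that look suspicious."""
    warnings = []
    for name in ZERO_EXPECTED:
        if fields.get(name, 0) != 0:
            warnings.append(f"{name} = {fields[name]}")
    spare = fields.get("available_spare")
    threshold = fields.get("available_spare_threshold")
    if spare is not None and threshold is not None and spare < threshold:
        warnings.append(f"available_spare {spare}% below threshold {threshold}%")
    return warnings


if __name__ == "__main__":
    parsed = parse_smart_log(SAMPLE)
    problems = health_warnings(parsed)
    print(problems if problems else "no SMART counters flagged")
```

The parser works on captured text only, so it can be fed saved output from either drive in the mirror for comparison.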