Date: Thu, 8 Jun 2017 14:13:01 -0700
From: Tim Gustafson <tjg@ucsc.edu>
To: freebsd-fs@freebsd.org
Subject: ZFS Commands In "D" State
Message-ID: <CAPyBAS6Du=059Pef=wUC9D61mUEd+JXBxt6ydAdjypVYJptnvQ@mail.gmail.com>
We have a ZFS server that we've been running for a few months now.  The
server is a backup server that receives ZFS sends from its primary daily.
This mechanism has been working for us on several pairs of servers for
years in general, and for several months with this particular piece of
hardware.

A few days ago, our nightly ZFS send failed.  When I looked at the server,
I saw that the "zfs receive" command was in a "D" wait state:

    1425  -  D       0:02.75 /sbin/zfs receive -v -F backup/export

I rebooted the system, checked that "zpool status" and "zfs list" both came
back correctly (which they did), and then re-started the "zfs send" on the
master server.  At first, the "zfs receive" command did not enter the "D"
state, but once the master server started sending actual data (which I was
able to ascertain because I was doing "zfs send" with the -v option), the
receiving process entered the "D" state again, and another reboot was
required.  Only about 2MB of data got sent before this happened.

I've rebooted several times, always with the same result.

I did a "zpool scrub os" (there's a separate zpool for the OS to live on)
and that completed in a few minutes, but when I did a "zpool scrub backup",
that process immediately went into the "D+" state:

    895   0  D+      0:00.04 zpool scrub backup

We run smartd on this device, and that is showing no disk errors.

The devd process is logging some stuff, but it doesn't appear to be very
helpful:

Jun  8 13:52:49 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=11754027336427262018
Jun  8 13:52:49 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=11367786800631979308
Jun  8 13:52:49 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=18407069648425063426
Jun  8 13:52:49 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=9496839124651172990
Jun  8 13:52:49 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=332784898986906736
Jun  8 13:52:50 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=16384086680948393578
Jun  8 13:52:50 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=10762348983543761591
Jun  8 13:52:50 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=8585274278710252761
Jun  8 13:52:50 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=17456777842286400332
Jun  8 13:52:50 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=10533897485373019500

No word on which state it changed "from" or "to".  Also, the system only
has three vdevs (the OS one, and then two raidz2 vdevs that make up the
"backup" pool), so I'm not sure how it's coming up with more than three
vdev GUIDs.

What's my next step in diagnosing this?

-- 

Tim Gustafson
BSOE Computing Director
tjg@ucsc.edu
831-459-5354
Baskin Engineering, Room 313A

To request BSOE IT support, please visit https://support.soe.ucsc.edu/
or send e-mail to help@soe.ucsc.edu.
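P.S. For context, the nightly replication described above has roughly the
following shape.  The primary-side pool, dataset, and snapshot names below
are placeholders, and the incremental flag is illustrative of the general
pattern rather than a statement of our exact invocation; only the receive
side ("/sbin/zfs receive -v -F backup/export") is quoted from the process
listing above:

    # on the primary: snapshot, then pipe the send to the backup host
    zfs snapshot tank/export@nightly-2017-06-08
    zfs send -v -i tank/export@nightly-2017-06-07 \
        tank/export@nightly-2017-06-08 | \
        ssh backup /sbin/zfs receive -v -F backup/export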
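P.P.S. A minimal sketch of what I can capture the next time the receive or
the scrub wedges, using the PIDs from the listings above (procstat needs to
run as root):

    # show the wait channel (MWCHAN column) of the stuck processes
    ps -l -p 1425
    ps -l -p 895

    # dump the kernel stacks of every thread in each stuck process
    procstat -kk 1425
    procstat -kk 895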