Date: Thu, 8 Jun 2017 15:57:43 -0700
From: Xin LI <delphij@gmail.com>
To: Tim Gustafson <tjg@ucsc.edu>
Cc: freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: ZFS Commands In "D" State
Message-ID: <CAGMYy3vUsfzTYknt8yENzP8xqR0V4wkNfSDmtuS65U=yL1trzQ@mail.gmail.com>
In-Reply-To: <CAPyBAS6Du=059Pef=wUC9D61mUEd+JXBxt6ydAdjypVYJptnvQ@mail.gmail.com>
References: <CAPyBAS6Du=059Pef=wUC9D61mUEd+JXBxt6ydAdjypVYJptnvQ@mail.gmail.com>
procstat -kk 1425 ?  (or whatever PID is stuck in the "D" state)

On Thu, Jun 8, 2017 at 2:13 PM, Tim Gustafson <tjg@ucsc.edu> wrote:
> We have a ZFS server that we've been running for a few months now.
> The server is a backup server that receives ZFS sends from its
> primary daily.  This mechanism has worked for us on several pairs of
> servers for years in general, and on this particular piece of
> hardware for several months.
>
> A few days ago, our nightly ZFS send failed.  When I looked at the
> server, I saw that the "zfs receive" command was in a "D" wait state:
>
> 1425  -  D     0:02.75 /sbin/zfs receive -v -F backup/export
>
> I rebooted the system, checked that "zpool status" and "zfs list"
> both came back correctly (which they did), and then re-started the
> "zfs send" on the master server.  At first, the "zfs receive"
> command did not enter the "D" state, but once the master server
> started sending actual data (which I could see because I was running
> "zfs send" with the -v option), the receiving process entered the
> "D" state again and another reboot was required.  Only about 2MB of
> data got sent before this happened.
>
> I've rebooted several times, always with the same result.  I did a
> "zpool scrub os" (there's a separate zpool for the OS to live on)
> and that completed in a few minutes, but when I did a "zpool scrub
> backup", that process immediately went into the "D+" state:
>
> 895  0  D+    0:00.04 zpool scrub backup
>
> We run smartd on this device, and it is showing no disk errors.  The
> devd process is logging some stuff, but it doesn't appear to be very
> helpful:
>
> Jun 8 13:52:49 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=11754027336427262018
> Jun 8 13:52:49 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=11367786800631979308
> Jun 8 13:52:49 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=18407069648425063426
> Jun 8 13:52:49 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=9496839124651172990
> Jun 8 13:52:49 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=332784898986906736
> Jun 8 13:52:50 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=16384086680948393578
> Jun 8 13:52:50 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=10762348983543761591
> Jun 8 13:52:50 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=8585274278710252761
> Jun 8 13:52:50 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=17456777842286400332
> Jun 8 13:52:50 backup ZFS: vdev state changed, pool_guid=2176924632732322522 vdev_guid=10533897485373019500
>
> No word on which state it changed "from" or "to".  Also, the system
> only has three vdevs (the OS one, and the two raidz2 vdevs that make
> up the "backup" pool), so I'm not sure how it's coming up with more
> than three vdev GUIDs.
>
> What's my next step in diagnosing this?
>
> --
>
> Tim Gustafson
> BSOE Computing Director
> tjg@ucsc.edu
> 831-459-5354
> Baskin Engineering, Room 313A
>
> To request BSOE IT support, please visit https://support.soe.ucsc.edu/
> or send e-mail to help@soe.ucsc.edu.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
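
A minimal sketch of the diagnostic suggested above, assuming the hung
processes are again "zfs receive" and "zpool scrub" (the PIDs 1425 and
895 shown earlier will differ after a reboot, so pgrep is used to find
them).  procstat -kk prints each thread's kernel stack, which shows
where in the kernel a "D"-state process is sleeping:

    # Dump kernel stacks for every zfs/zpool process.  -d ' ' makes
    # pgrep emit a space-separated PID list that procstat accepts.
    procstat -kk $(pgrep -d ' ' 'zfs|zpool')

Stacks that end in functions such as zio_wait or txg_wait_synced often
indicate the process is waiting on pool I/O rather than spinning on a
lock, which helps narrow down whether the problem is a device or the
pool itself.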