Date: Wed, 18 Oct 2017 10:48:07 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 223085] ZFS Resilver not completing - stuck at 99%
Message-ID: <bug-223085-8@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=223085

            Bug ID: 223085
           Summary: ZFS Resilver not completing - stuck at 99%
           Product: Base System
           Version: 10.2-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: paul@vsl-net.com

I have a number of FreeBSD systems with large (30TB) ZFS pools. I have had
several disks fail over time and have seen problems with resilvers either not
completing, or getting to 99% within a week but then taking a further month to
complete. I have been seeking advice in the forums:
https://forums.freebsd.org/threads/61643/#post-355088

A system that had a disk replaced some time ago is in this state:

  pool: s11d34
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Sep 14 15:08:15 2017
        49.4T scanned out of 49.8T at 17.7M/s, 6h13m to go
        4.93T resilvered, 99.24% done
config:

        NAME                             STATE     READ WRITE CKSUM
        s11d34                           DEGRADED     0     0     0
          raidz2-0                       ONLINE       0     0     0
            multipath/J11F18-1EJB8KUJ    ONLINE       0     0     0
            multipath/J11R01-1EJ2XT4F    ONLINE       0     0     0
            multipath/J11R02-1EHZE2GF    ONLINE       0     0     0
            multipath/J11R03-1EJ2XTMF    ONLINE       0     0     0
            multipath/J11R04-1EJ3NK4J    ONLINE       0     0     0
          raidz2-1                       DEGRADED     0     0     0
            multipath/J11R05-1EJ2Z8AF    ONLINE       0     0     0
            multipath/J11R06-1EJ2Z8NF    ONLINE       0     0     0
            replacing-2                  OFFLINE      0     0     0
              7444569586532474759        OFFLINE      0     0     0  was /dev/multipath/J11R07-1EJ03GXJ
              multipath/J11F23-1EJ3AJBJ  ONLINE       0     0     0  (resilvering)
            multipath/J11R08-1EJ3A0HJ    ONLINE       0     0     0
            multipath/J11R09-1EJ32UPJ    ONLINE       0     0     0

It got to 99.24% within a week but has been stuck there since. I have stopped
ALL access to the pool and run zpool iostat, and there is still activity
(although low, e.g. 1.2M read, 1.78M write), so it does appear to be doing
something.

The disks (6TB or 8TB HGST SAS) are attached via an LSI 9207-8e HBA which is
connected to an LSI 6160 SAS switch that is connected to a Supermicro JBOD.
The HBAs have 2 connectors, each connected to a different SAS switch. The
system sees each disk twice as expected; I use gmultipath to label the disks
and set them to Active/Passive mode, then use the multipath name during zpool
create, e.g.

root@freebsd04:~ # gmultipath status
                       Name    Status  Components
  multipath/J11R00-1EJ2XR5F   OPTIMAL  da0 (ACTIVE)
                                       da11 (PASSIVE)
  multipath/J11R01-1EJ2XT4F   OPTIMAL  da1 (ACTIVE)
                                       da12 (PASSIVE)
  multipath/J11R02-1EHZE2GF   OPTIMAL  da2 (ACTIVE)
                                       da13 (PASSIVE)

zpool create -f store43 raidz2 multipath/J11R00-1EJ2XR5F multipath/J11R01-1EJ2XT4F etc.......

Any advice on whether this is a bug or something wrong with my setup?

Thanks
Paul

--
You are receiving this mail because:
You are the assignee for the bug.
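
A minimal sketch of the checks that can show whether a resilver like this is
still making progress, assuming the pool name s11d34 from the report above and
the legacy scan tunables shipped with FreeBSD 10.x (the sysctl names are an
assumption if your release differs):

    # per-vdev throughput every 5 seconds while the resilver runs
    zpool iostat -v s11d34 5

    # detailed pool state, including which leaf vdev is still resilvering
    zpool status -v s11d34

    # legacy scan throttles that slow a resilver when the pool sees other I/O
    sysctl vfs.zfs.resilver_delay
    sysctl vfs.zfs.resilver_min_time_ms
    sysctl vfs.zfs.scan_idle

If the scanned figure in zpool status keeps rising, the resilver is
progressing, only slowly; if it stops changing entirely, that points at a
stall rather than throttling.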
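For reference, a minimal sketch of how an Active/Passive multipath label like
the ones above is typically created before the pool is built; the label
J11R00-1EJ2XR5F and the providers da0 and da11 are taken from the gmultipath
status output in the report, and Active/Passive is gmultipath's default mode
when labelling:

    # write multipath metadata covering both paths to the same physical disk
    gmultipath label J11R00-1EJ2XR5F da0 da11

    # confirm that both providers were attached to the new multipath device
    gmultipath status

The label is stored in the disk's last sector, so both paths are reassembled
automatically at boot and the stable multipath/ name can then be handed to
zpool create as shown above.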
