Date: Thu, 4 Oct 2018 17:32:46 +0200
From: Jean-Marc LACROIX <Jean-Marc.Lacroix@unice.fr>
To: freebsd-fs@freebsd.org
Subject: Problem with ZFSD when replacing a failed disk
Message-ID: <96665eb8-9ff6-9050-bd7d-7f9f1d3fe737@unice.fr>
Hello,

We have encountered a problem on our storage system, which consists of
one Dell R630 with three SAS-connected MD1420 JBODs.

Our system is: 11.0-RELEASE-p9
We use ZFSD to manage spare disks.

The problem appears when we have to replace a failed disk, as explained
below. We don't understand why the second spare is activated when the
replace command is run.

Thanks in advance for your help.

Regards,
JM

root@math12:/ # zpool status
  pool: zpool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: resilvered 238G in 11h3m with 0 errors on Sat Sep 29 19:47:28 2018
config:

        NAME                              STATE     READ WRITE CKSUM
        zpool                             DEGRADED     0     0     0
          raidz2-0                        ONLINE       0     0     0
            label/e0s0                    ONLINE       0     0     0
            label/e1s0                    ONLINE       0     0     0
            label/e2s0                    ONLINE       0     0     0
            label/e0s1                    ONLINE       0     0     0
            label/e1s1                    ONLINE       0     0     0
            label/e2s1                    ONLINE       0     0     0
          raidz2-1                        ONLINE       0     0     0
            label/e0s2                    ONLINE       0     0     0
            label/e1s2                    ONLINE       0     0     0
            label/e2s2                    ONLINE       0     0     0
            label/e0s3                    ONLINE       0     0     0
            label/e1s3                    ONLINE       0     0     0
            label/e2s3                    ONLINE       0     0     0
          raidz2-2                        ONLINE       0     0     0
            label/e0s4                    ONLINE       0     0     0
            label/e1s4                    ONLINE       0     0     0
            label/e2s4                    ONLINE       0     0     0
            label/e0s5                    ONLINE       0     0     0
            label/e1s5                    ONLINE       0     0     0
            label/e2s5                    ONLINE       0     0     0
          raidz2-3                        ONLINE       0     0     0
            label/e0s6                    ONLINE       0     0     0
            label/e1s6                    ONLINE       0     0     0
            label/e2s6                    ONLINE       0     0     0
            label/e0s7                    ONLINE       0     0     0
            label/e1s7                    ONLINE       0     0     0
            label/e2s7                    ONLINE       0     0     0
          raidz2-4                        ONLINE       0     0     0
            label/e0s8                    ONLINE       0     0     0
            label/e1s8                    ONLINE       0     0     0
            label/e2s8                    ONLINE       0     0     0
            label/e0s9                    ONLINE       0     0     0
            label/e1s9                    ONLINE       0     0     0
            label/e2s9                    ONLINE       0     0     0
          raidz2-5                        DEGRADED     0     0     0
            label/e0s10                   ONLINE       0     0     0
            label/e1s10                   ONLINE       0     0     0
            label/e2s10                   ONLINE       0     0     0
            label/e0s11                   ONLINE       0     0     0
            spare-4                       UNAVAIL      0     0     0
              9796387366129075446         UNAVAIL      0     0     0  was /dev/label/e1s11
              label/spare0                ONLINE       0     0     0
            label/e2s11                   ONLINE       0     0     0
          raidz2-6                        ONLINE       0     0     0
            label/e0s12                   ONLINE       0     0     0
            label/e1s12                   ONLINE       0     0     0
            label/e2s12                   ONLINE       0     0     0
            label/e0s13                   ONLINE       0     0     0
            label/e1s13                   ONLINE       0     0     0
            label/e2s13                   ONLINE       0     0     0
          raidz2-7                        ONLINE       0     0     0
            label/e0s14                   ONLINE       0     0     0
            label/e1s14                   ONLINE       0     0     0
            label/e2s14                   ONLINE       0     0     0
            label/e0s15                   ONLINE       0     0     0
            label/e1s15                   ONLINE       0     0     0
            label/e2s15                   ONLINE       0     0     0
          raidz2-8                        ONLINE       0     0     0
            label/e0s16                   ONLINE       0     0     0
            label/e1s16                   ONLINE       0     0     0
            label/e2s16                   ONLINE       0     0     0
            label/e0s17                   ONLINE       0     0     0
            label/e1s17                   ONLINE       0     0     0
            label/e2s17                   ONLINE       0     0     0
        logs
          mirror-9                        ONLINE       0     0     0
            da2                           ONLINE       0     0     0
            da3                           ONLINE       0     0     0
        spares
          12553822586401141982            INUSE     was /dev/label/spare0
          label/spare1                    AVAIL

errors: No known data errors

===========================================================================

STEPS DONE:

1) unplug the blinking UNAVAIL disk
2) plug the new disk in the free slot
   => the system console shows the new device name:

ugen0.4: <Avocent> at usbus0
umass0: <SCSI Transparent Interface 0> on usbus0
umass0: SCSI over Bulk-Only; quirks = 0x4100
umass0:16:0: Attached to scbus16
da59 at umass-sim0 bus 0 scbus16 target 0 lun 0
da59: <iDRAC DRACRW 0329> Removable Direct Access SCSI device
da59: 40.000MB/s transfers
da59: 308MB (630784 512 byte sectors)
da59: quirks=0x2<NO_6_BYTE>
ugen0.4: <Avocent> at usbus0 (disconnected)
umass0: at uhub4, port 2, addr 4 (disconnected)
da59 at umass-sim0 bus 0 scbus16 target 0 lun 0
da59: <iDRAC DRACRW 0329> detached
(da59:umass-sim0:0:0:0): Periph destroyed
(da59:mrsas1:1:61:0): UNMAPPED
da59 at mrsas1 bus 1 scbus3 target 61 lun 0
da59: <SEAGATE ST91000640SS AS0B> Fixed Direct Access SPC-4 SCSI device
da59: Serial Number 9XG9RH37
da59: 150.000MB/s transfers
da59: 953869MB (1953525168 512 byte sectors)

glabel label e1s11 /dev/da59
glabel status da59
=> the disk is correctly labeled

zpool replace zpool 9796387366129075446 label/e1s11
zpool status
=> PROBLEM DESCRIBED BELOW

==========================================================================

AFTER the replace command, we can see that the second hot spare has been
activated, as follows:

root@math12:/ # zpool status
  pool: zpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Oct 4 15:39:49 2018
        214G scanned out of 12.6T at 191M/s, 18h49m to go
        7.89G resilvered, 1.66% done
config:

        NAME                              STATE     READ WRITE CKSUM
        zpool                             DEGRADED     0     0     0
          raidz2-0                        ONLINE       0     0     0
            label/e0s0                    ONLINE       0     0     0
            label/e1s0                    ONLINE       0     0     0
            label/e2s0                    ONLINE       0     0     0
            label/e0s1                    ONLINE       0     0     0
            label/e1s1                    ONLINE       0     0     0
            label/e2s1                    ONLINE       0     0     0
          raidz2-1                        ONLINE       0     0     0
            label/e0s2                    ONLINE       0     0     0
            label/e1s2                    ONLINE       0     0     0
            label/e2s2                    ONLINE       0     0     0
            label/e0s3                    ONLINE       0     0     0
            label/e1s3                    ONLINE       0     0     0
            label/e2s3                    ONLINE       0     0     0
          raidz2-2                        ONLINE       0     0     0
            label/e0s4                    ONLINE       0     0     0
            label/e1s4                    ONLINE       0     0     0
            label/e2s4                    ONLINE       0     0     0
            label/e0s5                    ONLINE       0     0     0
            label/e1s5                    ONLINE       0     0     0
            label/e2s5                    ONLINE       0     0     0
          raidz2-3                        ONLINE       0     0     0
            label/e0s6                    ONLINE       0     0     0
            label/e1s6                    ONLINE       0     0     0
            label/e2s6                    ONLINE       0     0     0
            label/e0s7                    ONLINE       0     0     0
            label/e1s7                    ONLINE       0     0     0
            label/e2s7                    ONLINE       0     0     0
          raidz2-4                        ONLINE       0     0     0
            label/e0s8                    ONLINE       0     0     0
            label/e1s8                    ONLINE       0     0     0
            label/e2s8                    ONLINE       0     0     0
            label/e0s9                    ONLINE       0     0     0
            label/e1s9                    ONLINE       0     0     0
            label/e2s9                    ONLINE       0     0     0
          raidz2-5                        DEGRADED     0     0     0
            label/e0s10                   ONLINE       0     0     0
            label/e1s10                   ONLINE       0     0     0
            label/e2s10                   ONLINE       0     0     0
            label/e0s11                   ONLINE       0     0     0
            spare-4                       UNAVAIL      0     0     0
              replacing-0                 UNAVAIL      0     0     0
                spare-0                   UNAVAIL      0     0     0
                  9796387366129075446     UNAVAIL      0     0     0  was /dev/label/e1s11/old
                  label/spare1            ONLINE       0     0     0  (resilvering)
                label/e1s11               ONLINE       0     0     0  (resilvering)
              label/spare0                ONLINE       0     0     0
            label/e2s11                   ONLINE       0     0     0
          raidz2-6                        ONLINE       0     0     0
            label/e0s12                   ONLINE       0     0     0
            label/e1s12                   ONLINE       0     0     0
            label/e2s12                   ONLINE       0     0     0
            label/e0s13                   ONLINE       0     0     0
            label/e1s13                   ONLINE       0     0     0
            label/e2s13                   ONLINE       0     0     0
          raidz2-7                        ONLINE       0     0     0
            label/e0s14                   ONLINE       0     0     0
            label/e1s14                   ONLINE       0     0     0
            label/e2s14                   ONLINE       0     0     0
            label/e0s15                   ONLINE       0     0     0
            label/e1s15                   ONLINE       0     0     0
            label/e2s15                   ONLINE       0     0     0
          raidz2-8                        ONLINE       0     0     0
            label/e0s16                   ONLINE       0     0     0
            label/e1s16                   ONLINE       0     0     0
            label/e2s16                   ONLINE       0     0     0
            label/e0s17                   ONLINE       0     0     0
            label/e1s17                   ONLINE       0     0     0
            label/e2s17                   ONLINE       0     0     0
        logs
          mirror-9                        ONLINE       0     0     0
            da2                           ONLINE       0     0     0
            da3                           ONLINE       0     0     0
        spares
          12553822586401141982            INUSE     was /dev/label/spare0
          15637882846021217179            INUSE     was /dev/label/spare1

errors: No known data errors
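
For anyone who wants to check the symptom from a script, here is a minimal Python sketch, not part of the original report, that reads the "spares" section of `zpool status` text output and returns each spare's state. The function name `spare_states` and the `sample` string are hypothetical. It assumes the usual multi-line layout, where the `spares` header is followed by one device per line and the section ends at a blank line or at the `errors:` line.

```python
def spare_states(status_text):
    """Return {device_or_guid: state} for the 'spares' section of
    `zpool status` text output (assumed layout: one spare per line)."""
    states = {}
    in_spares = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped == "spares":
            # Start of the spares section.
            in_spares = True
            continue
        if in_spares:
            # A blank line or the trailing 'errors:' line ends the section.
            if not stripped or stripped.startswith("errors:"):
                break
            fields = stripped.split()
            if len(fields) >= 2:
                # First field is the device name or GUID, second is the state.
                states[fields[0]] = fields[1]
    return states


# Sample taken from the first listing in this message (before the replace).
sample = """\
        spares
          12553822586401141982  INUSE     was /dev/label/spare0
          label/spare1          AVAIL

errors: No known data errors
"""

if __name__ == "__main__":
    for device, state in spare_states(sample).items():
        print(device, state)
```

Run against the listing taken after the replace, both spares would be reported as INUSE, which is the unexpected state described in this message.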