From owner-freebsd-fs@FreeBSD.ORG Sun Dec 7 10:32:46 2014
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
    (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by hub.freebsd.org (Postfix) with ESMTPS id 89B3BE62
    for ; Sun, 7 Dec 2014 10:32:46 +0000 (UTC)
Received: from hades.sorbs.net (hades.sorbs.net [67.231.146.201])
    by mx1.freebsd.org (Postfix) with ESMTP id 7605CF22
    for ; Sun, 7 Dec 2014 10:32:46 +0000 (UTC)
MIME-version: 1.0
Content-transfer-encoding: 7BIT
Content-type: text/plain; CHARSET=US-ASCII
Received: from isux.com (firewall.isux.com [213.165.190.213])
    by hades.sorbs.net (Oracle Communications Messaging Server 7.0.5.29.0 64bit (built Jul 9 2013))
    with ESMTPSA id <0NG700FBMK5XRQ00@hades.sorbs.net>
    for freebsd-fs@freebsd.org; Sun, 07 Dec 2014 02:37:10 -0800 (PST)
Message-id: <54842CC5.2020604@sorbs.net>
Date: Sun, 07 Dec 2014 11:32:37 +0100
From: Michelle Sullivan
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.24) Gecko/20100301 SeaMonkey/1.1.19
To: Will Andrews
Subject: Re: ZFS weird issue...
References: <54825E70.20900@sorbs.net>
In-reply-to:
Cc: "freebsd-fs@freebsd.org"
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 07 Dec 2014 10:32:46 -0000

Will Andrews wrote:
> On Fri, Dec 5, 2014 at 6:40 PM, Michelle Sullivan wrote:
>
>> Days later new drive to replace the dead drive arrived and was
>> inserted. System refused to re-add as there was data in the cache, so
>> rebooted and cleared the cache (as per many on web faq's) Reconfigured
>> it to match the others. Can't do a zpool replace mfid8 because that's
>> already in the pool... (was mfid9) can't use mfid15 because zpool
>> reports it's not part of the config... can't use the uniq-id it received
>> (can't find vdev) ...
>> HELP!! :)
>>
> [...]
>
>> root@colossus:~ # zpool status -v
>>
> [...]
>
>>   pool: sorbs
>>  state: DEGRADED
>> status: One or more devices could not be opened. Sufficient replicas
>> exist for
>>         the pool to continue functioning in a degraded state.
>> action: Attach the missing device and online it using 'zpool online'.
>>    see: http://illumos.org/msg/ZFS-8000-2Q
>>   scan: scrub in progress since Fri Dec 5 17:11:29 2014
>>         2.51T scanned out of 29.9T at 89.4M/s, 89h7m to go
>>         0 repaired, 8.40% done
>> config:
>>
>>     NAME              STATE     READ WRITE CKSUM
>>     sorbs             DEGRADED     0     0     0
>>       raidz2-0        DEGRADED     0     0     0
>>         mfid0         ONLINE       0     0     0
>>         mfid1         ONLINE       0     0     0
>>         mfid2         ONLINE       0     0     0
>>         mfid3         ONLINE       0     0     0
>>         mfid4         ONLINE       0     0     0
>>         mfid5         ONLINE       0     0     0
>>         mfid6         ONLINE       0     0     0
>>         mfid7         ONLINE       0     0     0
>>         spare-8       DEGRADED     0     0     0
>>           1702922605  UNAVAIL      0     0     0  was /dev/mfid8
>>           mfid14      ONLINE       0     0     0
>>         mfid8         ONLINE       0     0     0
>>         mfid9         ONLINE       0     0     0
>>         mfid10        ONLINE       0     0     0
>>         mfid11        ONLINE       0     0     0
>>         mfid12        ONLINE       0     0     0
>>         mfid13        ONLINE       0     0     0
>>     spares
>>       933862663       INUSE     was /dev/mfid14
>>
>> errors: No known data errors
>> root@colossus:~ # uname -a
>> FreeBSD colossus.sorbs.net 9.2-RELEASE FreeBSD 9.2-RELEASE #0 r255898:
>> Thu Sep 26 22:50:31 UTC 2013
>> root@bake.isc.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
>>
> [...]
>
>> root@colossus:~ # ls -l /dev/mfi*
>> crw-r-----  1 root  operator  0x22 Dec  5 17:18 /dev/mfi0
>> crw-r-----  1 root  operator  0x68 Dec  5 17:18 /dev/mfid0
>> crw-r-----  1 root  operator  0x69 Dec  5 17:18 /dev/mfid1
>> crw-r-----  1 root  operator  0x78 Dec  5 17:18 /dev/mfid10
>> crw-r-----  1 root  operator  0x79 Dec  5 17:18 /dev/mfid11
>> crw-r-----  1 root  operator  0x7a Dec  5 17:18 /dev/mfid12
>> crw-r-----  1 root  operator  0x82 Dec  5 17:18 /dev/mfid13
>> crw-r-----  1 root  operator  0x83 Dec  5 17:18 /dev/mfid14
>> crw-r-----  1 root  operator  0x84 Dec  5 17:18 /dev/mfid15
>> crw-r-----  1 root  operator  0x6a Dec  5 17:18 /dev/mfid2
>> crw-r-----  1 root  operator  0x6b Dec  5 17:18 /dev/mfid3
>> crw-r-----  1 root  operator  0x6c Dec  5 17:18 /dev/mfid4
>> crw-r-----  1 root  operator  0x6d Dec  5 17:18 /dev/mfid5
>> crw-r-----  1 root  operator  0x6e Dec  5 17:18 /dev/mfid6
>> crw-r-----  1 root  operator  0x75 Dec  5 17:18 /dev/mfid7
>> crw-r-----  1 root  operator  0x76 Dec  5 17:18 /dev/mfid8
>> crw-r-----  1 root  operator  0x77 Dec  5 17:18 /dev/mfid9
>> root@colossus:~ #
>>
>
> Hi,
>
> From the above it appears your replacement drive's current name is
> mfid15, and the spare is now mfid14.
>

No, I think LD8 was re-created but nothing was re-numbered... the
following seems to confirm that (if I'm reading it right.)

> What commands did you run that failed? Can you provide a copy of the
> first label from 'zdb -l /dev/mfid0'?
>

root@colossus:~ # zdb -l /dev/mfid0
--------------------------------------------
LABEL 0
--------------------------------------------
    version: 5000
    name: 'sorbs'
    state: 0
    txg: 979499
    pool_guid: 1038563320
    hostid: 339509314
    hostname: 'colossus.sorbs.net'
    top_guid: 386636424
    guid: 2060345993
    vdev_children: 1
    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 386636424
        nparity: 2
        metaslab_array: 33
        metaslab_shift: 38
        ashift: 9
        asize: 45000449064960
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 2060345993
            path: '/dev/mfid0'
            phys_path: '/dev/mfid0'
            whole_disk: 1
            DTL: 154
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 61296476
            path: '/dev/mfid1'
            phys_path: '/dev/mfid1'
            whole_disk: 1
            DTL: 153
            create_txg: 4
        children[2]:
            type: 'disk'
            id: 2
            guid: 1565205219
            path: '/dev/mfid2'
            phys_path: '/dev/mfid2'
            whole_disk: 1
            DTL: 152
            create_txg: 4
        children[3]:
            type: 'disk'
            id: 3
            guid: 1876923630
            path: '/dev/mfid3'
            phys_path: '/dev/mfid3'
            whole_disk: 1
            DTL: 151
            create_txg: 4
        children[4]:
            type: 'disk'
            id: 4
            guid: 1068158627
            path: '/dev/mfid4'
            phys_path: '/dev/mfid4'
            whole_disk: 1
            DTL: 150
            create_txg: 4
        children[5]:
            type: 'disk'
            id: 5
            guid: 1726238716
            path: '/dev/mfid5'
            phys_path: '/dev/mfid5'
            whole_disk: 1
            DTL: 149
            create_txg: 4
        children[6]:
            type: 'disk'
            id: 6
            guid: 390028842
            path: '/dev/mfid6'
            phys_path: '/dev/mfid6'
            whole_disk: 1
            DTL: 148
            create_txg: 4
        children[7]:
            type: 'disk'
            id: 7
            guid: 1094656850
            path: '/dev/mfid7'
            phys_path: '/dev/mfid7'
            whole_disk: 1
            DTL: 147
            create_txg: 4
        children[8]:
            type: 'spare'
            id: 8
            guid: 1773868765
            whole_disk: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 1702922605
                path: '/dev/mfid8'
                phys_path: '/dev/mfid8'
                whole_disk: 1
                DTL: 166
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 933862663
                path: '/dev/mfid14'
                phys_path: '/dev/mfid14'
                whole_disk: 1
                is_spare: 1
                DTL: 146
                create_txg: 4
                resilvering: 1
        children[9]:
            type: 'disk'
            id: 9
            guid: 1771170870
            path: '/dev/mfid8'
            phys_path: '/dev/mfid8'
            whole_disk: 1
            DTL: 145
            create_txg: 4
        children[10]:
            type: 'disk'
            id: 10
            guid: 1797981023
            path: '/dev/mfid9'
            phys_path: '/dev/mfid9'
            whole_disk: 1
            DTL: 144
            create_txg: 4
        children[11]:
            type: 'disk'
            id: 11
            guid: 1424656624
            path: '/dev/mfid10'
            phys_path: '/dev/mfid10'
            whole_disk: 1
            DTL: 143
            create_txg: 4
        children[12]:
            type: 'disk'
            id: 12
            guid: 1908699165
            path: '/dev/mfid11'
            phys_path: '/dev/mfid11'
            whole_disk: 1
            DTL: 142
            create_txg: 4
        children[13]:
            type: 'disk'
            id: 13
            guid: 396147269
            path: '/dev/mfid12'
            phys_path: '/dev/mfid12'
            whole_disk: 1
            DTL: 141
            create_txg: 4
        children[14]:
            type: 'disk'
            id: 14
            guid: 847844383
            path: '/dev/mfid13'
            phys_path: '/dev/mfid13'
            whole_disk: 1
            DTL: 140
            create_txg: 4
    features_for_read:

> The label will provide you with the full vdev guid that you need to
> replace the original drive with a new one.
>
> Another thing you could do is wait for the spare to finish
> resilvering, then promote it to replace the original drive, and make
> your new one a spare. Considering the time required to resilver this
> pool configuration, that may be preferable for you.
>
> --Will.
>

2 physical paths of mfid8 ... that can't be good... can't seem to use guids.

Michelle

--
Michelle Sullivan
http://www.mhix.org/
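
The "2 physical paths of mfid8" observation can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the `zdb -l` output has been saved as text with its original line breaks. The helper names `disk_vdevs` and `shared_paths` are hypothetical and not part of any ZFS tool, and `SAMPLE` is only an excerpt of the label quoted in this message.

```python
from collections import defaultdict


def disk_vdevs(label_text):
    """Return (guid, path) pairs for every vdev entry that carries a path.

    Assumes each vdev lists its 'guid:' line before its 'path:' line,
    as in the label quoted in this message.
    """
    pairs = []
    last_guid = None
    for raw in label_text.splitlines():
        line = raw.strip()
        # 'pool_guid:' and 'top_guid:' do not start with 'guid:', so they are skipped.
        if line.startswith("guid:"):
            last_guid = int(line.split(":", 1)[1])
        # 'phys_path:' does not start with 'path:', so only the logical path is used.
        elif line.startswith("path:") and last_guid is not None:
            path = line.split(":", 1)[1].strip().strip("'")
            pairs.append((last_guid, path))
            last_guid = None
    return pairs


def shared_paths(pairs):
    """Return {path: [guid, ...]} for paths recorded under more than one guid."""
    by_path = defaultdict(list)
    for guid, path in pairs:
        by_path[path].append(guid)
    return {path: guids for path, guids in by_path.items() if len(guids) > 1}


# Excerpt of the label from this message (children[8] and children[9] only).
SAMPLE = """\
        children[8]:
            type: 'spare'
            id: 8
            guid: 1773868765
            whole_disk: 0
            children[0]:
                type: 'disk'
                id: 0
                guid: 1702922605
                path: '/dev/mfid8'
                phys_path: '/dev/mfid8'
            children[1]:
                type: 'disk'
                id: 1
                guid: 933862663
                path: '/dev/mfid14'
                phys_path: '/dev/mfid14'
        children[9]:
            type: 'disk'
            id: 9
            guid: 1771170870
            path: '/dev/mfid8'
            phys_path: '/dev/mfid8'
"""

if __name__ == "__main__":
    for path, guids in shared_paths(disk_vdevs(SAMPLE)).items():
        print(f"{path} is recorded for guids {guids}")
```

On the excerpt this reports /dev/mfid8 under two different guids: the UNAVAIL member of spare-8 and the ONLINE disk at children[9]. It only reads what the label records. It does not say which entry the controller currently presents at that device node, and it changes nothing in the pool.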