Date: Fri, 29 Oct 2010 11:34:41 -0700
From: Rumen Telbizov <telbizov@gmail.com>
To: Artem Belevich <fbsdlist@src.cx>
Cc: freebsd-stable@freebsd.org
Subject: Re: Degraded zpool cannot detach old/bad drive
Message-ID: <AANLkTimD_f1pZHy7cq4jA+SZwdQRmotndSiukpNvwi6Y@mail.gmail.com>
In-Reply-To: <AANLkTinc1yQrwVsf+k9LW5J50twbtcQ-d1SV_rny06Su@mail.gmail.com>
References: <AANLkTi=EWfVyZjKEYe=c0x6QvsdUcHGo2-iqGr4OaVG7@mail.gmail.com>
 <AANLkTinjfpnHGMvzJ5Ly8_WFXGvQmQ4D0-_TgbVBi=cf@mail.gmail.com>
 <AANLkTi=h6ZJtbRHeUOpKX17uOD5_XyYmu01ZTTCCKw=_@mail.gmail.com>
 <AANLkTikPqgoxuYp7D88Dp0t5LvjXQeO3mCXdFw6onEZN@mail.gmail.com>
 <AANLkTimMM82=rqMQQfZZYTcaM_CU+01xPeZUGAik8H3v@mail.gmail.com>
 <AANLkTinKpMLeJOd_V7uxyAFqcStoGwV9PfTJDLDPq3By@mail.gmail.com>
 <AANLkTiktrL7LHkh3HLGqZeZx7ve6arBrs8ZE57NwtfN1@mail.gmail.com>
 <AANLkTinc1yQrwVsf+k9LW5J50twbtcQ-d1SV_rny06Su@mail.gmail.com>
Hi Artem, everyone,

Thanks once again for your feedback and help. Here's more information.

# zpool export tank
# ls /dev/gpt
disk-e1:s10     disk-e1:s11     disk-e1:s12     disk-e1:s13     disk-e1:s14
disk-e1:s15     disk-e1:s16     disk-e1:s18     disk-e1:s19     disk-e1:s20
disk-e1:s21     disk-e1:s22     disk-e1:s23     disk-e1:s3      disk-e1:s4
disk-e1:s5      disk-e1:s6      disk-e1:s7      disk-e1:s8      disk-e1:s9
disk-e2:s0      disk-e2:s1      disk-e2:s10     disk-e2:s11     disk-e2:s2
disk-e2:s3      disk-e2:s4      disk-e2:s5      disk-e2:s6      disk-e2:s7
disk-e2:s8      disk-e2:s9      newdisk-e1:s17  newdisk-e1:s2

All the disks are here! Same for /dev/gptid/. Now importing the pool back
like you suggested:

# zpool import -d /dev/gpt
  pool: tank
    id: 13504509992978610301
 state: UNAVAIL
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-5E
config:

        tank                 UNAVAIL  insufficient replicas
          raidz1             ONLINE
            gpt/disk-e1:s10  ONLINE
            mfid9p1          ONLINE
            mfid10p1         ONLINE
            mfid11p1         ONLINE

It's missing a ton of drives. Setting kern.geom.label.gptid.enable=0 makes
no difference either.

And if I import it normally I get the same result as before: the pool is
imported OK but with most of the disks referred to as mfidXXX instead of
/dev/gpt/disk-XX, and here's what I have left:

# ls /dev/gpt
disk-e1:s10     disk-e1:s20     disk-e2:s0

The problem, I think, comes down to what is written in the zpool.cache
file. It stores the mfid path instead of the gpt/disk one:

        children[0]
                type='disk'
                id=0
                guid=1641394056824955485
                path='/dev/mfid33p1'
                phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1,0:a'
                whole_disk=0
                DTL=55

Compared to a disk from a partner server, which is fine:

        children[0]
                type='disk'
                id=0
                guid=5513814503830705577
                path='/dev/gpt/disk-e1:s6'
                whole_disk=0

I suspect OpenSolaris overwrote that part. So I wonder if there's a way to
actually edit the /boot/zfs/zpool.cache file, replace each path with the
corresponding /dev/gpt entry, and remove the phys_path one. I don't know
about the DTL entry. Is there a way to do this, and how stupid does that
idea sound to you? They should still point to the same data after all.
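For what it's worth, there is a less invasive variant I was considering
instead of hand-editing the cache: just get ZFS to regenerate it with the
label paths. This is only a sketch and I have not tried it; I'm assuming
the cache file is simply rewritten with whatever device paths are in use
at import time:

# zpool export tank
# mv /boot/zfs/zpool.cache /boot/zfs/zpool.cache.bak    # keep a backup, just in case
# zpool import -d /dev/gpt tank
# zpool status tank                                     # do disks show up as gpt/disk-* now?

Of course, given that "zpool import -d /dev/gpt" currently shows the pool
as UNAVAIL, this may well fail the same way.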
I cannot find a good zdb tutorial, so this is what I've got for now:

# zdb tank
version=14
name='tank'
state=0
txg=206266
pool_guid=13504509992978610301
hostid=409325918
hostname='XXXX'
vdev_tree
    type='root'
    id=0
    guid=13504509992978610301
    children[0]
        type='raidz'
        id=0
        guid=3740854890192825394
        nparity=1
        metaslab_array=33
        metaslab_shift=36
        ashift=9
        asize=7995163410432
        is_log=0
        children[0]
            type='disk'
            id=0
            guid=1641394056824955485
            path='/dev/mfid33p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1,0:a'
            whole_disk=0
            DTL=55
        children[1]
            type='disk'
            id=1
            guid=6047192237176807561
            path='/dev/mfid1p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@2,0:a'
            whole_disk=0
            DTL=250
        children[2]
            type='disk'
            id=2
            guid=9178318500891071208
            path='/dev/mfid2p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@3,0:a'
            whole_disk=0
            DTL=249
        children[3]
            type='disk'
            id=3
            guid=2567999855746767831
            path='/dev/mfid3p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@4,0:a'
            whole_disk=0
            DTL=248
    children[1]
        type='raidz'
        id=1
        guid=17097047310177793733
        nparity=1
        metaslab_array=31
        metaslab_shift=36
        ashift=9
        asize=7995163410432
        is_log=0
        children[0]
            type='disk'
            id=0
            guid=14513380297393196654
            path='/dev/mfid4p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@5,0:a'
            whole_disk=0
            DTL=266
        children[1]
            type='disk'
            id=1
            guid=7673391645329839273
            path='/dev/mfid5p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@6,0:a'
            whole_disk=0
            DTL=265
        children[2]
            type='disk'
            id=2
            guid=15189132305590412134
            path='/dev/mfid6p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@7,0:a'
            whole_disk=0
            DTL=264
        children[3]
            type='disk'
            id=3
            guid=17171875527714022076
            path='/dev/mfid7p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@8,0:a'
            whole_disk=0
            DTL=263
    children[2]
        type='raidz'
        id=2
        guid=4551002265962803186
        nparity=1
        metaslab_array=30
        metaslab_shift=36
        ashift=9
        asize=7995163410432
        is_log=0
        children[0]
            type='disk'
            id=0
            guid=12104241519484712161
            path='/dev/gpt/disk-e1:s10'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@9,0:a'
            whole_disk=0
            DTL=262
        children[1]
            type='disk'
            id=1
            guid=3950210349623142325
            path='/dev/mfid9p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@a,0:a'
            whole_disk=0
            DTL=261
        children[2]
            type='disk'
            id=2
            guid=14559903955698640085
            path='/dev/mfid10p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@b,0:a'
            whole_disk=0
            DTL=260
        children[3]
            type='disk'
            id=3
            guid=12364155114844220066
            path='/dev/mfid11p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@c,0:a'
            whole_disk=0
            DTL=259
    children[3]
        type='raidz'
        id=3
        guid=12517231224568010294
        nparity=1
        metaslab_array=29
        metaslab_shift=36
        ashift=9
        asize=7995163410432
        is_log=0
        children[0]
            type='disk'
            id=0
            guid=7655789038925330983
            path='/dev/mfid12p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@d,0:a'
            whole_disk=0
            DTL=258
        children[1]
            type='disk'
            id=1
            guid=17815755378968233141
            path='/dev/mfid13p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@e,0:a'
            whole_disk=0
            DTL=257
        children[2]
            type='disk'
            id=2
            guid=9590421681925673767
            path='/dev/mfid14p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@f,0:a'
            whole_disk=0
            DTL=256
        children[3]
            type='disk'
            id=3
            guid=13312724999073057440
            path='/dev/mfid34p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@10,0:a'
            whole_disk=0
            DTL=60
    children[4]
        type='raidz'
        id=4
        guid=7622366288306613136
        nparity=1
        metaslab_array=28
        metaslab_shift=36
        ashift=9
        asize=7995163410432
        is_log=0
        children[0]
            type='disk'
            id=0
            guid=11283483106921343963
            path='/dev/mfid15p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@11,0:a'
            whole_disk=0
            DTL=254
        children[1]
            type='disk'
            id=1
            guid=14900597968455968576
            path='/dev/mfid16p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@12,0:a'
            whole_disk=0
            DTL=253
        children[2]
            type='disk'
            id=2
            guid=4140592611852504513
            path='/dev/gpt/disk-e1:s20'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@13,0:a'
            whole_disk=0
            DTL=252
        children[3]
            type='disk'
            id=3
            guid=2794215380207576975
            path='/dev/mfid18p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@14,0:a'
            whole_disk=0
            DTL=251
    children[5]
        type='raidz'
        id=5
        guid=17655293908271300889
        nparity=1
        metaslab_array=27
        metaslab_shift=36
        ashift=9
        asize=7995163410432
        is_log=0
        children[0]
            type='disk'
            id=0
            guid=5274146379037055039
            path='/dev/mfid19p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@15,0:a'
            whole_disk=0
            DTL=278
        children[1]
            type='disk'
            id=1
            guid=8651755019404873686
            path='/dev/mfid20p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@16,0:a'
            whole_disk=0
            DTL=277
        children[2]
            type='disk'
            id=2
            guid=16827379661759988976
            path='/dev/gpt/disk-e2:s0'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@17,0:a'
            whole_disk=0
            DTL=276
        children[3]
            type='disk'
            id=3
            guid=2524967151333933972
            path='/dev/mfid22p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@18,0:a'
            whole_disk=0
            DTL=275
    children[6]
        type='raidz'
        id=6
        guid=2413519694016115220
        nparity=1
        metaslab_array=26
        metaslab_shift=36
        ashift=9
        asize=7995163410432
        is_log=0
        children[0]
            type='disk'
            id=0
            guid=16361968944335143412
            path='/dev/mfid23p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@19,0:a'
            whole_disk=0
            DTL=274
        children[1]
            type='disk'
            id=1
            guid=10054650477559530937
            path='/dev/mfid24p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1a,0:a'
            whole_disk=0
            DTL=273
        children[2]
            type='disk'
            id=2
            guid=17105959045159531558
            path='/dev/mfid25p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1b,0:a'
            whole_disk=0
            DTL=272
        children[3]
            type='disk'
            id=3
            guid=17370453969371497663
            path='/dev/mfid26p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1c,0:a'
            whole_disk=0
            DTL=271
    children[7]
        type='raidz'
        id=7
        guid=4614010953103453823
        nparity=1
        metaslab_array=24
        metaslab_shift=36
        ashift=9
        asize=7995163410432
        is_log=0
        children[0]
            type='disk'
            id=0
            guid=10090128057592036175
            path='/dev/mfid27p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1d,0:a'
            whole_disk=0
            DTL=270
        children[1]
            type='disk'
            id=1
            guid=16676544025008223925
            path='/dev/mfid28p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1e,0:a'
            whole_disk=0
            DTL=269
        children[2]
            type='disk'
            id=2
            guid=11777789246954957292
            path='/dev/mfid29p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1f,0:a'
            whole_disk=0
            DTL=268
        children[3]
            type='disk'
            id=3
            guid=3406600121427522915
            path='/dev/mfid30p1'
            phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@20,0:a'
            whole_disk=0
            DTL=267

Your help is highly appreciated.

Thank you very much,
Rumen Telbizov

On Fri, Oct 29, 2010 at 12:26 AM, Artem Belevich <fbsdlist@src.cx> wrote:

> On Thu, Oct 28, 2010 at 10:51 PM, Rumen Telbizov <telbizov@gmail.com> wrote:
> > Hi Artem, everyone,
> >
> > Thanks for your quick response. Unfortunately I already did try this
> > approach. Applying -d /dev/gpt only limits the pool to the bare three
> > remaining disks, which makes the pool completely unusable (no mfid
> > devices). Maybe those labels are removed shortly after they are
> > accessed during import?
>
> In one of the previous emails you've clearly listed many devices in
> /dev/gpt and said that they've disappeared after pool import.
> Did you do "zpool import -d /dev/gpt" while /dev/gpt entries were present?
>
> > What I don't understand is what exactly makes those gpt labels disappear
> > when the pool is imported, while otherwise they are just fine?!
>
> This is the way GEOM works. If something (ZFS in this case) uses the raw
> device, derived GEOM entities disappear.
>
> Try exporting the pool. Your /dev/gpt entries should be back. Now try
> to import with the -d option and see if it works.
>
> You may try bringing the labels back the hard way by detaching the raw
> drive and then re-attaching it via the label, but resilvering one
> drive at a time will take a while.
>
> --Artem

--
Rumen Telbizov
http://telbizov.com
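P.S. Artem, just to make sure I follow the "one drive at a time"
suggestion: since these are raidz1 vdevs rather than mirrors, I assume you
mean zpool replace rather than detach/attach? Something along these lines
for each disk, with a full resilver in between (device names below are
hypothetical, and I'm not certain ZFS will accept replacing a partition
with itself under its label name):

# zpool offline tank mfid1p1
# zpool replace tank mfid1p1 gpt/disk-e1:s4
# zpool status tank        # wait for the resilver to finish before the next disk

Is that roughly what you had in mind?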