Date: Tue, 01 Feb 2011 10:17:47 -0500
From: Mike Tancsa <mike@sentex.net>
To: Adam Vande More <amvandemore@gmail.com>, freebsd-fs@freebsd.org
Subject: Re: ZFS help! (solved)
Message-ID: <4D48241B.2040807@sentex.net>
In-Reply-To: <AANLkTi=3Betpki=uDkH7vc0jNOEOuT7R5pphCzUROH-O@mail.gmail.com>
References: <4D43475D.5050008@sentex.net> <4D44D775.50507@jrv.org>
 <4D470A65.4050000@sentex.net>
 <AANLkTi=Z=Onduz9uMuoRgJNXEUJeNKU%2BWw=Rgi8TP2tP@mail.gmail.com>
 <4D471729.3050804@sentex.net>
 <AANLkTi=3Betpki=uDkH7vc0jNOEOuT7R5pphCzUROH-O@mail.gmail.com>
On 1/31/2011 3:32 PM, Adam Vande More wrote:
> maybe the meta data stuff is stored above it in /tank1/? I don't know.
> I'm pretty sure you can use a newer version of ZFS to rewind the
> transaction groups until you get back to a good state, but there's
> probably a lot in this scenario that would prevent that from being a
> viable solution.  If you do get it resolved please post the resolution.

OK, to summarize what happened for the archives. This is RELENG_8 (from
the end of January), on AMD64 with 8G of RAM.

On my DR backup server, which holds backups of backups, I decided to
expand an existing pool. I added a new eSATA cage with an integrated
port multiplier:

2011-01-28.11:45:43 zpool add tank1 raidz /dev/ada0 /dev/ada1 /dev/ada2 /dev/ada3

0(offsite)# camcontrol devlist
<WDC WD1001FALS-00J7B1 05.00K05>   at scbus0 target 0 lun 0 (pass0,ada0)
<WDC WD1001FALS-00J7B1 05.00K05>   at scbus0 target 1 lun 0 (pass1,ada1)
<WDC WD1001FALS-00J7B1 05.00K05>   at scbus0 target 2 lun 0 (pass2,ada2)
<WDC WD1001FALS-00J7B1 05.00K05>   at scbus0 target 3 lun 0 (pass3,ada3)
<Port Multiplier 47261095 1f06>    at scbus0 target 15 lun 0 (pass4,pmp0)
<WDC WD2001FASS-00U0B0 01.00101>   at scbus1 target 0 lun 0 (pass5,ada4)
<WDC WD1501FASS-00W2B0 05.01D05>   at scbus1 target 1 lun 0 (pass6,ada5)
<WDC WD1501FASS-00W2B0 05.01D05>   at scbus1 target 2 lun 0 (pass7,ada6)
<WDC WD1501FASS-00W2B0 05.01D05>   at scbus1 target 3 lun 0 (pass8,ada7)
<WDC WD1501FASS-00W2B0 05.01D05>   at scbus1 target 4 lun 0 (pass9,ada8)
<Port Multiplier 47261095 1f06>    at scbus1 target 15 lun 0 (pass10,pmp1)
0(offsite)#

The controller is a Sil3134 (siis and ahci drivers).

Shortly after I brought the new set of drives online, the drive cage
failed and started presenting the drives in some odd way in which the
ZFS labels on them were no longer readable.
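[For the archives: the kind of label failure that shows up below can be
checked for across all members of a vdev with a small script. This is a
minimal sketch, not part of Mike's procedure; the helper just counts the
"failed to unpack" lines that zdb -l prints, and the device list in the
usage comment is assumed from the camcontrol listing above.]

```shell
#!/bin/sh
# Sketch only: count unreadable ZFS labels per device. A healthy device
# carries 4 labels; the failed cage here left all 4 unreadable on ada0.

# Count how many labels `zdb -l` could not unpack, reading the zdb
# output from stdin.
count_failed_labels() {
    grep -c 'failed to unpack label'
}

# Usage sketch (device names assumed from the camcontrol listing above):
# for dev in /dev/ada0 /dev/ada1 /dev/ada2 /dev/ada3; do
#     n=$(zdb -l "$dev" | count_failed_labels)
#     [ "$n" -gt 0 ] && echo "$dev: $n of 4 labels unreadable"
# done
```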
# zdb -l /dev/ada0
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3

# zpool status -v
  pool: tank1
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank1       UNAVAIL      0     0     0  insufficient replicas
          raidz1    ONLINE       0     0     0
            ad0     ONLINE       0     0     0
            ad1     ONLINE       0     0     0
            ad4     ONLINE       0     0     0
            ad6     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ada4    ONLINE       0     0     0
            ada5    ONLINE       0     0     0
            ada6    ONLINE       0     0     0
            ada7    ONLINE       0     0     0
          raidz1    UNAVAIL      0     0     0  insufficient replicas
            ada0    UNAVAIL      0     0     0  cannot open
            ada1    UNAVAIL      0     0     0  cannot open
            ada2    UNAVAIL      0     0     0  cannot open
            ada3    UNAVAIL      0     0     0  cannot open

Pulling the drives out and putting them in a new drive cage let me see
the file system as online, albeit with errors. The next step was to
delete the two problem files. On bootup it looked like this:

zpool status -v
  pool: tank1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank1       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad0     ONLINE       0     0     0
            ad1     ONLINE       0     0     0
            ad4     ONLINE       0     0     0
            ad6     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
            ada3    ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ada5    ONLINE       0     0     0
            ada8    ONLINE       0     0     0
            ada7    ONLINE       0     0     0
            ada6    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /tank1/argus-data/previous/argus-sites-radium.2011.01.28.16.00
        tank1/argus-data:<0xc6>
        /tank1/argus-data/argus-sites-radium

I killed those files via rm, after which zpool status -v showed:

errors: Permanent errors have been detected in the following files:

        tank1/argus-data:<0xc5>
        tank1/argus-data:<0xc6>
        tank1/argus-data:<0xc7>

So I started a scrub, and once it was done there were no errors and all
was clean!

0(offsite)# zpool status
  pool: tank1
 state: ONLINE
 scrub: scrub completed after 7h32m with 0 errors on Mon Jan 31 23:00:46 2011
config:

        NAME        STATE     READ WRITE CKSUM
        tank1       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad0     ONLINE       0     0     0
            ad1     ONLINE       0     0     0
            ad4     ONLINE       0     0     0
            ad6     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
            ada3    ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ada5    ONLINE       0     0     0
            ada8    ONLINE       0     0     0
            ada7    ONLINE       0     0     0
            ada6    ONLINE       0     0     0

errors: No known data errors
0(offsite)#

        ---Mike
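[For the archives: the rm-then-scrub cleanup above can be sketched as a
small script. This is a hedged illustration, not Mike's exact commands;
the parser assumes the `zpool status -v` output layout shown in this
thread, and assumes the error list runs to the end of that output.]

```shell
#!/bin/sh
# Sketch only: pull the "Permanent errors" entries out of zpool status -v
# so the plain-file entries can be inspected and removed before a scrub.
# Entries of the form tank1/argus-data:<0x..> refer to already-deleted
# objects; as seen above, the scrub clears those rather than rm.

# Print the entries listed after "Permanent errors have been detected",
# reading zpool status -v output from stdin.
list_permanent_errors() {
    awk '/Permanent errors have been detected/ { found = 1; next }
         found && NF { print $1 }'
}

# Usage sketch:
# zpool status -v tank1 | list_permanent_errors   # see what is affected
# rm /tank1/...                                   # remove plain-file entries
# zpool scrub tank1                               # then re-check until
# zpool status tank1                              #   'No known data errors'
```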