Date:      Tue, 13 Mar 2012 19:53:24 +0100
From:      Peter Maloney <peter.maloney@brockmann-consult.de>
To:        freebsd-fs@freebsd.org
Subject:   Re: ZFS file corruption problem
Message-ID:  <4F5F97A4.6070000@brockmann-consult.de>
In-Reply-To: <4F5F7116.3020400@intellasoft.net>
References:  <4F5F7116.3020400@intellasoft.net>

On 13.03.2012 17:08, Mark Murawski wrote:
> So I have this zpool with corrupted files running on freebsd 9-release
> amd64.  The corrupted files can go away, that's not a big deal
>
> Here's the problem.
>
> $ ls -al /storage/zfs/0-Pics/2012-03-01-peterskill/155CANON/IMG_5576.CR2
> <infinite wait ensues>
Since this is one of the corrupt files, I guess ZFS is blocking until it
can return a good copy (for example, if you put the old mirror disk back
in). To fix this, you need to remove the file or restore it from backup
(or add that mirror disk back in, which I will assume you can't):

rm /storage/zfs/0-Pics/2012-03-01-peterskill/155CANON/IMG_5576.CR2
(If the file should exist but be empty rather than be removed, e.g. a
log file whose writer does not have write permission on the directory,
also run touch on it afterwards.)

or maybe this works:

mv /somewhere_with_backup/IMG_5576.CR2 \
    /storage/zfs/0-Pics/2012-03-01-peterskill/155CANON/IMG_5576.CR2


And if there are more errors, you will probably need to run a scrub
before the pool can expand or before "zpool clear" will work.
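
With around 500 damaged files listed, removing them one at a time gets
tedious. As a rough sketch only: the helper below pulls the file paths
out of "zpool status -v" output so they can be reviewed before anything
is deleted. The function name list_corrupt_files and the /tmp/corrupt.txt
path are made up for illustration, and the pool name zstorage is taken
from the status output quoted further down. It assumes every affected
entry is printed as an absolute path on its own line; entries shown in
other forms (such as object IDs) are skipped.

```sh
#!/bin/sh
# Hypothetical helper: print the paths that "zpool status -v" lists after
# the "Permanent errors have been detected" line. It reads the status
# text on stdin, so it can be tried on saved output first.
list_corrupt_files() {
    awk '
        /Permanent errors have been detected/ { in_list = 1; next }
        in_list && /^[ \t]*\// {
            sub(/^[ \t]+/, "")   # strip the leading indentation
            print                # whole line, so spaces in names survive
        }
    '
}

# Possible usage (not run here; check the list before deleting anything):
#   zpool status -v zstorage | list_corrupt_files > /tmp/corrupt.txt
#   while IFS= read -r f; do rm -- "$f"; done < /tmp/corrupt.txt
#   zpool scrub zstorage     # re-check the pool
#   zpool clear zstorage     # reset the error counters afterwards
```

Whether rm itself returns on this pool is a separate question; this only
saves typing if it does.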

> <pool is now entirely unusable, all file access results in an infinite
> block>
>
> No errors in dmesg, the process is now stuck in the D state, and is
> also unkillable.  A clean shutdown is also not possible as trying to
> kill processes using the pool and access the pool to unmount it will
> block.
>
> What's a good starting point to resolve this problem?
>
> Also... note that this was happening even before I started playing
> with the guids to try and get the pool back up.
>
> Here's the scenario that led up to the problem:
>
> mirror-1 was consisting of a 120gig drive and an 80 gig drive
> I shut down, physically replaced the 80 with a 500, zpool attached it
> to /dev/ada2 as a mirror, and let zpool resilver.
>
> Resilver completed, I physically replaced the 120 with a 500, zpool
> attached it to the new 500 and waited for a resilver.
>
> Due to bugs in the promise sata300 tx4 drivers, the resilver started
> having problems, one of the 500's dropped out of the pool with
>
> ata2: timeout waiting to issue command
> ata2: error issuing ATA_IDENTIFY command
> ata2: SIGNATURE: ffffffff
> ata2: timeout waiting to issue command
> ata2: error issuing ATA_IDENTIFY command
> ata2: SIGNATURE: ffffffff
>
> Upon reboot I now had corrupted files.  The pool auto expanded and now
> I can't re attach the 80 or 120 to recover corrupted files.
>
> So the main problem is the total usability of the pool when hitting a
> corrupted file.
>
>
>   pool: zstorage
>  state: DEGRADED
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scan: resilvered 22.5G in 0h47m with 29682 errors on Mon Mar 12 02:20:56 2012
> config:
>
>         NAME                      STATE     READ WRITE CKSUM
>         zstorage                  DEGRADED 29.0K     0     0
>           mirror-0                ONLINE       0     0     0
>             ada4                  ONLINE       0     0     0
>             ada3                  ONLINE       0     0     0
>           mirror-1                DEGRADED 58.0K     0     0
>             17331410140726386358  UNAVAIL      0     0     0  was /dev/ada1s4
>             ada2                  ONLINE       0     0 58.0K
>           mirror-2                ONLINE       0     0     0
>             ada5                  ONLINE       0     0     0
>             ada10                 ONLINE       0     0     0
>           mirror-3                DEGRADED     0     0     0
>             14693115181240286208  REMOVED      0     0     0  was /dev/ada6
>             ada8                  ONLINE       0     0     0
>           mirror-4                DEGRADED     0     0     0
>             ada7                  ONLINE       0     0     0
>             83782446513674500     REMOVED      0     0     0  was /dev/ada9
>
>
> errors: Permanent errors have been detected in the following files:
>
>         /storage/zfs/0-Pics/2012-03-01-peterskill/155CANON/IMG_5576.CR2
>         /storage/zfs/Johns Stuff/gallery/._IMG_1225.psd
>         /storage/zfs/Johns Stuff/gallery/._IMG_1226.psd
>         /storage/zfs/Johns Stuff/gallery/._IMG_1243.psd
>         /storage/zfs/Johns Stuff/gallery/._a.jpg
>         ...etc, and 500 more



