FreeBSD Mail Archives

Date:      Tue, 13 Mar 2012 12:08:54 -0400
From:      Mark Murawski <markm-lists@intellasoft.net>
To:        freebsd-fs@freebsd.org
Subject:   ZFS file corruption problem
Message-ID:  <4F5F7116.3020400@intellasoft.net>



So I have this zpool with corrupted files running on FreeBSD 9-RELEASE 
amd64.  The corrupted files can go away; that's not a big deal.

Here's the problem.

$ ls -al /storage/zfs/0-Pics/2012-03-01-peterskill/155CANON/IMG_5576.CR2
<infinite wait ensues>
<pool is now entirely unusable, all file access results in an infinite 
block>

There are no errors in dmesg, and the process is now stuck in the D state 
and is unkillable.  A clean shutdown is also not possible, since both 
killing the processes that use the pool and accessing the pool to unmount 
it will block.

What's a good starting point to resolve this problem?

Also, note that this was happening even before I started playing with 
the GUIDs to try to get the pool back up.

Here's the scenario that led up to the problem:

mirror-1 consisted of a 120 GB drive and an 80 GB drive.
I shut down, physically replaced the 80 with a 500, ran zpool attach to 
attach it to /dev/ada2 as a mirror, and let the pool resilver.

Once the resilver completed, I physically replaced the 120 with a 500, ran 
zpool attach to attach it to the new 500, and waited for a resilver.
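
For reference, each of the two swaps above comes down to one zpool attach 
followed by watching zpool status until the resilver finishes.  Below is a 
minimal dry-run sketch in Python that only builds those command lines and 
runs nothing.  The function name attach_plan and the device names OLD_DEV 
and NEW_DEV are placeholders for illustration, not the exact arguments 
that were typed.

```python
# Dry-run sketch: builds the zpool command lines and executes nothing.
# The device names passed in are placeholders, not taken from the post.

def attach_plan(pool, existing_dev, new_dev):
    """Return the commands for mirroring existing_dev onto new_dev."""
    return [
        # Attaching a device to an existing one forms (or extends) a mirror
        # and starts a resilver onto the new device.
        ["zpool", "attach", pool, existing_dev, new_dev],
        # Re-run this until the scan line reports the resilver as finished.
        ["zpool", "status", pool],
    ]


if __name__ == "__main__":
    for cmd in attach_plan("zstorage", "OLD_DEV", "NEW_DEV"):
        print(" ".join(cmd))
```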

Due to bugs in the Promise SATA300 TX4 drivers, the resilver started 
having problems, and one of the 500s dropped out of the pool with:

ata2: timeout waiting to issue command
ata2: error issuing ATA_IDENTIFY command
ata2: SIGNATURE: ffffffff
ata2: timeout waiting to issue command
ata2: error issuing ATA_IDENTIFY command
ata2: SIGNATURE: ffffffff

Upon reboot I now had corrupted files.  The pool auto-expanded, and now I 
can't reattach the 80 or the 120 to recover the corrupted files.

So the main problem is the total unusability of the pool when hitting a 
corrupted file.


   pool: zstorage
  state: DEGRADED
status: One or more devices has experienced an error resulting in data
         corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
         entire pool from backup.
    see: http://www.sun.com/msg/ZFS-8000-8A
  scan: resilvered 22.5G in 0h47m with 29682 errors on Mon Mar 12 02:20:56 2012
config:

         NAME                      STATE     READ WRITE CKSUM
         zstorage                  DEGRADED 29.0K     0     0
           mirror-0                ONLINE       0     0     0
             ada4                  ONLINE       0     0     0
             ada3                  ONLINE       0     0     0
           mirror-1                DEGRADED 58.0K     0     0
             17331410140726386358  UNAVAIL      0     0     0  was /dev/ada1s4
             ada2                  ONLINE       0     0 58.0K
           mirror-2                ONLINE       0     0     0
             ada5                  ONLINE       0     0     0
             ada10                 ONLINE       0     0     0
           mirror-3                DEGRADED     0     0     0
             14693115181240286208  REMOVED      0     0     0  was /dev/ada6
             ada8                  ONLINE       0     0     0
           mirror-4                DEGRADED     0     0     0
             ada7                  ONLINE       0     0     0
             83782446513674500     REMOVED      0     0     0  was /dev/ada9


errors: Permanent errors have been detected in the following files:

         /storage/zfs/0-Pics/2012-03-01-peterskill/155CANON/IMG_5576.CR2
         /storage/zfs/Johns Stuff/gallery/._IMG_1225.psd
         /storage/zfs/Johns Stuff/gallery/._IMG_1226.psd
         /storage/zfs/Johns Stuff/gallery/._IMG_1243.psd
         /storage/zfs/Johns Stuff/gallery/._a.jpg
         ...etc, and 500 more
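
To make the config table above easier to scan, here is a rough Python 
sketch that picks out every row that is either not ONLINE or has a 
non-zero error counter.  The function name problem_devices and the regex 
are my own; SAMPLE is an excerpt of the output above.  It assumes the 
five-column NAME/STATE/READ/WRITE/CKSUM layout with unwrapped lines, and 
it is not an official parser for zpool status.

```python
import re

# One config row: indented name, a vdev state, then READ/WRITE/CKSUM.
# The header row is skipped because "STATE" is not one of the states.
ROW = re.compile(
    r"^\s+(\S+)\s+(ONLINE|DEGRADED|UNAVAIL|REMOVED|FAULTED|OFFLINE)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)"
)


def problem_devices(status_text):
    """Return (name, state, read, write, cksum) for rows that need attention."""
    rows = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        name, state, read, write, cksum = m.groups()
        # Keep anything that is not ONLINE or that has a non-zero counter.
        if state != "ONLINE" or (read, write, cksum) != ("0", "0", "0"):
            rows.append((name, state, read, write, cksum))
    return rows


# Excerpt of the config table from this message.
SAMPLE = """\
        NAME                      STATE     READ WRITE CKSUM
        zstorage                  DEGRADED 29.0K     0     0
          mirror-0                ONLINE       0     0     0
            ada4                  ONLINE       0     0     0
          mirror-1                DEGRADED 58.0K     0     0
            17331410140726386358  UNAVAIL      0     0     0  was /dev/ada1s4
            ada2                  ONLINE       0     0 58.0K
"""

if __name__ == "__main__":
    for row in problem_devices(SAMPLE):
        print(*row)
```

Note that ada2 is reported even though it is ONLINE, because its CKSUM 
column is non-zero.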


Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4F5F7116.3020400>