Date:      Tue, 30 Oct 2012 05:04:44 -0700
From:      Dennis Glatting <freebsd@penx.com>
To:        Paul Wootton <paul-freebsd@fletchermoorland.co.uk>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS RaidZ-2 problems
Message-ID:  <1351598684.88435.19.camel@btw.pki2.com>
In-Reply-To: <508F98F9.3040604@fletchermoorland.co.uk>
References:  <508F98F9.3040604@fletchermoorland.co.uk>

On Tue, 2012-10-30 at 09:08 +0000, Paul Wootton wrote:
> Hi,
> 
> I have had lots of bad luck with SATA drives and have had them fail on 
> me far too often. I started with a 3-drive RAIDZ and lost 2 drives at 
> the same time, upgraded to a 6-drive RAIDZ and lost 2 drives within 
> hours of each other, and finally had a 9-drive RAIDZ (1 parity) and 
> lost another 2 drives (as luck would have it, this time I had a 90% 
> backup on another machine so did not lose everything). I finally 
> decided that I should switch to a RAIDZ2 (my current setup).
> Now I have lost 1 drive and the pack is showing as faulted. I have 
> tried exporting and reimporting, but that did not help either.
> Is this normal? Has anyone got any ideas as to what has happened and why?
> 
> The fault this time might be cabling, so I might not have lost the data, 
> but my understanding was that with RAIDZ-2 you could lose 2 drives and 
> still have a working pack.
> 
> I do still have the 90% backup of the pool and nothing has really 
> changed since that backup, so if someone wants me to try something and 
> it blows the pack away, it's not the end of the world.
> 

I've had this problem too. Here is what I can tell you for my case.

In the first system I have four arrays: two RAID1 arrays on an Areca
1880i card and two RAIDz2 arrays through an LSI 9211-8i (IT) card and
the MB (Gigabyte X58A-UD7). One of the RAIDz2 arrays repeatedly
faulted, and I lost the array several times. I replaced the card, the
cable, and the disks themselves, leaving only one other possibility:
the power supply.

The faulting array was on a separate cable from the power supply. I
replaced the power supply, going from a 1,000W unit to a 1,300W unit,
along with the power cables to the disks. Not a problem since.

In four other systems, including one where I've lost 30% of the disks
in less than a year, I have downgraded the operating system from
stable/9 to stable/8 on two and installed CentOS 6.3 with ZFS-on-Linux
on another (the last system is still running stable/9, for now). These
systems experience heavy load (compute and disk), and so far (less
than two weeks) all of my problems have gone away.

Two of those systems each ran for over four days before a power event,
generated 10TB of data, and scrubbed successfully after the event.
That simply wasn't possible previously, for approximately five months.

What is interesting is that three smaller systems running stable/9
with four-disk RAIDz arrays have not had the same problems, but all of
their disks are attached through their MBs and they do not experience
the same loading as the others.

YMMV
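
Since you said you're willing to experiment: before destroying and
restoring from backup, it may be worth trying a read-only import and
then a rewind import. These are standard zpool import options; whether
they help in your case is a guess on my part, since your fault may be
cabling rather than on-disk state:

```shell
# Try a read-only import first: it avoids writing to the pool, so it
# is safe to attempt even if the hardware is still suspect.
zpool import -o readonly=on storage

# If that fails, try a rewind import. -F discards the last few
# transactions to roll the pool back to an older, hopefully intact,
# transaction group. Use -n first for a dry run that only reports
# whether -F would succeed, without modifying anything.
zpool import -Fn storage
zpool import -F storage
```

If any of these succeed, scrub the pool afterwards and check
`zpool status -v` for permanent errors before trusting the data.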


> 
> Cheers
> Paul
> 
> 
> pool: storage
> state: FAULTED
> status: One or more devices could not be opened.  There are insufficient
>          replicas for the pool to continue functioning.
> action: Attach the missing device and online it using 'zpool online'.
>     see: http://illumos.org/msg/ZFS-8000-3C
>    scan: resilvered 30K in 0h0m with 0 errors on Sun Oct 14 12:52:45 2012
> config:
> 
>          NAME                      STATE     READ WRITE CKSUM
>          storage                   FAULTED      0     0     1
>            raidz2-0                FAULTED      0     0     6
>              ada0                  ONLINE       0     0     0
>              ada1                  ONLINE       0     0     0
>              ada2                  ONLINE       0     0     0
>              17777811927559723424  UNAVAIL      0     0     0  was /dev/ada3
>              ada4                  ONLINE       0     0     0
>              ada5                  ONLINE       0     0     0
>              ada6                  ONLINE       0     0     0
>              ada7                  ONLINE       0     0     0
>              ada8                  ONLINE       0     0     0
>              ada10p4               ONLINE       0     0     0
> 
> root@filekeeper:/storage # zpool export storage
> root@filekeeper:/storage # zpool import storage
> cannot import 'storage': I/O error
>          Destroy and re-create the pool from
>          a backup source.
> 
> root@filekeeper:/usr/home/paul # uname -a
> FreeBSD filekeeper.caspersworld.co.uk 10.0-CURRENT FreeBSD 10.0-CURRENT 
> #0 r240967: Thu Sep 27 08:01:24 UTC 2012     
> root@filekeeper.caspersworld.co.uk:/usr/obj/usr/src/sys/GENERIC  amd64
> 
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
