Date: Sat, 18 May 2013 13:29:58 +0000
From: Ivailo Tanusheff <Ivailo.Tanusheff@skrill.com>
To: Paul Kraus <paul@kraus-haus.org>
Cc: Liste FreeBSD <freebsd-questions@freebsd.org>
Subject: RE: ZFS install on a partition
Message-ID: <b179c20ebde742358e2cc52a1f04133e@DB3PR07MB059.eurprd07.prod.outlook.com>
In-Reply-To: <A9599DD7-1A32-4607-BC83-2E6E4D03C560@kraus-haus.org>
References: <F744BBF1-D98C-47BF-9546-14D1A9CB3733@todoo.biz> <372082cab2064846809615a8073e022c@DB3PR07MB059.eurprd07.prod.outlook.com> <A9599DD7-1A32-4607-BC83-2E6E4D03C560@kraus-haus.org>
The software RAID depends not only on the disks, but also on changes to the OS, which occur more frequently than updates to a RAID controller's firmware. That makes the hardware RAID more stable and reliable. Also, the resources of the hardware RAID are used exclusively by the RAID controller, which is not true for a software RAID. So I do not see your point in claiming that a software RAID is the same as, or better than, the hardware one.

About the second part - I mean both stability and reliability. Having a spare disk reduces the risk, as the recovery operation will start as soon as a disk fails. It may sound paranoid, but the possibility of a failed disk going undetected for 8, 12 or even 24 hours is still pretty big.

Not sure about your calculations - I hope you trust them - but at my previous company we had a 3-4 month period when a disk failed almost every day on 2-year-old servers, so trust me: I do NOT trust those calculations, as I've seen the opposite. Maybe it was a failed batch of disks shipped into the country, but no one is insured against that. Yes, you can use several hot spares on the software RAID, but:

1. You still depend on the problems related to the OS.
2. If you read what the original poster has written, you will see that this is not possible for him.

I agree with the point about recovering big chunks of data; that is why I suggested that he use several smaller LUNs for the zpool.

Best regards,

Ivailo Tanusheff

-----Original Message-----
From: owner-freebsd-questions@freebsd.org [mailto:owner-freebsd-questions@freebsd.org] On Behalf Of Paul Kraus
Sent: Saturday, May 18, 2013 4:02 PM
To: Ivailo Tanusheff
Cc: Liste FreeBSD
Subject: Re: ZFS install on a partition

On May 18, 2013, at 3:21 AM, Ivailo Tanusheff <Ivailo.Tanusheff@skrill.com> wrote:

> If you use HBA/JBOD then you will rely on the software RAID of the ZFS system. Yes, this RAID is good, but unless you use SSD disks to boost performance and a lot of RAM, the hardware RAID should be more reliable and much faster.

Why would the hardware RAID be more reliable? Hardware RAID is susceptible to uncorrectable errors from the physical drives (hardware RAID controllers rely on the drives to report bad reads and writes), and the uncorrectable error rate for modern drives is such that with high-capacity drives (1 TB and over) you are almost certain to run into a couple over the operational life of the drive: 10^-14 for cheap drives and 10^-15 for better drives; very occasionally I see a drive rated for 10^-16. Run the math and see how many TB worth of data you have to write and read (remember, these failures are generally read failures with NO indication that a failure occurred; bad data is just returned to the system).
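Concretely, a back-of-the-envelope version of that math (the 100 TB lifetime read figure below is an assumption for illustration; the error rates are the ones quoted above):

    # Expected unrecoverable read errors after reading an assumed 100 TB,
    # at the quoted ratings of 1 error per 10^14, 10^15 and 10^16 bits read.
    awk 'BEGIN {
        tb   = 100                  # assumed TB read over the drive life
        bits = tb * 8 * 10^12       # total bits read
        printf "1e14 (cheap):  %.2f expected errors\n", bits / 10^14
        printf "1e15 (better): %.2f expected errors\n", bits / 10^15
        printf "1e16 (rare):   %.2f expected errors\n", bits / 10^16
    }'

Even at the better 10^-15 rating, a drive that reads 100 TB over its life has roughly even odds of silently returning bad data at least once.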
In terms of performance, HW RAID is faster, generally due to the cache RAM built into the HW RAID controller. ZFS makes good use of system RAM for the same function. An SSD can help with performance if the majority of writes are sync (NFS is a good example of this) or if you can benefit from a much larger read cache. SSDs are deployed with ZFS either as write LOG devices (in which case they should be mirrored), which only come into play for SYNC writes, or as an extension of the ARC, the L2ARC, which does not have to be mirrored as it is only a cache of existing data for speeding up reads.

> I didn't get if you want to use the system to dual boot Linux/FreeBSD or just to share FreeBSD space with Linux.

> But I would advise you to go with option 1 - you will get the most out of the system, and obviously you don't need a zpool with RAID, as your LSI controller will do all the redundancy for you. Making a software RAID over the hardware one will only decrease performance and will NOT increase the reliability, as you will not be sure which information is stored on which physical disk.

> If stability is a MUST, then I would also advise you to go with a bunch of pools and a disk designated as hot spare - in case some disk dies you will rely on the automatic recovery. Also, you should run a monitoring tool on your RAID controller.

I think you misunderstand the difference between stability and reliability. Any ZFS configuration I have tried on FreeBSD is STABLE; having redundant vdevs (mirrors or RAIDz<n>) along with hot spares can increase RELIABILITY. The only advantage to having a hot spare is that when a drive fails (and they all fail eventually), the REPLACE operation can start immediately without you noticing and manually replacing the failed drive.

Reliability is a combination of reduction in MTBF (mean time between failures) and MTTR (mean time to repair). Having a hot spare reduces the MTTR. The other way to improve MTTR is to go with smaller drives to reduce the time it takes the system to resilver a failed drive. This is NOT applicable in the OP's situation. I try very hard not to use drives larger than 1 TB because resilver times can be days. Resilver time also depends on the total size of the data in a zpool, as a resilver operation walks the FS in time order, replaying all the writes and confirming that all the data on disk is good (it does not actually rewrite the data unless it finds bad data). This means a couple of things, the first of which is that the resilver time depends on the amount of data you have written, not the capacity. A zpool with a capacity of multiple TB will resilver in seconds if there are only a few hundred MB written to it. Since the resilver operation is not just a block-by-block copy, but a replay, it is I/Ops limited, not bandwidth limited. You might be able to stream sequential data from a drive at hundreds of MB/sec., but most SATA drives will not sustain more than one to two hundred RANDOM I/Ops (sequentially they can do much more).

> You can also set copies=2/3 just in case some errors occur, so ZFS can auto-repair the data. If you run ZFS over several LUNs this will make even more sense.

--
Paul Kraus
Deputy Technical Director, LoneStarCon 3
Sound Coordinator, Schenectady Light Opera Company
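A rough sketch of that I/Ops-limited resilver arithmetic (the 500 GB of written data, 128 KB average block size, and 150 random I/Ops below are all illustrative assumptions, not numbers from this thread):

    # Estimate resilver time when the operation is bound by random I/Ops.
    awk 'BEGIN {
        data_gb = 500          # assumed data actually written to the pool
        blk_kb  = 128          # assumed average block size
        iops    = 150          # assumed sustained random I/Ops of one drive
        blocks  = data_gb * 1024 * 1024 / blk_kb
        printf "%.0f blocks at %d I/Ops: about %.1f hours\n", blocks, iops, blocks / iops / 3600
    }'

Under those assumptions, roughly 4 million blocks at 150 I/Ops works out to about 7.6 hours; double the written data and the resilver time doubles with it, regardless of the pool's raw capacity.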
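And a minimal sketch of the hot-spare and copies settings discussed above (the pool layout and the da0-da4 device names are hypothetical):

    # Hypothetical pool: two mirrored vdevs plus one hot spare that ZFS
    # can use for the REPLACE operation when a disk fails.
    zpool create tank mirror da0 da1 mirror da2 da3 spare da4

    # Store two copies of every block in this dataset so ZFS can
    # self-heal corruption, at the cost of the extra space used.
    zfs create -o copies=2 tank/data

    # Keep an eye on pool health; a degraded vdev shows up here.
    zpool status tank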