Date: Mon, 21 Jan 2019 15:11:09 +0100
From: Borja Marcos <borjam@sarenet.es>
To: jdelisle <jdelisle@gmail.com>
Cc: Maciej Jan Broniarz <gausus@gausus.net>, freebsd-fs@freebsd.org
Subject: Re: ZFS on Hardware RAID
Message-ID: <8761FDAC-3D47-4827-A0E5-A4F34C4C3BAE@sarenet.es>
In-Reply-To: <CAMdBLfQc6pWtYv2JTZsBT9HuQ1xbfvAjO0PXQLUzObUF367A4g@mail.gmail.com>
References: <1180280695.63420.1547910313494.JavaMail.zimbra@gausus.net> <92646202.63422.1547910433715.JavaMail.zimbra@gausus.net> <CAMdBLfQc6pWtYv2JTZsBT9HuQ1xbfvAjO0PXQLUzObUF367A4g@mail.gmail.com>
> On 19 Jan 2019, at 20:29, jdelisle <jdelisle@gmail.com> wrote:
>
> You'll find a lot of strong opinions on this topic over at the FreeNAS
> forums. I too wish an authoritative, knowledgeable SME would answer
> and thoroughly explain the inner workings and the risks involved. Most of
> the FreeNAS forum posts on this topic devolve into hand-waving and blurry,
> incomplete explanations that end in statements like "trust me, don't do
> it". I'd love to understand why not. I'm curious and eager to learn more.

Alright, let me try with one of the many reasons to avoid "hardware RAIDs".

Disk redundancy in a RAID system offers several benefits. It can not only detect
data corruption (and not all systems are equally good at that), it can also help
repair it. And of course, with adequate redundancy you can survive the loss of
one or several drives.

But as I said, not all corruption-detection schemes are born equal. ZFS has a very
sophisticated and effective one, developed as an answer to the increasing size of
storage systems and files. Everything has become so large that the probability of
an unnoticed corrupted block is quite high. Some storage systems have suffered
from silent data corruption, for instance.

Common "hardware RAID systems", which really means "software running on a small,
embedded processor", usually have quite limited checksum systems. ZFS has a much
more robust one.

So now let's assume we are setting up a server and we have two choices: use the
HBA in "hardware RAID" mode, or use it just as a plain HBA, relying on ZFS for
redundancy.

Option 1: Hardware RAID. The preferred option for many people because, well,
"hardware" sounds more reliable.

Option 2: ZFS using the disks directly, period.
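To make the two options concrete, here is a rough sketch of what each one looks like from the ZFS side. The device names (mfid0, da0, da1) and pool name are assumptions for illustration, not taken from the original post:

```shell
# Option 1 (hardware RAID): the controller presents one logical volume,
# say /dev/mfid0. ZFS sees a single "disk" with no ZFS-level redundancy,
# so it can detect a bad checksum but has no second copy to repair from.
zpool create tank /dev/mfid0

# Option 2 (plain HBA): ZFS manages the raw disks itself. With a mirror,
# a block that fails its checksum on one disk is transparently rewritten
# from the good copy on the other disk (self-healing).
zpool create tank mirror /dev/da0 /dev/da1

# A scrub walks every allocated block and verifies its checksum; repaired
# and unrepairable errors show up in the CKSUM column of the status output.
zpool scrub tank
zpool status tank
```

These commands require root and a system with ZFS support; the point is only the structural difference: in option 1 the pool has one vdev with no redundancy ZFS knows about, while in option 2 the redundancy lives inside ZFS where the checksums do.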
I refuse to use the term JBOD because it's an added layer of confusion over what
should be a simple subject.

Now let's imagine that there's some data corruption on one of the disks. The
corruption is not detected by the hardware RAID, but it's promptly detected by
the more elaborate ZFS checksum scheme.

If we chose option 1, ZFS will let us know that there is a corrupted file. But
because redundancy was provided only by the underlying "hardware RAID", ZFS
won't have the ability to heal anything.

Had we chosen option 2, however, and assuming that there was some redundancy,
ZFS would not only report a data corruption incident, it would also return
correct data, unless the blocks were corrupted on several of the disks.

Given that ZFS has much better error detection/correction than most "hardware
RAID" options (except the high-end storage subsystems), running ZFS on a logical
volume built on a "hardware RAID" is roughly equivalent to running it on a
single disk with no redundancy. At the very least, you won't get the real
benefit of the better recovery mechanisms offered by ZFS.

Do you want another reason? If you use a "hardware RAID" solution you are stuck
with it. In case you suffer a controller failure, you will need the same
hardware to recover. With ZFS you can move the disks to a different system with
a different HBA. As long as ZFS can access the disks without any odd stuff in
between, it will work regardless of the hardware manufacturer.

There are other important performance reasons related to the handling of data
and metadata, but I think that the first reason I mentioned (ZFS error recovery
and healing capabilities) is a strong enough motivation to avoid "hardware
RAIDs".

Borja.