Date: Mon, 21 Jan 2019 15:11:09 +0100
From: Borja Marcos <borjam@sarenet.es>
To: jdelisle <jdelisle@gmail.com>
Cc: Maciej Jan Broniarz <gausus@gausus.net>, freebsd-fs@freebsd.org
Subject: Re: ZFS on Hardware RAID
Message-ID: <8761FDAC-3D47-4827-A0E5-A4F34C4C3BAE@sarenet.es>
In-Reply-To: <CAMdBLfQc6pWtYv2JTZsBT9HuQ1xbfvAjO0PXQLUzObUF367A4g@mail.gmail.com>
References: <1180280695.63420.1547910313494.JavaMail.zimbra@gausus.net> <92646202.63422.1547910433715.JavaMail.zimbra@gausus.net> <CAMdBLfQc6pWtYv2JTZsBT9HuQ1xbfvAjO0PXQLUzObUF367A4g@mail.gmail.com>
> On 19 Jan 2019, at 20:29, jdelisle <jdelisle@gmail.com> wrote:
>
> You'll find a lot of strong opinions on this topic over at the FreeNAS
> forums. I too wish an authoritative, knowledgeable SME would answer
> and thoroughly explain the inner workings and the risks involved. Most of
> the FreeNAS forum posts on this topic devolve into hand-waving and blurry,
> incomplete explanations that end in statements like "trust me, don't do
> it". I'd love to understand why not. I'm curious and eager to learn more.

Alright, let me try with one of the many reasons to avoid "hardware RAIDs".

Disk redundancy in a RAID system offers several benefits. It can not only detect
data corruption (and not all systems are equally good at that), it can also help
repair it. And of course, with adequate redundancy you can survive the loss of
one or several drives.

But as I said, not all corruption-detection schemes are born equal. ZFS has a very
sophisticated and effective one, developed as an answer to the increasing size of
storage systems and files. Everything has become so large that the probability of
an unnoticed corrupted block is quite high. Some storage systems have suffered
from silent data corruption, for instance.

Common "hardware RAID systems", which really means "software running on a small,
embedded processor", usually have quite limited checksum systems. ZFS has a much
more robust one.

So now let's assume we are setting up a server and we have two choices: use the
HBA in "hardware RAID" mode, or use it just as a plain HBA, relying on ZFS for
redundancy.

Option 1: Hardware RAID. The preferred option for many people because, well,
"hardware" sounds more reliable.

Option 2: ZFS using the disks directly, period.
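To make the two options concrete, here is a rough sketch of what each one looks like from the ZFS side. The device names (mfid0, da0, da1) and pool name are assumptions for illustration, not taken from the original post:

```shell
# Option 1 (hardware RAID): the controller presents one logical volume,
# say /dev/mfid0. ZFS sees a single "disk" with no ZFS-level redundancy,
# so it can detect a bad checksum but has no second copy to repair from.
zpool create tank /dev/mfid0

# Option 2 (plain HBA): ZFS manages the raw disks itself. With a mirror,
# a block that fails its checksum on one disk is transparently rewritten
# from the good copy on the other disk (self-healing).
zpool create tank mirror /dev/da0 /dev/da1

# A scrub walks every allocated block and verifies its checksum; repaired
# and unrepairable errors show up in the CKSUM column of the status output.
zpool scrub tank
zpool status tank
```

These commands require root and a system with ZFS support; the point is only the structural difference: in option 1 the pool has one vdev with no redundancy ZFS knows about, while in option 2 the redundancy lives inside ZFS where the checksums do.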
I refuse to use the term JBOD because it's an added layer of confusion over what
should be a simple subject.

Now let's imagine that there's some data corruption on one of the disks. The
corruption is not detected by the hardware RAID, but it's promptly detected by
the more elaborate ZFS checksum scheme.

If we chose option 1, ZFS will let us know that there is a corrupted file. But
because redundancy was provided only by the underlying "hardware RAID", ZFS
won't have the ability to heal anything.

Had we chosen option 2, however, and assuming that there was some redundancy,
ZFS would not only report a data corruption incident, it would also return
correct data, unless the blocks were corrupted on several of the disks.

Given that ZFS has much better error detection/correction than most "hardware
RAID" options (except the high-end storage subsystems), running ZFS on a logical
volume built on a "hardware RAID" is roughly equivalent to running it on a
single disk with no redundancy. At the very least, you won't get the real
benefit of the better recovery mechanisms offered by ZFS.

Do you want another reason? If you use a "hardware RAID" solution you are stuck
with it. In case you suffer a controller failure, you will need the same
hardware to recover. With ZFS you can move the disks to a different system with
a different HBA. As long as ZFS can access the disks without any odd stuff in
between, it will work regardless of the hardware manufacturer.

There are other important performance reasons related to the handling of data
and metadata, but I think that the first reason I mentioned (ZFS error recovery
and healing capabilities) is a strong enough motivation to avoid "hardware
RAIDs".

Borja.