Date:      Thu, 13 Apr 2023 09:43:31 -0700
From:      Freddie Cash <fjwcash@gmail.com>
To:        egoitz@ramattack.net, Freebsd fs <freebsd-fs@freebsd.org>,  Freebsd hackers <freebsd-hackers@freebsd.org>, freebsd-hardware@freebsd.org
Subject:   Re: M2 NVME support
Message-ID:  <CAOjFWZ5fzd8FRiS6n0jxY4LTAJdLo4TLBLP0TYWjf9nQQCrsAw@mail.gmail.com>
In-Reply-To: <ZDfpGHKmWWa0Qpn0@graf.pompo.net>
References:  <a0c12351e21588a8e767988e1367ae9f@ramattack.net> <ZDfpGHKmWWa0Qpn0@graf.pompo.net>


>
> On Thu, 13 Apr 2023 at 13:25:36 +0200, egoitz@ramattack.net
> <egoitz@ramattack.net> wrote:
> > We are in the process of buying new hardware for use with FreeBSD and
> > ZFS. We are planning whether to buy M2 NVME disks or just SATA SSD disks
> > (probably Samsung PM* ones). How is your experience with them? Do you
> > recommend one over the other? Is support perhaps better for some of them
> > from a specific version onwards? Or do they perhaps work better with
> > some specific disk controller?
>

There were issues in the past where NVMe drives were "too fast" for ZFS,
and various bottlenecks were uncovered.  Most (all?) of those have been
fixed over the past couple years.  These issues were found on pools using
all NVMe drives in various configurations for data storage (multiple raidz
vdevs; multiple mirror vdevs).  This was back when PCIe 3.0 NVMe drives
were all the rage, or maybe when PCIe 4.0 drives first started appearing?

If you're running a recent release of FreeBSD (13.x) with the newer
versions of OpenZFS 2.x, then you shouldn't have any issues using NVMe
drives.  The hard part will be finding drives that use MLC or 3D TLC NAND
across multiple channels, with a large SLC cache, plenty of onboard DRAM,
and a good controller, in order to get consistent, strong write
performance, especially when the drive is nearly full.  Too many drives are
moving to QLC NAND, or to DRAM-less controllers (which borrow system RAM as
a buffer), in order to reduce cost.  You'll want to research the technology
used in a drive before buying any specific model.
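
For what it's worth, the stock FreeBSD tools make it easy to confirm what
you're running and what an NVMe drive reports once it's in a test machine
(just a quick sketch; the nvme0 device name is only an example):

    freebsd-version -ku          # kernel and userland versions
    zfs version                  # OpenZFS userland and kernel module versions
    nvmecontrol devlist          # list NVMe controllers and namespaces
    nvmecontrol identify nvme0   # model, firmware, and capacity details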

SATA SSDs will perform better than hard drives, but will be limited by the
SATA bus to around 550 MBps of read/write throughput.  NVMe drives will
provide multiple GBps of read/write throughput (depending on the drive and
PCIe bus version).  Finding a motherboard with more than 2 M.2 slots will
be very hard.  If you want more than 2 drives, you'll have to
look into PCIe add-in boards with M.2 slots.  Really expensive ones will
include PCIe switches onboard so they'll work in pretty much any
motherboard with spare x16 slots (and maybe x8 slots, with reduced
performance?).  Less expensive add-in boards require PCIe bifurcation
support in the BIOS, and will only work in specific slots on the
motherboard.
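
To put rough numbers on those bus limits (ballpark figures, before protocol
overhead and before the drive itself becomes the bottleneck):

    SATA III:    6 Gbit/s x 8/10 (8b/10b encoding) =  600 MB/s -> ~550 MB/s real world
    PCIe 3.0 x4: ~985 MB/s per lane x 4            = ~3.9 GB/s -> ~3.5 GB/s real world
    PCIe 4.0 x4: ~1.97 GB/s per lane x 4           = ~7.9 GB/s -> ~7 GB/s real world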

My home ZFS server uses an ASUS motherboard with PCIe bifurcation support,
has an ASUS Hyper M.2 expansion card in the second PCIe x16 slot, with 2 WD
Blue M.2 SSDs installed (card supports 4 M.2 drives).  These are used to
create a root pool using a single mirror vdev.  /, /usr, and /var are
mounted from there.  There are six hard drives in a separate data pool using
multiple mirror vdevs, with /home mounted from there (this pool has been
migrated from IDE drives to SATA, from FreeBSD to Linux, and from raidz to
mirror vdevs at various points in the past, without losing any data so far;
yay ZFS!).
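
For reference, that layout boils down to something like the following
(hypothetical pool and device names, and glossing over the partitioning
and boot code needed for the root pool):

    # Root pool: a single mirror vdev across the two M.2 drives
    zpool create zroot mirror nda0p3 nda1p3

    # Data pool: three 2-way mirror vdevs across the six hard drives
    zpool create tank mirror ada0 ada1 mirror ada2 ada3 mirror ada4 ada5
    zfs create -o mountpoint=/home tank/home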

At work, all our ZFS servers use 2.5" SATA SSDs for the root pool, and for
separate L2ARC/SLOG devices, with 24-90 SATA hard drives for the storage
pool.  These are all running FreeBSD 13.x.
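
Attaching the SSDs as cache and log devices is a one-liner each (again,
hypothetical pool and device names; mirroring the SLOG is optional but a
common precaution):

    zpool add tank cache ada24              # L2ARC: read cache on a SATA SSD
    zpool add tank log mirror ada25 ada26   # SLOG: mirrored separate intent log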

If you want the best performance, and money isn't a restriction, then
you'll want to look into servers that have U.2 (or whatever the next-gen
small form factor interface name is) slots and backplanes.  The drives cost
a lot more than regular M.2 SSDs, but provide a lot more performance.
Especially in AMD EPYC servers with 128 PCIe lanes to play with.  :)
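
The lane math is what makes EPYC attractive here: each NVMe/U.2 drive wants
a PCIe x4 link, so as a rough upper bound

    128 lanes / 4 lanes per drive = 32 drives at full bandwidth

before you set aside lanes for NICs, HBAs, and other add-in cards, and
without needing any PCIe switches.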

-- 
Freddie Cash
fjwcash@gmail.com

