Date: Thu, 13 Apr 2023 09:43:31 -0700
From: Freddie Cash <fjwcash@gmail.com>
To: egoitz@ramattack.net, Freebsd fs <freebsd-fs@freebsd.org>, Freebsd hackers <freebsd-hackers@freebsd.org>, freebsd-hardware@freebsd.org
Subject: Re: M2 NVME support
Message-ID: <CAOjFWZ5fzd8FRiS6n0jxY4LTAJdLo4TLBLP0TYWjf9nQQCrsAw@mail.gmail.com>
In-Reply-To: <ZDfpGHKmWWa0Qpn0@graf.pompo.net>
References: <a0c12351e21588a8e767988e1367ae9f@ramattack.net> <ZDfpGHKmWWa0Qpn0@graf.pompo.net>
> On Thu, 13 Apr 23 at 13:25:36 +0200, egoitz@ramattack.net <egoitz@ramattack.net> wrote:
> > We are in the process of buying new hardware for use with FreeBSD and
> > ZFS. We are deciding whether to buy M2 NVME disks or just SATA SSD disks
> > (probably Samsung PM* ones). What is your experience with them? Do you
> > recommend one over the other? Is support perhaps better for some of
> > them from a specific version onward? Or do they perhaps work better
> > with some specific disk controller?

There were issues in the past where NVMe drives were "too fast" for ZFS, and various bottlenecks were uncovered. Most (all?) of those have been fixed over the past couple of years. These issues were found on pools using all NVMe drives in various configurations for data storage (multiple raidz vdevs; multiple mirror vdevs). This was back when PCIe 3.0 NVMe drives were all the rage, or maybe when PCIe 4.0 drives first started appearing?

If you're running a recent release of FreeBSD (13.x) with the newer versions of OpenZFS 2.x, then you shouldn't have any issues using NVMe drives. The hard part will be finding drives with MLC or 3D TLC NAND chips in multiple channels, with a large SLC cache and lots of RAM onboard, using good controllers, in order to get consistent, strong write performance, especially when the drive is near full. Too many drives are moving to QLC NAND, or using DRAM-less controllers (using system RAM as a buffer), in order to reduce cost. You'll want to research the technology used on the drive before buying any specific model.

SATA SSDs will perform better than hard drives, but will be limited by the SATA bus to around 550 MB/s of read/write throughput. NVMe drives will provide multiple GB/s of read/write throughput (depending on the drive and PCIe bus version). Finding a motherboard that supports more than 2 M.2 slots will be very hard. If you want more than 2 drives, you'll have to look into PCIe add-in boards with M.2 slots. Really expensive ones include PCIe switches onboard, so they'll work in pretty much any motherboard with spare x16 slots (and maybe x8 slots, with reduced performance?). Less expensive add-in boards require PCIe bifurcation support in the BIOS, and will only work in specific slots on the motherboard.

My home ZFS server uses an ASUS motherboard with PCIe bifurcation support and has an ASUS Hyper M.2 expansion card in the second PCIe x16 slot, with 2 WD Blue M.2 SSDs installed (the card supports 4 M.2 drives). These are used to create a root pool with a single mirror vdev; /, /usr, and /var are mounted from there. There are 6 hard drives in a separate data pool using multiple mirror vdevs, with /home mounted from there (this pool has been migrated from IDE drives to SATA, from FreeBSD to Linux, and from raidz to mirror vdevs at various points in the past, without losing any data so far; yay ZFS!).

At work, all our ZFS servers use 2.5" SATA SSDs for the root pool and for separate L2ARC/SLOG devices, with 24-90 SATA hard drives for the storage pool. These are all running FreeBSD 13.x.

If you want the best performance, and money isn't a restriction, then you'll want to look into servers that have U.2 (or whatever the next-gen small form factor interface name is) slots and backplanes. The drives cost a lot more than regular M.2 SSDs, but provide a lot more performance. Especially in AMD EPYC servers with 128 PCIe lanes to play with. :)
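
In case it's useful, here is a minimal sketch of what that kind of layout looks like from the shell on FreeBSD 13.x: a mirrored root pool on two M.2 NVMe drives, plus separate SLOG/L2ARC devices on a data pool. The device names (nda0/nda1, ada2/ada3), partition layout, and pool/label names are placeholders rather than a recommendation for any specific hardware; NVMe drives may also show up as nvd* instead of nda* depending on kernel configuration, and the boot/EFI and swap partitions are left out for brevity.

  # See which NVMe controllers and namespaces the kernel detected
  nvmecontrol devlist

  # Partition the two M.2 drives and build a mirrored root pool
  # (nda0/nda1 and the zfs0/zfs1 labels are placeholders)
  gpart create -s gpt nda0
  gpart add -t freebsd-zfs -a 1m -l zfs0 nda0
  gpart create -s gpt nda1
  gpart add -t freebsd-zfs -a 1m -l zfs1 nda1
  zpool create -o ashift=12 zroot mirror gpt/zfs0 gpt/zfs1

  # On a separate data pool, SLOG and L2ARC devices can be added later
  # ("tank" and the ada* partitions are placeholders)
  zpool add tank log mirror ada2p1 ada3p1
  zpool add tank cache ada2p2 ada3p2

Nothing at the zpool level changes between NVMe and SATA; the only difference is the device names you hand to gpart and zpool.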

-- 
Freddie Cash
fjwcash@gmail.com