Date: Sat, 10 Sep 2016 21:34:13 -0600
From: Warner Losh <imp@bsdimp.com>
To: Christoph Pilka <c.pilka@asconix.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: Server with 40 physical cores, 48 NVMe disks, feel free to test it
Message-ID: <CANCZdfoAgHjqgcSMQBbpDBXkA8D7zdceZgnDVvMCPeM0Psg98Q@mail.gmail.com>
In-Reply-To: <C6904B7F-D148-47C0-BD17-0A2AF63B5717@asconix.com>
References: <C6904B7F-D148-47C0-BD17-0A2AF63B5717@asconix.com>
On Sat, Sep 10, 2016 at 2:58 AM, Christoph Pilka <c.pilka@asconix.com> wrote:
> Hi,
>
> we've just been granted a short-term loan of a server from Supermicro with 40 physical cores (plus HTT) and 48 NVMe drives. After a bit of mucking about, we managed to get 11-RC running. A couple of things are preventing the system from being terribly useful:
>
> - We have to use hw.nvme.force_intx=1 for the server to boot.
> If we don't, it panics around the 9th NVMe drive with "panic: couldn't find an APIC vector for IRQ...". Increasing hw.nvme.min_cpus_per_ioq brings it further, but it still panics later in the NVMe enumeration/init. hw.nvme.per_cpu_io_queues=0 causes it to panic later (I suspect during ixl init - the box has 4x10Gb Ethernet ports).

John Baldwin has patches that help fix this.

> - zfskern seems to be the limiting factor when doing ~40 parallel "dd if=/dev/zero of=<file> bs=1m" on a zpool stripe of all 48 drives. Each drive shows ~30% utilization (gstat), I can do ~14GB/sec write and 16 read.
>
> - direct writing to the NVMe devices (dd from /dev/zero) gives about 550MB/sec and ~91% utilization per device.

These are slow drives then, if all they can do is 600MB/s. The drives we're looking at do 3.2GB/s read and 1.6GB/s write. 48 drives, though. Woof. What's the interconnect? Are there enough PCIe lanes for that? 192 lanes? How's that possible?

> Obviously, the first item is the most troublesome. The rest is based on entirely synthetic testing and may have little or no actual impact on the server's usability or fitness for our purposes.
>
> There is nothing but sshd running on the server, and if anyone wants to play around you'll have IPMI access (remote kvm, virtual media, power) and root.

Don't think I have enough time to track this all down...

Warner
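[For reference, the workaround tunables discussed in the mail are boot-time loader tunables. A minimal /boot/loader.conf sketch, assuming the settings are applied exactly as described; the mail does not say what value of hw.nvme.min_cpus_per_ioq was tried, so that line is illustrative and left commented out:]

```
# /boot/loader.conf -- workarounds from this thread (sketch only)
hw.nvme.force_intx=1           # fall back to INTx; avoids the APIC-vector panic on this box
# Alternatives tried per the mail (each just moved the panic later):
#hw.nvme.min_cpus_per_ioq=4    # value illustrative; the mail doesn't give the number used
#hw.nvme.per_cpu_io_queues=0
```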
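[The parallel-dd write test described in the mail can be sketched as below. The original run used ~40 writers producing multi-gigabyte files on the zpool's mountpoint; the defaults here are deliberately scaled down so the script is safe to run anywhere, and all paths and sizes are illustrative:]

```shell
#!/bin/sh
# Sketch of the parallel "dd if=/dev/zero" write test from the mail.
# Defaults are tiny for safety; the original benchmark used ~40 jobs
# and large files on the 48-drive zpool stripe.
TARGET=${TARGET:-/tmp}   # original: the zpool's mountpoint
NJOBS=${NJOBS:-4}        # original: ~40 parallel writers
COUNT=${COUNT:-16}       # 1 MiB blocks per file; original: much larger
for i in $(seq 1 "$NJOBS"); do
    dd if=/dev/zero of="$TARGET/ddtest.$i" bs=1048576 count="$COUNT" 2>/dev/null &
done
wait
echo "wrote $NJOBS test files under $TARGET"
```

[While this runs, per-device utilization can be watched with gstat(8), as in the mail.]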
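[Warner's "192 lanes?" sanity check is simple arithmetic, assuming each NVMe drive is PCIe x4 (typical for NVMe SSDs of that era; the mail doesn't state the lane width):]

```shell
#!/bin/sh
# Back-of-the-envelope check for the "192 lanes?" question in the mail.
drives=48
lanes_per_drive=4    # assumption: each NVMe drive is PCIe x4
total_lanes=$((drives * lanes_per_drive))
echo "total PCIe lanes needed: $total_lanes"   # prints 192, matching the mail
```

[That far exceeds what a single CPU socket of that generation provides, which is presumably why Warner asks about the interconnect.]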