Date: Wed, 1 Apr 2015 16:24:52 -0700
From: Jim Harris <jim.harris@gmail.com>
To: Tobias Oberstein <tobias.oberstein@gmail.com>
Cc: Konstantin Belousov <kostikbel@gmail.com>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>, Michael Fuckner <michael@fuckner.net>, Alan Somers <asomers@freebsd.org>
Subject: Re: NVMe performance 4x slower than expected
Message-ID: <CAJP=Hc_6BFpoWqkSRyZaxsN1Zn=-D14CXOQjMb4zjnZRKhMb-g@mail.gmail.com>
In-Reply-To: <551C6B62.7080205@gmail.com>
References: <551BC57D.5070101@gmail.com>
 <CAOtMX2jVwMHSnQfphAF%2Ba2%2Bo7eLp62nHmUo4t%2BEahrXLWReaFQ@mail.gmail.com>
 <CAJP=Hc-RNVuhPePg7bnpmT4ByzyXs_CNvAs7Oy7ntXjqhZYhCQ@mail.gmail.com>
 <551C5A82.2090306@gmail.com>
 <20150401212303.GB2379@kib.kiev.ua>
 <CAJP=Hc87FMYCrQYGfAtefQ8PLT3WtnvPfPSppp3zRF-0noQR9Q@mail.gmail.com>
 <551C6B62.7080205@gmail.com>

On Wed, Apr 1, 2015 at 3:04 PM, Tobias Oberstein <tobias.oberstein@gmail.com> wrote:

>> Is this vmstat after the test ?
>
> No, it wasn't (I ran vmstat hours after the test).
>
> Here is right after test (shortened test duration, otherwise exactly the
> same FIO config):
>
> https://github.com/oberstet/scratchbox/blob/master/freebsd/cruncher/results/freebsd_vmstat.md#nvd7
>
>> Somewhat funny is that nvme does not use MSI(X).
>>
>> Yes - this is exactly the problem.
>>
>> nvme does use MSI-X if it can allocate the vectors (one per core). With
>> 48 cores, I suspect we are quickly running out of vectors, so NVMe is
>> reverting to INTx.
>>
>> Could you actually send vmstat -ia (I left off the 'a' previously) -
>> just so we can see all allocated interrupt vectors.
>>
>> As an experiment, can you try disabling hyperthreading - this will
>> reduce the
>
> The CPUs in this box
>
> root@s4l-zfs:~/src/sys/amd64/conf # sysctl hw.model
> hw.model: Intel(R) Xeon(R) CPU E7-8857 v2 @ 3.00GHz
>
> don't have hyperthreading (we deliberately selected CPU model for max.
> clock rather than HT)
>
> http://ark.intel.com/products/75254/Intel-Xeon-Processor-E7-8857-v2-30M-Cache-3_00-GHz
>
>> number of cores and should let you get MSI-X vectors allocated for at
>> least the first couple of NVMe controllers. Then please re-run your
>> performance test on one of those controllers.
>
> You mean I should run against nvdN where N is a controller that still got
> MSI-X while other controllers did not?
>
> How would I find out which controller N? I don't know which nvdN is
> mounted in a PCIe slot directly assigned to which CPU socket, and I don't
> know which one's still got MSI-X and which not.

vmstat -ia should show you which controllers were assigned per-core vectors -
you'll see all of them in the irq256+ range instead of the single vector per
controller you see now in the lower irq index range.

> I could arrange for disabling all but 1 CPU and retest. Would that help?

Yes - that would help. Depending on how your system is configured, and which
CPU socket the NVMe controllers are attached to, you may need to keep 2 CPU
sockets enabled.

You can also try a debug tunable that is in the nvme driver.

hw.nvme.per_cpu_io_queues=0

This would just try to allocate a single MSI-X vector per controller - so all
cores would still share a single I/O queue pair, but it would be MSI-X instead
of INTx. (This actually should be the first fallback if we cannot allocate
per-core vectors). Would at least show we are able to allocate some number of
MSI-X vectors for NVMe.

> ===
>
> Right after running against nvd7
>
> irq56: nvme0 6440 0
> ...
> irq106: nvme7 145056 3
>
> Then, immediately thereafter, running against nvd0
>
> https://github.com/oberstet/scratchbox/blob/master/freebsd/cruncher/results/freebsd_vmstat.md#nvd0
>
> irq56: nvme0 9233 0
> ...
> irq106: nvme7 145056 3
>
> ===
>
> Earlier this day, I ran multiple longer tests .. all against nvd7. So if
> these are cumulative numbers since last boot, that would make sense.
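
The vmstat -ia check described in this message (controllers with vectors in the irq256+ range versus a single vector in the lower range) could be scripted roughly as below. This is only a sketch: the line format is assumed from the two-line excerpts quoted in the thread, the 256 threshold is taken from the "irq256+" remark, and the function name classify_nvme_interrupts is hypothetical. Real vmstat output may name per-core vectors differently, so the pattern may need adjusting.

```python
import re
from collections import defaultdict

# Threshold taken from the "irq256+ range" remark in the message; an assumption,
# not something read from the running system.
MSI_IRQ_BASE = 256

# Assumed line shape, based on the excerpts in the thread:
#   irq56: nvme0 6440 0
LINE_RE = re.compile(r"^irq(\d+):\s+(nvme\d+)\S*\s+(\d+)\s+(\d+)\s*$")


def classify_nvme_interrupts(vmstat_output):
    """Map each nvme controller to its IRQ numbers and a guessed interrupt mode."""
    irqs = defaultdict(list)
    for line in vmstat_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            irqs[match.group(2)].append(int(match.group(1)))

    result = {}
    for ctrl, numbers in irqs.items():
        # All vectors at or above the threshold: treat as MSI-X.
        # Anything in the lower range: treat as legacy INTx.
        mode = "MSI-X" if all(n >= MSI_IRQ_BASE for n in numbers) else "INTx"
        result[ctrl] = {"irqs": sorted(numbers), "mode": mode}
    return result


if __name__ == "__main__":
    # Sample lines copied from the results quoted in this message.
    sample = "irq56: nvme0 6440 0\nirq106: nvme7 145056 3\n"
    for ctrl, info in sorted(classify_nvme_interrupts(sample).items()):
        print(ctrl, info["mode"], info["irqs"])
```

Fed the output of vmstat -ia, a controller reported as INTx here would be one that fell back to a single shared vector, which is the case the message identifies as the problem.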