Date:      Fri, 22 Feb 2019 07:55:16 -0700
From:      Alan Somers <asomers@freebsd.org>
To:        Rebecca Cran <rebecca@bluestop.org>
Cc:        Rajesh Kumar <rajfbsd@gmail.com>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: Any ideal way to run FIO benchmarking for NVMEe devices in FreeBSD
Message-ID:  <CAOtMX2h%2BvF-3DrySvHrHWZrSBA6nCQjaKb5vYJC=ebEfzELpEw@mail.gmail.com>
In-Reply-To: <e8f62043-3e1d-707b-a496-366e02ffdecf@bluestop.org>
References:  <CAAO%2BANM34aY4g%2BFjPdt8F2sNo5e6N2dZdTDKavEJwvRbNJz=Gw@mail.gmail.com> <e8f62043-3e1d-707b-a496-366e02ffdecf@bluestop.org>

On Fri, Feb 22, 2019 at 5:29 AM Rebecca Cran via freebsd-hackers
<freebsd-hackers@freebsd.org> wrote:
>
> On 2/22/19 1:51 AM, Rajesh Kumar wrote:
> >     1. Should we use "posixaio" as the ioengine (or) something else?
> >     2. Should we use single thread (or) multiple threads for test? If
> >     multiple threads, how can we decide on the optimal thread count?
> >     3. Should we use "raw device files" (Eg: nvme namespace file -
> >     /dev/nvme0ns1) without filesystem (or) use a mounted filesystem with a
> >     regular file (Eg: /mnt/nvme/test1). Looks like raw device files give better
> >     numbers.
> >     4. Should we use a shared file (or) one file per thread?
> >     5. I believe 1 job should be fine for benchmarking, (or) should we
> >     try multiple jobs?
>
>
> I just ran a quick test on a filesystem on my machine which has an M.2
> NVMe drive, and it seems posixaio performs pretty poorly compared to the
> sync ioengine: around 700 MB/s vs. 1100 MB/s!

When AIO is run against a file on a filesystem, it uses an internal
thread pool to process requests.  But if you run it against a bare
drive, the I/O is direct and should be faster than the sync ioengine.
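
One way to check this is to run the same read-only workload with both
ioengines against the same target and compare.  The Python sketch below
only builds the fio command lines and prints them; it does not run fio.
The job name, the 30 second runtime, and the helper names fio_command
and sweep are illustrative assumptions.  The device path, the iodepth of
32, and the job counts (1, 8, 32) are the ones mentioned in this thread.

```python
import shlex


def fio_command(filename, ioengine, numjobs=1, iodepth=1, bs="4k", runtime=30):
    """Build the argv list for one read-only fio run (nothing is executed)."""
    return [
        "fio",
        "--name=nvme-randread",      # hypothetical job name
        f"--filename={filename}",
        f"--ioengine={ioengine}",
        "--rw=randread",             # read-only, so a raw device is not overwritten
        f"--bs={bs}",
        f"--iodepth={iodepth}",      # only matters for asynchronous engines
        f"--numjobs={numjobs}",
        f"--runtime={runtime}",      # assumed duration in seconds
        "--time_based",
        "--group_reporting",         # one summary per run instead of one per job
    ]


def sweep(filename, job_counts=(1, 8, 32)):
    """Pair a sync run with a posixaio run for each job count."""
    cmds = []
    for n in job_counts:
        cmds.append(fio_command(filename, "sync", numjobs=n))
        cmds.append(fio_command(filename, "posixaio", numjobs=n, iodepth=32))
    return cmds


if __name__ == "__main__":
    # /dev/nvme0ns1 is the namespace path from the original question;
    # substitute a regular file path to repeat the filesystem test.
    for cmd in sweep("/dev/nvme0ns1"):
        print(shlex.join(cmd))
```

Running the printed commands once against the raw device and once
against a file on a mounted filesystem should show whether the thread
pool accounts for the difference.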

-Alan

>
> I _was_ going to suggest using posixaio and setting iodepth to something
> like 32, but since it performs badly I'd suggest playing around with the
> numjobs parameter and seeing where the best performance is achieved -
> whether that's latency or throughput.
>
>
> On my system with a 4 KB block size, a single thread achieves ~530 MB/s,
> 8 jobs/threads 1150 MB/s, and 32 jobs/threads 1840 MB/s.
>
> Bumping the block size from 4 KB to 16 KB makes the throughput more
> jumpy, but appears to average 2300 MB/s when used with 32 jobs.
>
>
> --
> Rebecca Cran
>


