Date: Thu, 10 Jul 2014 15:31:00 +0300
From: Alexander Motin <mav@FreeBSD.org>
To: Kashyap Desai <kashyap.desai@avagotech.com>
Cc: FreeBSD-scsi <freebsd-scsi@freebsd.org>
Subject: Re: SSDs peformance on head/freebsd-10 stable using FIO
Message-ID: <53BE8784.8060503@FreeBSD.org>
In-Reply-To: <8fbe38cdad1e66717a9de7fdf63812c2@mail.gmail.com>
References: <8fbe38cdad1e66717a9de7fdf63812c2@mail.gmail.com>

Hi, Kashyap.

On 10.07.2014 15:00, Kashyap Desai wrote:
> I am trying to collect IOPs and throughput using FIO on FreeBSD-10-stable
> as below post mentioned that CAM can reach upto 1,000,000 IOPS using
> Fine-Grained CAM locking.
>
> http://www.freebsd.org/news/status/report-2013-07-2013-09.html#GEOM-Direct-Dispatch-and-Fine-Grained-CAM-Locking
>
> I am using below FIO parameter.
>
> [global]
> ioengine=posixaio
> buffered=0
> rw=randread
> bs=4K
> iodepth=32
> numjobs=2
> direct=1
> runtime=60s
> thread
> group_reporting=1
> [job1]
> filename=/dev/da0
> [job2]
> filename=/dev/da1
> [job3]
> filename=/dev/da2
> [job4]
> filename=/dev/da3
> [job4]
> filename=/dev/da4
> ..
>
> I have 8 SSDs in my setup and all 8 SSDs are behind LSI’s 12Gp/s MegaRaid
> Controller as JBOD. I also found FIO can be used in Async mode after
> loading “aio” kernel module.
>
> Using single SSD, I am able to see 110K-130K IOPs. This IOPs counts are
> matching with what I see on Linux machine.
>
> Now, I am not able to scale IOPs on my machine after 200K. I see CPU is
> almost occupied and no idle time after IOPs reach to 200K.
>
> If you have any pointers to try with, I can do some experiment on my setup.

With results like these I would immediately start profiling with pmcstat.
Quite likely you are hitting some new lock congestion. Start with a simple
`pmcstat -n 100000000 -TS unhalted-cycles`. It is hard to say for sure what
went wrong there without more data, so just a couple of thoughts:

First of all, I've never tried aio in my benchmarks, only synchronous ones.
Try running 8 instances of `dd if=/dev/daX of=/dev/null bs=512` per SSD at
the same time, just as I did. You may vary the number of dd's, but keep the
total below 256, or you may need to increase the nswbuf limit in
kern_vfs_bio_buffer_alloc().

Second, you are using a single HBA, which should create significant
congestion around its CAM SIM lock.
The proper solution would be to add multi-queue support to the driver, and
we have discussed it with Scott Long for quite some time, but that requires
more work (I hope you may be interested in it ;) ). Or you could just install
3-4 HBAs. I reached my million IOPS with four 2008/2308 6Gbps HBAs and 16
SATA SSDs.

Please tell me what you get, so we can continue the investigation.

-- 
Alexander Motin
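
A minimal sketch of the parallel dd test described above, in case it helps to script it. The disk names da0 through da7, the `launch_all` helper, and the `DRY_RUN` switch are assumptions for illustration only; the 256 ceiling is the nswbuf-related limit mentioned in the message. By default the script only prints the commands it would start.

```sh
#!/bin/sh
# Sketch: start several sequential-read dd processes per disk in parallel.
# Disk names (da0..da7) and the launch_all helper are illustrative assumptions.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them
MAX_TOTAL=256             # the total number of dd's should stay below this

# Usage: launch_all PER_DISK DISK...
launch_all() {
    per_disk=$1
    shift
    total=$(( per_disk * $# ))
    if [ "$total" -ge "$MAX_TOTAL" ]; then
        echo "refusing: $total dd processes is not below $MAX_TOTAL" >&2
        return 1
    fi
    for disk in "$@"; do
        i=0
        while [ "$i" -lt "$per_disk" ]; do
            if [ "$DRY_RUN" = 1 ]; then
                echo "dd if=/dev/$disk of=/dev/null bs=512"
            else
                dd if="/dev/$disk" of=/dev/null bs=512 &
            fi
            i=$(( i + 1 ))
        done
    done
    wait   # returns immediately in dry-run mode (no background jobs)
}

# 8 readers on each of 8 disks: 64 dd processes in total.
launch_all 8 da0 da1 da2 da3 da4 da5 da6 da7
```

Running it with `DRY_RUN=0` starts the reads for real; each dd then reads its whole device, so watching the aggregate rate with a tool such as `iostat` or `gstat` while they run is probably the easiest way to compare against the fio numbers.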