Date: Thu, 27 Jun 2013 17:58:13 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Zoltan Arnold NAGY <zoltan.arnold.nagy@gmail.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS-backed NFS export with vSphere
Message-ID: <1508973822.292566.1372370293123.JavaMail.root@uoguelph.ca>
In-Reply-To: <CAGFYgbPoO8fqkZxCAgi-p24su=+LC4KPswntsNaRK-aCktmWuA@mail.gmail.com>

Zoltan Nagy wrote:
> Hi list,
>
> I'd love to have a ZFS-backed NFS export as my VM datastore, but as
> much as I'd like to tune it, the performance doesn't even get close
> to Solaris 11's.
>
> I currently have the system set up as this:
>
>   pool: tank
>  state: ONLINE
>   scan: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     0
>           mirror-0  ONLINE       0     0     0
>             da0     ONLINE       0     0     0
>             da1     ONLINE       0     0     0
>           mirror-1  ONLINE       0     0     0
>             da2     ONLINE       0     0     0
>             da3     ONLINE       0     0     0
>         logs
>           ada0p4    ONLINE       0     0     0
>         spares
>           da4       AVAIL
>
> ada0 is a samsung 840pro SSD, which I'm using for system+ZIL.
> daX are 1TB, 7200rpm seagate disks.
> (From this test's perspective it doesn't matter whether I use a
> separate ZIL device or just a partition - I get roughly the same
> numbers.)
>
> The first thing I noticed is that the FSINFO reply from FreeBSD is
> advertising untunable values (I did not find them documented either
> in the manpage or as a sysctl):
>
> rtmax, rtpref, wtmax, wtpref: 64k (fbsd), 1M (solaris)
> dtpref: 64k (fbsd), 8k (solaris)
>
> After manually patching the nfs code (changing NFS_MAXBSIZE to 1M
> instead of MAXBSIZE) to advertise the same read/write values (I
> didn't touch dtpref), my performance went up from 17MB/s to 76MB/s.
>
> Is there a reason NFS_MAXBSIZE is not tunable and/or is it so slow?
>
For exporting other file system types (UFS, ...) the buffer cache is
used, and MAXBSIZE is the largest block you can use for the buffer
cache. Some increase of MAXBSIZE would be nice. (I've tried 128KB
without observing difficulties and, from what I've been told, 128KB is
the default ZFS block size.)
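
For reference, the change described above boils down to redefining one
constant in the NFS code. A rough sketch only (this is not a supported
tunable, and the exact header and surrounding context are omitted):

/*
 * Stock kernels tie the largest transfer size the NFS server will
 * advertise in the FSINFO reply (rtmax/wtmax/rtpref/wtpref) to the
 * buffer cache block limit from <sys/param.h>:
 *
 *     #define NFS_MAXBSIZE   MAXBSIZE    (65536, hence the 64k above)
 *
 * The experiment described above replaces that with a 1MB constant:
 */
#define NFS_MAXBSIZE	(1024 * 1024)

/*
 * This is only reasonable for exports that do not go through the
 * buffer cache (ZFS caches in the ARC); UFS and friends still need
 * the transfer size to fit in a MAXBSIZE buffer, which is why simply
 * raising it is not safe across the board.
 */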

> Here's my iozone output (which is run on an ext4 partition created on
> a linux VM which has a disk backed by the NFS export from the FreeBSD
> box):
>
>       Record Size 4096 KB
>       File size set to 2097152 KB
>       Command line used: iozone -b results.xls -r 4m -s 2g -t 6 -i 0 -i 1 -i 2
>       Output is in Kbytes/sec
>       Time Resolution = 0.000001 seconds.
>       Processor cache size set to 1024 Kbytes.
>       Processor cache line size set to 32 bytes.
>       File stride size set to 17 * record size.
>       Throughput test with 6 processes
>       Each process writes a 2097152 Kbyte file in 4096 Kbyte records
>
>       Children see throughput for 6 initial writers  =   76820.31 KB/sec
>       Parent sees throughput for 6 initial writers   =   74899.44 KB/sec
>       Min throughput per process                     =   12298.62 KB/sec
>       Max throughput per process                     =   12972.72 KB/sec
>       Avg throughput per process                     =   12803.38 KB/sec
>       Min xfer                                       = 1990656.00 KB
>
>       Children see throughput for 6 rewriters        =   76030.99 KB/sec
>       Parent sees throughput for 6 rewriters         =   75062.91 KB/sec
>       Min throughput per process                     =   12620.45 KB/sec
>       Max throughput per process                     =   12762.80 KB/sec
>       Avg throughput per process                     =   12671.83 KB/sec
>       Min xfer                                       = 2076672.00 KB
>
>       Children see throughput for 6 readers          =  114221.39 KB/sec
>       Parent sees throughput for 6 readers           =  113942.71 KB/sec
>       Min throughput per process                     =   18920.14 KB/sec
>       Max throughput per process                     =   19183.80 KB/sec
>       Avg throughput per process                     =   19036.90 KB/sec
>       Min xfer                                       = 2068480.00 KB
>
>       Children see throughput for 6 re-readers       =  117018.50 KB/sec
>       Parent sees throughput for 6 re-readers        =  116917.01 KB/sec
>       Min throughput per process                     =   19436.28 KB/sec
>       Max throughput per process                     =   19590.40 KB/sec
>       Avg throughput per process                     =   19503.08 KB/sec
>       Min xfer                                       = 2080768.00 KB
>
>       Children see throughput for 6 random readers   =  110072.68 KB/sec
>       Parent sees throughput for 6 random readers    =  109698.99 KB/sec
>       Min throughput per process                     =   18260.33 KB/sec
>       Max throughput per process                     =   18442.55 KB/sec
>       Avg throughput per process                     =   18345.45 KB/sec
>       Min xfer                                       = 2076672.00 KB
>
>       Children see throughput for 6 random writers   =   76389.71 KB/sec
>       Parent sees throughput for 6 random writers    =   74816.45 KB/sec
>       Min throughput per process                     =   12592.09 KB/sec
>       Max throughput per process                     =   12843.75 KB/sec
>       Avg throughput per process                     =   12731.62 KB/sec
>       Min xfer                                       = 2056192.00 KB
>
> The other interesting thing is that you can notice the system doesn't
> cache the data file in RAM (the box has 32G), so even for re-reads I
> get miserable numbers. With Solaris, the re-reads happen at nearly
> wire speed.
>
> Any ideas what else I could tune? While 76MB/s is much better than
> the original 17MB/s I was seeing, it's still far from Solaris's
> ~220MB/s...
>
> Thanks a lot,
> Zoltan
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
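
One more thing worth checking for the re-read case: whether the data is
actually ending up in the ZFS ARC on the FreeBSD box. Below is a minimal
sketch that polls the ARC statistics while a test runs (the sysctl names
assume a stock FreeBSD kernel with ZFS loaded; the one-second interval is
arbitrary). If the ARC size never grows toward the working set during the
re-read pass, that would point at server-side caching rather than the NFS
path itself.

#include <sys/types.h>
#include <sys/sysctl.h>

#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Fetch a 64-bit sysctl value by name; returns -1 on failure. */
static int
get_u64(const char *name, uint64_t *valp)
{
	size_t len = sizeof(*valp);

	*valp = 0;
	return (sysctlbyname(name, valp, &len, NULL, 0));
}

int
main(void)
{
	uint64_t size, cmax;

	/* Print the current ARC size and its ceiling once a second. */
	for (;;) {
		if (get_u64("kstat.zfs.misc.arcstats.size", &size) == -1 ||
		    get_u64("kstat.zfs.misc.arcstats.c_max", &cmax) == -1) {
			perror("sysctlbyname");
			return (1);
		}
		printf("ARC size: %ju MB (max %ju MB)\n",
		    (uintmax_t)(size >> 20), (uintmax_t)(cmax >> 20));
		sleep(1);
	}
}

Run it on the server while the iozone re-read pass is going; "sysctl
kstat.zfs.misc.arcstats" shows the same counters by hand.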