Date: Fri, 28 Jun 2013 00:16:48 +0200
From: Zoltan Arnold NAGY <zoltan.arnold.nagy@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS-backed NFS export with vSphere
Message-ID: <CAGFYgbMgqPR=cJQ4-ik8K=rrKfPEE+W9m-LZVBS0LnYJXAEtgg@mail.gmail.com>
In-Reply-To: <1508973822.292566.1372370293123.JavaMail.root@uoguelph.ca>
References: <CAGFYgbPoO8fqkZxCAgi-p24su=+LC4KPswntsNaRK-aCktmWuA@mail.gmail.com>
 <1508973822.292566.1372370293123.JavaMail.root@uoguelph.ca>
Right. As I said, increasing it to 1M increased my throughput from 17MB/s
to 76MB/s. However, the SSD can handle far more random writes; any idea why
I don't see the ZIL go over this value? (vSphere always uses sync writes.)

Thanks,
Zoltan
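A quick way to check whether the slog device itself is the limit for these
sync writes is to watch per-vdev throughput while the test runs and to
confirm the dataset's sync and logbias settings. A minimal sketch: "tank"
matches the pool quoted below, and "tank/vmstore" is a hypothetical dataset
name standing in for whatever is actually exported.

    # Per-vdev throughput, refreshed every second; the "logs" row shows how
    # hard ada0p4 is actually being pushed during the iozone run.
    zpool iostat -v tank 1

    # Confirm the exported dataset really funnels sync writes through the ZIL
    # ("tank/vmstore" is a placeholder dataset name).
    zfs get sync,logbias,recordsize tank/vmstore

    # Per-disk busy percentage and latency on the FreeBSD side
    # (-a shows only devices that are actually active).
    gstat -a

If the log vdev sits mostly idle while throughput stays flat, the bottleneck
is more likely elsewhere (for example the per-RPC transfer size discussed
below) than the SSD itself.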
On Thu, Jun 27, 2013 at 11:58 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> Zoltan Nagy wrote:
> > Hi list,
> >
> > I'd love to have a ZFS-backed NFS export as my VM datastore, but as much
> > as I'd like to tune it, the performance doesn't even get close to
> > Solaris 11's.
> >
> > I currently have the system set up as this:
> >
> >   pool: tank
> >  state: ONLINE
> >   scan: none requested
> > config:
> >
> >         NAME        STATE     READ WRITE CKSUM
> >         tank        ONLINE       0     0     0
> >           mirror-0  ONLINE       0     0     0
> >             da0     ONLINE       0     0     0
> >             da1     ONLINE       0     0     0
> >           mirror-1  ONLINE       0     0     0
> >             da2     ONLINE       0     0     0
> >             da3     ONLINE       0     0     0
> >         logs
> >           ada0p4    ONLINE       0     0     0
> >         spares
> >           da4       AVAIL
> >
> > ada0 is a Samsung 840 Pro SSD, which I'm using for system+ZIL.
> > daX are 1TB, 7200rpm Seagate disks.
> > (From this test's perspective it doesn't matter whether I use a separate
> > ZIL device or just a partition - I get roughly the same numbers.)
> >
> > The first thing I noticed is that the FSINFO reply from FreeBSD is
> > advertising untunable values (I did not find them documented either in
> > the manpage or as a sysctl):
> >
> > rtmax, rtpref, wtmax, wtpref: 64k (fbsd), 1M (solaris)
> > dtpref: 64k (fbsd), 8k (solaris)
> >
> > After manually patching the nfs code (changing NFS_MAXBSIZE to 1M
> > instead of MAXBSIZE) to advertise the same read/write values (didn't
> > touch dtpref), my performance went up from 17MB/s to 76MB/s.
> >
> > Is there a reason NFS_MAXBSIZE is not tunable and/or is set so low?
> >
> For exporting other file system types (UFS, ...) the buffer cache is used,
> and MAXBSIZE is the largest block you can use for the buffer cache. Some
> increase of MAXBSIZE would be nice. (I've tried 128Kb without observing
> difficulties, and from what I've been told 128Kb is the ZFS block size.)
>
> > Here's my iozone output (which is run on an ext4 partition created on a
> > linux VM which has a disk backed by the NFS export from the FreeBSD box):
> >
> >     Record Size 4096 KB
> >     File size set to 2097152 KB
> >     Command line used: iozone -b results.xls -r 4m -s 2g -t 6 -i 0 -i 1 -i 2
> >     Output is in Kbytes/sec
> >     Time Resolution = 0.000001 seconds.
> >     Processor cache size set to 1024 Kbytes.
> >     Processor cache line size set to 32 bytes.
> >     File stride size set to 17 * record size.
> >     Throughput test with 6 processes
> >     Each process writes a 2097152 Kbyte file in 4096 Kbyte records
> >
> >     Children see throughput for 6 initial writers  =   76820.31 KB/sec
> >     Parent sees throughput for 6 initial writers   =   74899.44 KB/sec
> >     Min throughput per process                     =   12298.62 KB/sec
> >     Max throughput per process                     =   12972.72 KB/sec
> >     Avg throughput per process                     =   12803.38 KB/sec
> >     Min xfer                                       = 1990656.00 KB
> >
> >     Children see throughput for 6 rewriters        =   76030.99 KB/sec
> >     Parent sees throughput for 6 rewriters         =   75062.91 KB/sec
> >     Min throughput per process                     =   12620.45 KB/sec
> >     Max throughput per process                     =   12762.80 KB/sec
> >     Avg throughput per process                     =   12671.83 KB/sec
> >     Min xfer                                       = 2076672.00 KB
> >
> >     Children see throughput for 6 readers          =  114221.39 KB/sec
> >     Parent sees throughput for 6 readers           =  113942.71 KB/sec
> >     Min throughput per process                     =   18920.14 KB/sec
> >     Max throughput per process                     =   19183.80 KB/sec
> >     Avg throughput per process                     =   19036.90 KB/sec
> >     Min xfer                                       = 2068480.00 KB
> >
> >     Children see throughput for 6 re-readers       =  117018.50 KB/sec
> >     Parent sees throughput for 6 re-readers        =  116917.01 KB/sec
> >     Min throughput per process                     =   19436.28 KB/sec
> >     Max throughput per process                     =   19590.40 KB/sec
> >     Avg throughput per process                     =   19503.08 KB/sec
> >     Min xfer                                       = 2080768.00 KB
> >
> >     Children see throughput for 6 random readers   =  110072.68 KB/sec
> >     Parent sees throughput for 6 random readers    =  109698.99 KB/sec
> >     Min throughput per process                     =   18260.33 KB/sec
> >     Max throughput per process                     =   18442.55 KB/sec
> >     Avg throughput per process                     =   18345.45 KB/sec
> >     Min xfer                                       = 2076672.00 KB
> >
> >     Children see throughput for 6 random writers   =   76389.71 KB/sec
> >     Parent sees throughput for 6 random writers    =   74816.45 KB/sec
> >     Min throughput per process                     =   12592.09 KB/sec
> >     Max throughput per process                     =   12843.75 KB/sec
> >     Avg throughput per process                     =   12731.62 KB/sec
> >     Min xfer                                       = 2056192.00 KB
> >
> > The other interesting thing is that you can notice the system doesn't
> > cache the data file to RAM (the box has 32G), so even for re-reads I get
> > miserable numbers. With Solaris, the re-reads happen at nearly wire
> > speed.
> >
> > Any ideas what else I could tune? While 76MB/s is much better than the
> > original 17MB/s I was seeing, it's still far from Solaris's ~220MB/s...
> >
> > Thanks a lot,
> > Zoltan
> > _______________________________________________
> > freebsd-fs@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> >
>
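On the rtmax/wtmax question quoted above: since the NFS client in this setup
is the ESXi host rather than the Linux guest, the most direct way to confirm
what the (patched) server is advertising, and what transfer sizes actually
cross the wire, is a packet capture on the FreeBSD side. A minimal sketch;
the interface name "em0" is a placeholder:

    # Capture the NFS traffic on the server; "em0" stands in for the real NIC.
    tcpdump -i em0 -s 0 -w /tmp/nfs.pcap port 2049

    # Open the capture in Wireshark and inspect the FSINFO reply fields
    # (rtmax/rtpref/wtmax/wtpref) and the byte counts carried by WRITE calls.

If the WRITE calls still top out at 64k after the NFS_MAXBSIZE change, the
limit is being imposed by the client side rather than by the server.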
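On the re-read observation quoted above: whether the 2GB-per-process test
files are actually being served from the ARC can be checked with the stock
FreeBSD sysctl counters while the re-read pass runs. A minimal sketch; no
non-default tuning is assumed:

    # ARC size and hit/miss counters; sample before and after the re-read pass.
    sysctl kstat.zfs.misc.arcstats.size \
           kstat.zfs.misc.arcstats.hits \
           kstat.zfs.misc.arcstats.misses

    # The limits the ARC is allowed to grow within.
    sysctl vfs.zfs.arc_max vfs.zfs.arc_min

If arcstats.size stays far below the 32G of RAM and the miss counter keeps
climbing during re-reads, the server is not caching the working set, which
matches the re-read numbers reported above.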