Date:      Thu, 27 Jun 2013 15:13:53 +0200
From:      Zoltan Arnold NAGY <zoltan.arnold.nagy@gmail.com>
To:        freebsd-fs@freebsd.org
Subject:   ZFS-backed NFS export with vSphere
Message-ID:  <CAGFYgbPoO8fqkZxCAgi-p24su=+LC4KPswntsNaRK-aCktmWuA@mail.gmail.com>

Hi list,

I'd love to have a ZFS-backed NFS export as my VM datastore, but no matter
how much I tune it, the performance doesn't even get close to Solaris 11's.

I currently have the system set up like this:

  pool: tank
 state: ONLINE
  scan: none requested
config:

    NAME        STATE     READ WRITE CKSUM
    tank        ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        da0     ONLINE       0     0     0
        da1     ONLINE       0     0     0
      mirror-1  ONLINE       0     0     0
        da2     ONLINE       0     0     0
        da3     ONLINE       0     0     0
    logs
      ada0p4    ONLINE       0     0     0
    spares
      da4       AVAIL

ada0 is a Samsung 840 Pro SSD, which I'm using for the system plus ZIL.
The daX devices are 1TB, 7200rpm Seagate disks.
(From this test's perspective, it doesn't matter whether I use a separate
ZIL device or just a partition - I get roughly the same numbers.)

The first thing I noticed is that the FSINFO reply from FreeBSD advertises
values that cannot be tuned (I did not find them documented in the manpages
or exposed as a sysctl):

rtmax, rtpref, wtmax, wtpref: 64k (FreeBSD), 1M (Solaris)
dtpref: 64k (FreeBSD), 8k (Solaris)
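
(In case anyone wants to compare their own server's numbers: I just
captured the mount-time traffic and looked at the FSINFO reply in
wireshark. A rough sketch, em0 being my interface:)

    # capture NFS traffic while the datastore is (re)mounted, then open the
    # pcap in wireshark and filter on "nfs.procedure_v3 == 19" (FSINFO)
    tcpdump -i em0 -s 0 -w /tmp/nfs-mount.pcap port 2049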

After manually patching the NFS code (changing NFS_MAXBSIZE to 1M instead
of MAXBSIZE) to advertise the same read/write values (I didn't touch
dtpref), my performance went up from 17MB/s to 76MB/s.
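
The change itself is a one-liner plus a kernel rebuild; in my tree the
definition lives in sys/fs/nfs/nfsport.h, though the exact location may
differ between releases:

    # sys/fs/nfs/nfsport.h:
    #   -#define NFS_MAXBSIZE   MAXBSIZE        /* MAXBSIZE is 64k in sys/param.h */
    #   +#define NFS_MAXBSIZE   (1024 * 1024)   /* advertise 1M, like Solaris */
    # then the usual rebuild (KERNCONF being whatever config you use):
    cd /usr/src
    make buildkernel KERNCONF=GENERIC && make installkernel KERNCONF=GENERIC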

Is there a reason NFS_MAXBSIZE is not tunable, and/or why is the default so
small?

Here's my iozone output (run on an ext4 partition inside a Linux VM whose
disk lives on the NFS datastore exported from the FreeBSD box):

    Record Size 4096 KB
    File size set to 2097152 KB
    Command line used: iozone -b results.xls -r 4m -s 2g -t 6 -i 0 -i 1 -i 2
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
    Throughput test with 6 processes
    Each process writes a 2097152 Kbyte file in 4096 Kbyte records

    Children see throughput for  6 initial writers     =   76820.31 KB/sec
    Parent sees throughput for  6 initial writers     =   74899.44 KB/sec
    Min throughput per process             =   12298.62 KB/sec
    Max throughput per process             =   12972.72 KB/sec
    Avg throughput per process             =   12803.38 KB/sec
    Min xfer                     = 1990656.00 KB

    Children see throughput for  6 rewriters     =   76030.99 KB/sec
    Parent sees throughput for  6 rewriters     =   75062.91 KB/sec
    Min throughput per process             =   12620.45 KB/sec
    Max throughput per process             =   12762.80 KB/sec
    Avg throughput per process             =   12671.83 KB/sec
    Min xfer                     = 2076672.00 KB

    Children see throughput for  6 readers         =  114221.39 KB/sec
    Parent sees throughput for  6 readers         =  113942.71 KB/sec
    Min throughput per process             =   18920.14 KB/sec
    Max throughput per process             =   19183.80 KB/sec
    Avg throughput per process             =   19036.90 KB/sec
    Min xfer                     = 2068480.00 KB

    Children see throughput for 6 re-readers     =  117018.50 KB/sec
    Parent sees throughput for 6 re-readers     =  116917.01 KB/sec
    Min throughput per process             =   19436.28 KB/sec
    Max throughput per process             =   19590.40 KB/sec
    Avg throughput per process             =   19503.08 KB/sec
    Min xfer                     = 2080768.00 KB

    Children see throughput for 6 random readers     =  110072.68 KB/sec
    Parent sees throughput for 6 random readers     =  109698.99 KB/sec
    Min throughput per process             =   18260.33 KB/sec
    Max throughput per process             =   18442.55 KB/sec
    Avg throughput per process             =   18345.45 KB/sec
    Min xfer                     = 2076672.00 KB

    Children see throughput for 6 random writers     =   76389.71 KB/sec
    Parent sees throughput for 6 random writers     =   74816.45 KB/sec
    Min throughput per process             =   12592.09 KB/sec
    Max throughput per process             =   12843.75 KB/sec
    Avg throughput per process             =   12731.62 KB/sec
    Min xfer                     = 2056192.00 KB

The other interesting thing is that the system doesn't cache the data file
in RAM (the box has 32G), so even for re-reads I get miserable numbers.
With Solaris, the re-reads happen at nearly wire speed.
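
(To see that, watch the ARC while the re-read pass runs - if the file were
being cached, arcstats.size would grow toward arc_max and hits would come
to dominate misses:)

    sysctl vfs.zfs.arc_max
    sysctl kstat.zfs.misc.arcstats.size
    sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses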

Any ideas what else I could tune? While 76MB/s is much better than the
original 17MB/s I was seeing, it's still far from Solaris's ~220MB/s...

Thanks a lot,
Zoltan


