Date:      Wed, 22 Oct 2014 22:50:20 -0400
From:      Garrett Wollman <wollman@bimajority.org>
To:        freebsd-stable@freebsd.org, freebsd-fs@freebsd.org
Subject:   Some 9.3 NFS testing
Message-ID:  <21576.27884.76574.977691@hergotha.csail.mit.edu>

Just thought I'd share this...

I've been doing some acceptance testing on 9.3 prior to upgrading my
production NFS servers.  My most recent test runs bonnie++ on 192
Ubuntu VMs in parallel, each against an independent directory in the
same server filesystem.  It hasn't fallen over yet (it will probably
take another day or so to complete), and it peaked at about 220k
ops/s (but this was NFSv4, so there's no file-handle affinity (FHA)
and every compound takes at least two ops for each v3 RPC[1]).
bonnie++ is running with -D (O_DIRECT), but I'm really just using it
as a load generator -- I don't care about its output.
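
(For anyone who wants to set up something similar, each VM is running
roughly the following -- a sketch only; the mount point, sizes, and
user are illustrative, not my exact invocation:

  #!/bin/sh
  # Each VM beats on its own directory under the shared NFS mount.
  # DIR, the -s bulk-file size (MB), and the -n small-file count
  # (units of 1024 files) are assumptions, not my exact parameters.
  DIR=/mnt/nfstest/$(hostname)
  mkdir -p "$DIR"
  # -D asks bonnie++ to use O_DIRECT for the bulk I/O phases.
  bonnie++ -D -d "$DIR" -s 8192 -n 128 -u nobody

The "small file" phase mentioned below comes from that -n option.)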

I have this system configured for a maximum of 64 nfsd threads, and
the test load has kept them pegged for the past eight hours.  Right
now all of the load generators are doing the "small file" part of
bonnie++, so there's not a lot of data moving but there are a lot of
synchronous metadata operations; it's been doing 60k ops/s for the
past five hours.  The load average maxed out at about 24 early on in
the test, and has settled around 16-20 for this part of it.
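
(For reference, capping the server at 64 threads looks something like
this on 9.x -- a sketch assuming the stock rc.conf knobs:

  # /etc/rc.conf -- serve both UDP and TCP with 64 nfsd threads
  nfs_server_enable="YES"
  nfs_server_flags="-u -t -n 64"
  nfsv4_server_enable="YES"   # this round of testing is NFSv4
  nfsuserd_enable="YES"       # v4 owner/group name-mapping daemon

The -n flag to nfsd(8) is what sets the thread count.)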

Here's what nfsstat -se has to say (note: the counters were not reset
for this round of testing):

Server Info:
  Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
1566655064 230074779 162549702         0 471311053 1466525587 149235773 115496945
   Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
      125         0         0       245       116   2032193  27485368 223929240
    Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
        0        53       268       131  15999631         0       386       386
     Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
 80924092         0         0       194         0         0  81110394         0
    LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
        0         0  80578106         0         0 1203868156         0       193
    Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
     1271         0        14       384         0       570
Server:
Retfailed    Faults   Clients
        0         0       191
OpenOwner     Opens LockOwner     Locks    Delegs 
      192       154         0         0         0 
Server Cache Stats:
   Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
        0         0         0 -167156883      1651    115531
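
(If you want clean per-run numbers, the server counters can be zeroed
between rounds -- assuming the -z flag, which I believe nfsstat has
here:

  # show extended server stats, then reset the counters (needs root)
  nfsstat -sez

I just didn't bother for this run, hence the cumulative figures and
the wrapped Misses counter above.)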

I'd love to mix in some FreeBSD-generated loads, but, as discussed a
week or so ago, our NFS client can't handle reading directories from
which files are being deleted.

FWIW, I just ran a quick "pmcstat -T" and noted the following:

PMC: [unhalted-core-cycles] Samples: 775371 (100.0%) , 3264 unresolved
Key: q => exiting...
%SAMP IMAGE      FUNCTION             CALLERS
 24.0 kernel     _mtx_lock_sleep      _vm_map_lock:22.4 ...
  4.7 kernel     Xinvlrng
  4.7 kernel     _mtx_lock_spin       pmclog_reserve
  4.2 kernel     _sx_xlock_hard       _sx_xlock
  3.8 pmcstat    _init
  2.5 kernel     bcopy                vdev_queue_io_done
  1.7 kernel     _sx_xlock
  1.6 zfs.ko     lzjb_compress        zio_compress_data
  1.4 zfs.ko     lzjb_decompress      zio_decompress
  1.2 kernel     _sx_xunlock
  1.2 kernel     ipfw_chk             ipfw_check_hook
  1.1 libc.so.7  bsearch
  1.0 zfs.ko     fletcher_4_native    zio_checksum_compute
  1.0 kernel     vm_page_splay        vm_page_find_least
  1.0 kernel     cpu_idle_mwait       sched_idletd
  1.0 kernel     free
  0.9 kernel     bzero
  0.9 kernel     cpu_search_lowest    cpu_search_lowest
  0.8 kernel     vm_map_entry_splay   vm_map_lookup_entry
  0.8 kernel     cpu_search_highest   cpu_search_highest

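(That snapshot came from something like the following -- a sketch;
the event name is the hwpmc(4) alias for unhalted core cycles:

  # load the PMC driver if it isn't compiled in, then sample
  # unhalted core cycles system-wide in a top(1)-style display
  kldload hwpmc 2>/dev/null
  pmcstat -T -S unhalted-core-cycles

and then hitting "q" to exit, as the Key line above shows.)
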
I doubt that this is news to anybody.  Once I get the production
servers upgraded to 9.3, I'll be ready to start testing 10.1 on this
same setup.

-GAWollman

[1] I did my earlier testing, with smaller numbers of clients, using
v3, since that is what we currently require our clients to use.  I
switched to v4 to exercise the worst case -- after finding an
OpenStack bug that was preventing me from starting more than 16 load
generators at a time.


