Date:      Sun, 12 May 2013 17:58:58 -0700
From:      Jeremy Chadwick <jdc@koitsu.org>
To:        "Marc G. Fournier" <scrappy@hub.org>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: NFS Performance issue against NetApp
Message-ID:  <20130513005858.GA73875@icarus.home.lan>
In-Reply-To: <5190335D.9090105@hub.org>
References:  <1966772823.291493.1368362883964.JavaMail.root@erie.cs.uoguelph.ca> <5190335D.9090105@hub.org>

On Sun, May 12, 2013 at 05:27:09PM -0700, Marc G. Fournier wrote:
> On 2013-05-12 5:48 AM, Rick Macklem wrote:
> >Marc G. Fournier wrote:
> >>
> >>>With
> >>>
> >>>vfs.nfs.noconsist=3 ... 385595ms
> >>>
> >>>nfsstat -z before startup, nfsstat -c after:
> >>>
> >>>Client Info:
> >>>Rpc Counts:
> >>>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
> >>>    332594         5     17238         0    224426    231137      3743         1
> >>>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
> >>>         0         0         0       307         0        71         0      8447
> >>>     Mknod    Fsstat    Fsinfo  PathConf    Commit
> >>>         0       509         0         0         0
> >>>Rpc Info:
> >>>  TimedOut   Invalid X Replies   Retries  Requests
> >>>         0         0         0         0    818479
> >>>Cache Info:
> >>> Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
> >>>    608296    332596    526200     17245    -95425    224426     13178    231137
> >>> BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
> >>>         0         0      1050        55       502         7    543340      8448
> 
> With patch applied:
> 
> Client Info:
> Rpc Counts:
>    Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>     236577         5     17311         0    233269    231136      3743         1
>     Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>          0         0         0       307         0       391         0      8488
>      Mknod    Fsstat    Fsinfo  PathConf    Commit
>          0       543         1         0         0
> Rpc Info:
>   TimedOut   Invalid X Replies   Retries  Requests
>          0         0         0         0    731770
> Cache Info:
>  Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
>     714778    236578    529160     17312   -104087    233068     13178    231136
>  BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
>          0         0       788       375       542         0    546435      8488
> 
> RPC Info Requests appear to be down but # of read/writes/getattr
> haven't changed any ...
> 
> Why does it take 34x as many reads on FreeBSD, where rsize on both
> Linux/FreeBSD are the same ... ?   The amount of data to be read  is
> the same ... shouldn't the # of reads be within the same ballpark,
> at least ... ?
> 
> 
> >Ok, so disabling the mtime based cache consistency doesn't make
> >much difference. Forget about that one.
> >
> >I've attached another patch (which you probably shouldn't use for
> >a production system either) to be tried instead of the last one.
> >(This one is basically "work in progress" by Alexander Kabaev for
> >  better performance during file linking. I hope he doesn't mind
> >  me posting it.)
> >
> >rick
> >
> >>>============
> >>>
> >>>vfs.nfs.noconsist=2 ... 392201ms
> >>>
> >>>Client Info:
> >>>Rpc Counts:
> >>>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
> >>>    332557         5     17228         0    224421    231131      3743         1
> >>>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
> >>>         0         0         0       307         0        72         0      8430
> >>>     Mknod    Fsstat    Fsinfo  PathConf    Commit
> >>>         0       502         0         0         0
> >>>Rpc Info:
> >>>  TimedOut   Invalid X Replies   Retries  Requests
> >>>         0         0         0         0    818395
> >>>Cache Info:
> >>> Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
> >>>    607834    332557    525801     17231    -95401    224421     13178    231131
> >>> BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
> >>>         0         0      1028        56       502         0    542925      8431
> >>>
> >>>
> >>>============
> >>>vfs.nfs.noconsist=0 ... 391622ms
> >>>
> >>>
> >>>Client Info:
> >>>Rpc Counts:
> >>>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
> >>>    236122         5     17221         0    230575    230823      3743         1
> >>>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
> >>>         0         0         0       307         0        71         0      8425
> >>>     Mknod    Fsstat    Fsinfo  PathConf    Commit
> >>>         0       516         0         0         0
> >>>Rpc Info:
> >>>  TimedOut   Invalid X Replies   Retries  Requests
> >>>         0         0         0         0    727799
> >>>Cache Info:
> >>> Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
> >>>    711860    236124    526549     17225   -101525    230490     13178    230823
> >>> BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
> >>>         0         0      1057        55       516         0    543709      8425
> >>>
> >>>
> >>>I checked a second time with noconsist=0, and the nfsstat -c values
> >>>seem to come out pretty much the same ...
> >>>
> >>>I'm going to head down to the office and try again with Solaris (I'd
> >>>have to re-install, since I used that system for the Solaris), and see
> >>>what nfsstat -c results I get out of that ... will post a followup on
> >>>this when completed ...
> >>>
> >>>
> >>>
> >>>On 2013-05-10 5:32 PM, Rick Macklem wrote:
> >>>>Marc G. Fournier wrote:
> >>>>>FYI … I just installed Solaris 11 onto the same hardware and ran the
> >>>>>same test … so far, I'm seeing:
> >>>>>
> >>>>>Linux @ ~30s
> >>>>>Solaris @ ~44s
> >>>>>
> >>>>>OpenBSD @ ~200s
> >>>>>FreeBSD @ ~240s
> >>>>>
> >>>>>I've even tried FreeBSD 8.3 just to see if maybe it's a 'newish' issue
> >>>>>… same as 9.x … I could see Linux 'cutting corners', but
> >>>>>Oracle/Solaris too … ?
> >>>>>
> >>>>The three client implementations (BSD, Linux, Solaris) were developed
> >>>>independently and, as such, will all implement somewhat different
> >>>>caching algorithms (the RFCs specify what goes on the wire, but say
> >>>>little w.r.t. client side caching).
> >>>>
> >>>>I have attached a patch that might be useful for determining if
> >>>>the client side buffer cache consistency algorithm in FreeBSD is
> >>>>causing the slow startup of jboss. Do not run this patch on a
> >>>>production system, since it pretty well disables all buffer cache
> >>>>coherency (i.e. if another client modifies a file, the patched client
> >>>>won't notice and will continue to cache stale file data).
> >>>>
> >>>>If the patch does speed up startup of jboss significantly, you can
> >>>>use the sysctl:
> >>>>   vfs.nfs.noconsist
> >>>>to check for which coherency check is involved by decreasing the
> >>>>value for the sysctl by 1 and then trying a startup again. (When
> >>>>vfs.nfs.noconsist=0, normal cache coherency will be applied.)
> >>>>
> >>>>I have no idea if buffer cache coherency is a factor, but trying
> >>>>the attached patch might determine if it is.
> >>>>
> >>>>Note that you have never posted updated "nfsstat -c" values.
> >>>>(Remember that what you posted indicated 88 RPCs, which seemed
> >>>>   bogus.) Finding out if FreeBSD does a lot more of certain RPCs
> >>>>than Linux/Solaris might help isolate what is going on.
> >>>>
> >>>>rick
> >>>>
> >>>>>On 2013-05-03, at 04:50 , Mark Felder <feld@feld.me> wrote:
> >>>>>
> >>>>>>On Thu, 02 May 2013 18:43:17 -0500, Marc G. Fournier
> >>>>>><scrappy@hub.org> wrote:
> >>>>>>
> >>>>>>>Hadn't thought to do so with Linux, but …
> >>>>>>>Linux ……. 20732ms, 20117ms, 20935ms, 20130ms, 20560ms
> >>>>>>>FreeBSD .. 28996ms, 24794ms, 24702ms, 23311ms, 24153ms
> >>>>>>Please make sure both platforms are using similar atime settings. I
> >>>>>>think most distros use ext4 with diratime by default. I'd just do
> >>>>>>noatime on both platforms to be safe.

Quoting/top-reply/Email-clients-suck-balls madness has made this thread
basically impossible to follow visually at this point, so I'm replying
at the bottom quoting what was said above:

> Why does it take 34x as many reads on FreeBSD, where rsize on both
> Linux/FreeBSD are the same ... ?   The amount of data to be read  is
> the same ... shouldn't the # of reads be within the same ballpark,
> at least ... ?

Can you provide actual proof of this, i.e. that at the syscall level
JBoss is issuing the same nbytes count to read(2) on BSD as it is on
Linux and Solaris?  You can check this with strace on Linux and
ktrace (not truss) on FreeBSD.
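
One way to make that comparison concrete is to tally the nbytes argument
of every read(2) call found in the trace output from each OS.  What
follows is a minimal Python sketch, not a tested tool.  The function name
and both regular expressions are hypothetical, and the line formats they
assume (strace printing `read(fd, "..."..., nbytes) = ret`, kdump printing
`CALL  read(0xfd,0xbuf,0xnbytes)`) should be checked against what your
strace and kdump versions actually print.

```python
import re
from collections import Counter

# Assumed strace format, e.g.:   read(3, "\177ELF"..., 8192) = 832
# The requested nbytes is the last argument before the closing paren.
STRACE_READ = re.compile(r'\bread\(\d+,.*,\s*(\d+)\)\s*=\s*-?\d+')

# Assumed kdump format, e.g.:   1234 java  CALL  read(0x3,0x7fffffffd000,0x2000)
KDUMP_READ = re.compile(r'\bCALL\s+read\([^,]+,[^,]+,(0x[0-9a-fA-F]+|\d+)\)')


def read_size_histogram(lines):
    """Count how often each requested read(2) size appears in trace lines."""
    sizes = Counter()
    for line in lines:
        m = STRACE_READ.search(line)
        if m:
            sizes[int(m.group(1))] += 1
            continue
        m = KDUMP_READ.search(line)
        if m:
            # base 0 accepts both "0x2000" and plain decimal
            sizes[int(m.group(1), 0)] += 1
    return sizes


if __name__ == "__main__":
    import sys
    for size, count in sorted(read_size_histogram(sys.stdin).items()):
        print(f"{size:>10} bytes  x {count}")
```

Feed it an strace log on Linux (something like
`strace -f -e trace=read -o trace.txt <your start command>`) and the kdump
output from a ktrace run on FreeBSD.  It only matches plain read(2), so
pread(2)/readv(2) calls would need their own patterns.  If one platform's
histogram is dominated by tiny sizes and the other's is not, that would
point at the application rather than the NFS client.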

For all we know, JBoss could have some "OS optimisation" setting within
it that says for BSDs use 512 bytes, while for Linux and/or Solaris use
16384 bytes.  Nothing mandates that a framework/language/whatever use
the same size across all OSes.  In fact, there was already a statement
made earlier in this thread about something doing file I/O with a byte
size of ***1 byte*** for reads and writes:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-May/017166.html

Welcome to why "abstracted" languages with a crap-load of "middleman"
layers are very, very hard to debug/troubleshoot, particularly with
regards to performance.  The KISS principle applies: the less crap between
the application and the syscall, the easier it is to figure out.

My advice at this point: take JBoss out of the picture.  Use some other
file-based I/O test, like benchmarks/bonnie++ or benchmarks/fio, against
a file that's backed by an NFS mount.  Do not ask me if these are good
utilities to test such behaviour, or what arguments to use.  If it turns
out JBoss does not behave/play nicely on the BSDs, I won't be surprised.
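
To illustrate what "taking JBoss out of the picture" can look like at its
simplest, here is a small Python sketch that reads one file sequentially
with a fixed request size and reports bytes read, number of read(2) calls
and elapsed time.  It is not a substitute for bonnie++ or fio; the
function name `timed_read` and the idea of sweeping the block size are
illustrative assumptions, not something from this thread.

```python
import os
import time


def timed_read(path, block_size):
    """Read `path` sequentially, asking for `block_size` bytes per read(2).

    Returns (total_bytes, read_calls, elapsed_seconds).  read_calls
    includes the final call that returns 0 bytes at end of file.
    """
    total = 0
    calls = 0
    start = time.monotonic()
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, block_size)
            calls += 1
            if not chunk:
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total, calls, time.monotonic() - start


if __name__ == "__main__":
    import sys
    # usage: script.py /path/on/nfs/mount/file
    for bs in (1, 512, 16384):
        total, calls, secs = timed_read(sys.argv[1], bs)
        print(f"bs={bs:>6}  bytes={total}  reads={calls}  {secs:.3f}s")
```

Run it against the same file on the same NFS mount from each client OS,
with "nfsstat -z" before and "nfsstat -c" after, as was done earlier in
the thread.  The call count here is syscalls, not RPCs; how many Read RPCs
each client turns them into is exactly the thing being compared, and
client-side caching from a previous run will skew a repeat measurement.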

Probably off-topic but worth pointing out: I do not know about Solaris,
but Linux has multiple layers of caching, and is well-known for doing
things like caching (and aggregating!) reads/writes to **block** devices.
This is why on Linux, to avoid that caching, your application has to use
O_DIRECT with open(2) or other mechanisms.  The BSDs do not do this; block
devices are always non-cached.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


