From owner-freebsd-fs@FreeBSD.ORG Mon May 13 00:59:01 2013
Date: Sun, 12 May 2013 17:58:58 -0700
From: Jeremy Chadwick
To: "Marc G. Fournier"
Cc: freebsd-fs@freebsd.org
Subject: Re: NFS Performance issue against NetApp
Message-ID: <20130513005858.GA73875@icarus.home.lan>
References: <1966772823.291493.1368362883964.JavaMail.root@erie.cs.uoguelph.ca> <5190335D.9090105@hub.org>
In-Reply-To: <5190335D.9090105@hub.org>

On Sun, May 12, 2013 at 05:27:09PM -0700, Marc G. Fournier wrote:
> On 2013-05-12 5:48 AM, Rick Macklem wrote:
> >Marc G. Fournier wrote:
> >>
> >>>With
> >>>
> >>>vfs.nfs.noconsist=3 ... 385595ms
> >>>
> >>>nfsstat -z before startup, nfsstat -c after:
> >>>
> >>>Client Info:
> >>>Rpc Counts:
> >>>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
> >>>    332594         5     17238         0    224426    231137      3743         1
> >>>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
> >>>         0         0         0       307         0        71         0      8447
> >>>     Mknod    Fsstat    Fsinfo  PathConf    Commit
> >>>         0       509         0         0         0
> >>>Rpc Info:
> >>>  TimedOut   Invalid X Replies   Retries  Requests
> >>>         0         0         0         0    818479
> >>>Cache Info:
> >>> Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
> >>>    608296    332596    526200     17245    -95425    224426     13178    231137
> >>> BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
> >>>         0         0      1050        55       502         7    543340      8448
>
> With patch applied:
>
> Client Info:
> Rpc Counts:
>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>    236577         5     17311         0    233269    231136      3743         1
>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>         0         0         0       307         0       391         0      8488
>     Mknod    Fsstat    Fsinfo  PathConf    Commit
>         0       543         1         0         0
> Rpc Info:
>  TimedOut   Invalid X Replies   Retries  Requests
>         0         0         0         0    731770
> Cache Info:
> Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
>    714778    236578    529160     17312   -104087    233068     13178    231136
> BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
>         0         0       788       375       542         0    546435      8488
>
> RPC Info Requests appear to be down but # of read/writes/getattr
> haven't changed any ...
>
> Why does it take 34x as many reads on FreeBSD, where rsize on both
> Linux/FreeBSD are the same ... ? The amount of data to be read is
> the same ... shouldn't the # of reads be within the same ballpark,
> at least ... ?
>
>
> >Ok, so disabling the mtime based cache consistency doesn't make
> >much difference. Forget about that one.
> >
> >I've attached another patch (which you probably shouldn't use for
> >a production system either) to be tried instead of the last one.
> >(This one is basically "work in progress" by Alexander Kabaev for
> > better performance during file linking. I hope he doesn't mind
> > me posting it.)
> >
> >rick
> >
> >>>============
> >>>
> >>>vfs.nfs.noconsist=2 ... 392201ms
> >>>
> >>>Client Info:
> >>>Rpc Counts:
> >>>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
> >>>    332557         5     17228         0    224421    231131      3743         1
> >>>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
> >>>         0         0         0       307         0        72         0      8430
> >>>     Mknod    Fsstat    Fsinfo  PathConf    Commit
> >>>         0       502         0         0         0
> >>>Rpc Info:
> >>>  TimedOut   Invalid X Replies   Retries  Requests
> >>>         0         0         0         0    818395
> >>>Cache Info:
> >>> Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
> >>>    607834    332557    525801     17231    -95401    224421     13178    231131
> >>> BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
> >>>         0         0      1028        56       502         0    542925      8431
> >>>
> >>>
> >>>============
> >>>vfs.nfs.noconsist=0 ... 391622ms
> >>>
> >>>
> >>>Client Info:
> >>>Rpc Counts:
> >>>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
> >>>    236122         5     17221         0    230575    230823      3743         1
> >>>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
> >>>         0         0         0       307         0        71         0      8425
> >>>     Mknod    Fsstat    Fsinfo  PathConf    Commit
> >>>         0       516         0         0         0
> >>>Rpc Info:
> >>>  TimedOut   Invalid X Replies   Retries  Requests
> >>>         0         0         0         0    727799
> >>>Cache Info:
> >>> Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
> >>>    711860    236124    526549     17225   -101525    230490     13178    230823
> >>> BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
> >>>         0         0      1057        55       516         0    543709      8425
> >>>
> >>>
> >>>I checked a second time with noconsist=0, and the nfsstat -c values
> >>>seem to come out pretty much the same ...
> >>>
> >>>I'm going to head down to the office and try again with Solaris (I'd
> >>>have to re-install, since I used that system for the Solaris), and see
> >>>what nfsstat -c results I get out of that ... will post a followup on
> >>>this when completed ...
> >>>
> >>>
> >>>
> >>>On 2013-05-10 5:32 PM, Rick Macklem wrote:
> >>>>Marc G. Fournier wrote:
> >>>>>FYI … I just installed Solaris 11 onto the same hardware and ran the
> >>>>>same test … so far, I'm seeing:
> >>>>>
> >>>>>Linux @ ~30s
> >>>>>Solaris @ ~44s
> >>>>>
> >>>>>OpenBSD @ ~200s
> >>>>>FreeBSD @ ~240s
> >>>>>
> >>>>>I've even tried FreeBSD 8.3 just to see if maybe it's a 'newish' issue
> >>>>>… same as 9.x … I could see Linux 'cutting corners', but
> >>>>>Oracle/Solaris too … ?
> >>>>>
> >>>>The three client implementations (BSD, Linux, Solaris) were developed
> >>>>independently and, as such, will all implement somewhat different
> >>>>caching algorithms (the RFCs specify what goes on the wire, but say
> >>>>little w.r.t. client side caching).
> >>>>
> >>>>I have attached a patch that might be useful for determining if
> >>>>the client side buffer cache consistency algorithm in FreeBSD is
> >>>>causing the slow startup of jboss. Do not run this patch on a
> >>>>production system, since it pretty well disables all buffer cache
> >>>>coherency (ie. if another client modifies a file, the patched client
> >>>>won't notice and will continue to cache stale file data).
> >>>>
> >>>>If the patch does speed up startup of jboss significantly, you can
> >>>>use the sysctl:
> >>>> vfs.nfs.noconsist
> >>>>to check for which coherency check is involved by decreasing the
> >>>>value for the sysctl by 1 and then trying a startup again. (When
> >>>>vfs.nfs.noconsist=0, normal cache coherency will be applied.)
> >>>>
> >>>>I have no idea if buffer cache coherency is a factor, but trying
> >>>>the attached patch might determine if it is.
> >>>>
> >>>>Note that you have never posted updated "nfsstat -c" values.
> >>>>(Remember that what you posted indicated 88 RPCs, which seemed
> >>>> bogus.) Finding out if FreeBSD does a lot more of certain RPCs
> >>>>than Linux/Solaris might help isolate what is going on.
> >>>>
> >>>>rick
> >>>>
> >>>>>On 2013-05-03, at 04:50 , Mark Felder wrote:
> >>>>>
> >>>>>>On Thu, 02 May 2013 18:43:17 -0500, Marc G. Fournier wrote:
> >>>>>>
> >>>>>>>Hadn't thought to do so with Linux, but …
> >>>>>>>Linux ……. 20732ms, 20117ms, 20935ms, 20130ms, 20560ms
> >>>>>>>FreeBSD .. 28996ms, 24794ms, 24702ms, 23311ms, 24153ms
> >>>>>>Please make sure both platforms are using similar atime settings. I
> >>>>>>think most distros use ext4 with diratime by default. I'd just do
> >>>>>>noatime on both platforms to be safe.

Quoting/top-reply/Email-clients-suck-balls madness has made this thread
basically impossible to follow visually at this point, so I'm replying
at the bottom, quoting what was said above:

> Why does it take 34x as many reads on FreeBSD, where rsize on both
> Linux/FreeBSD are the same ... ? The amount of data to be read is
> the same ... shouldn't the # of reads be within the same ballpark,
> at least ... ?

Can you provide actual proof of this, i.e. that at the syscall level
JBoss is passing the same nbytes count to read(2) on FreeBSD as it is
on Linux and Solaris? You can accomplish this with strace on Linux and
ktrace (not truss) on FreeBSD.

For all we know, JBoss could have some "OS optimisation" setting within
it that says for BSDs use 512 bytes, while for Linux and/or Solaris use
16384 bytes. Nothing mandates that a framework/language/whatever use
the same size across all OSes.
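
To illustrate why the per-call size matters, here is a minimal Python
sketch. It is not taken from JBoss or from anything posted in this
thread: the helper names count_reads and make_test_file are made up for
the example, the 1 MiB file size is arbitrary, and 512 / 16384 are just
the hypothetical sizes mentioned above. It only counts read calls at the
syscall level on a local temporary file; how those calls map onto NFS
Read RPCs depends on the client's caching and rsize, which this does not
model.

```python
import os
import tempfile


def count_reads(path, nbytes):
    """Read the whole file via os.read(fd, nbytes).

    Returns (calls, total_bytes), where calls counts only the reads
    that returned data (the final EOF read is not counted).
    """
    fd = os.open(path, os.O_RDONLY)
    calls = 0
    total = 0
    try:
        while True:
            chunk = os.read(fd, nbytes)  # thin wrapper around read(2)
            if not chunk:  # empty result means end of file
                break
            calls += 1
            total += len(chunk)
    finally:
        os.close(fd)
    return calls, total


def make_test_file(size):
    """Create a temporary file of `size` zero bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * size)
    return path


if __name__ == "__main__":
    path = make_test_file(1024 * 1024)  # 1 MiB, an arbitrary size
    try:
        # Hypothetical per-OS buffer sizes, as in the text above.
        for nbytes in (512, 16384):
            calls, total = count_reads(path, nbytes)
            print(f"nbytes={nbytes}: {calls} read calls for {total} bytes")
    finally:
        os.unlink(path)
```

Same amount of data, but the smaller buffer needs roughly 32 times as
many read calls. Run it under strace or ktrace and the difference shows
up directly in the trace, which is the kind of evidence being asked for.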

In fact, there was already a statement made earlier in this thread
about something doing file I/O with a byte size of ***1 byte*** for
reads and writes:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-May/017166.html

Welcome to why "abstracted" languages with a crap-load of "middleman"
layers are very, very hard to debug/troubleshoot, particularly with
regard to performance. The KISS principle applies: the less crap there
is between the application and the syscall, the easier it is to figure
out what is going on.

My advice at this point: take JBoss out of the picture. Use some other
file-based I/O test, like benchmarks/bonnie++ or benchmarks/fio, against
a file that's backed by an NFS mount. Do not ask me if these are good
utilities to test such behaviour, or what arguments to use.

If it turns out JBoss does not behave/play nicely on the BSDs, I won't
be surprised.

Probably off-topic but worth pointing out: I do not know about Solaris,
but Linux has multiple layers of caching, and is well-known for doing
things like caching (and aggregating!) reads/writes to **block** devices
(this is why on Linux, if you want to avoid that caching, you have to
make sure your application uses O_DIRECT with open(2) or some other
mechanism; the BSDs do not do this, block devices are always
non-cached).

--
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |