Date: Sun, 12 May 2013 17:27:09 -0700
From: "Marc G. Fournier" <scrappy@hub.org>
To: Rick Macklem <rmacklem@uoguelph.ca>, freebsd-fs@freebsd.org
Subject: Re: NFS Performance issue against NetApp
Message-ID: <5190335D.9090105@hub.org>
In-Reply-To: <1966772823.291493.1368362883964.JavaMail.root@erie.cs.uoguelph.ca>
References: <1966772823.291493.1368362883964.JavaMail.root@erie.cs.uoguelph.ca>
On 2013-05-12 5:48 AM, Rick Macklem wrote:
> Marc G. Fournier wrote:
>>
>>> With
>>>
>>> vfs.nfs.noconsist=3 ... 385595ms
>>>
>>> nfsstat -z before startup, nfsstat -c after:
>>>
>>> Client Info:
>>> Rpc Counts:
>>>    Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>>>     332594         5     17238         0    224426    231137      3743         1
>>>     Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>>>          0         0         0       307         0        71         0      8447
>>>      Mknod    Fsstat    Fsinfo  PathConf    Commit
>>>          0       509         0         0         0
>>> Rpc Info:
>>>   TimedOut   Invalid X Replies   Retries  Requests
>>>          0         0         0         0    818479
>>> Cache Info:
>>>  Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
>>>     608296    332596    526200     17245    -95425    224426     13178    231137
>>>  BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
>>>          0         0      1050        55       502         7    543340      8448

With patch applied:

Client Info:
Rpc Counts:
   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
    236577         5     17311         0    233269    231136      3743         1
    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
         0         0         0       307         0       391         0      8488
     Mknod    Fsstat    Fsinfo  PathConf    Commit
         0       543         1         0         0
Rpc Info:
  TimedOut   Invalid X Replies   Retries  Requests
         0         0         0         0    731770
Cache Info:
 Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
    714778    236578    529160     17312   -104087    233068     13178    231136
 BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
         0         0       788       375       542         0    546435      8488

RPC Info Requests appear to be down but # of read/writes/getattr haven't
changed any ...

Why does it take 34x as many reads on FreeBSD, where rsize on both
Linux/FreeBSD are the same ... ? The amount of data to be read is the
same ... shouldn't the # of reads be within the same ballpark, at
least ... ?

> Ok, so disabling the mtime based cache consistency doesn't make
> much difference. Forget about that one.
>
> I've attached another patch (which you probably shouldn't use for
> a production system either) to be tried instead of the last one.
> (This one is basically "work in progress" by Alexander Kabaev for
> better performance during file linking. I hope he doesn't mind
> me posting it.)
>
> rick
>
>>> ============
>>>
>>> vfs.nfs.noconsist=2 ... 392201ms
>>>
>>> Client Info:
>>> Rpc Counts:
>>>    Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>>>     332557         5     17228         0    224421    231131      3743         1
>>>     Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>>>          0         0         0       307         0        72         0      8430
>>>      Mknod    Fsstat    Fsinfo  PathConf    Commit
>>>          0       502         0         0         0
>>> Rpc Info:
>>>   TimedOut   Invalid X Replies   Retries  Requests
>>>          0         0         0         0    818395
>>> Cache Info:
>>>  Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
>>>     607834    332557    525801     17231    -95401    224421     13178    231131
>>>  BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
>>>          0         0      1028        56       502         0    542925      8431
>>>
>>>
>>> ============
>>> vfs.nfs.noconsist=0 ... 391622ms
>>>
>>>
>>> Client Info:
>>> Rpc Counts:
>>>    Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>>>     236122         5     17221         0    230575    230823      3743         1
>>>     Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>>>          0         0         0       307         0        71         0      8425
>>>      Mknod    Fsstat    Fsinfo  PathConf    Commit
>>>          0       516         0         0         0
>>> Rpc Info:
>>>   TimedOut   Invalid X Replies   Retries  Requests
>>>          0         0         0         0    727799
>>> Cache Info:
>>>  Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
>>>     711860    236124    526549     17225   -101525    230490     13178    230823
>>>  BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
>>>          0         0      1057        55       516         0    543709      8425
>>>
>>>
>>> I checked a second time with noconsist=0, and the nfsstat -c values
>>> seem to come out pretty much the same ...
>>>
>>> I'm going to head down to the office and try again with Solaris (I'd
>>> have to re-install, since I used that system for the Solaris), and
>>> see what nfsstat -c results I get out of that ... will post a
>>> followup on this when completed ...
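
The four runs quoted so far can be compared side by side. Below is a
minimal Python sketch that uses only counters copied from the nfsstat -c
outputs above. The dictionary layout, the "patched" label and the delta()
helper are illustrative names and not part of nfsstat.

```python
# Counters copied from the nfsstat -c outputs quoted in this thread.
# "patched" is the run reported under "With patch applied".
RUNS = {
    "noconsist=3": {"Getattr": 332594, "Lookup": 17238, "Read": 224426,
                    "Write": 231137, "Requests": 818479},
    "noconsist=2": {"Getattr": 332557, "Lookup": 17228, "Read": 224421,
                    "Write": 231131, "Requests": 818395},
    "noconsist=0": {"Getattr": 236122, "Lookup": 17221, "Read": 230575,
                    "Write": 230823, "Requests": 727799},
    "patched":     {"Getattr": 236577, "Lookup": 17311, "Read": 233269,
                    "Write": 231136, "Requests": 731770},
}


def delta(before, after):
    """Per-counter change going from run `before` to run `after`."""
    return {name: RUNS[after][name] - RUNS[before][name]
            for name in RUNS[before]}


# Show how the patched run differs from two of the earlier runs.
for before in ("noconsist=3", "noconsist=0"):
    print(f"{before} -> patched")
    for name, change in delta(before, "patched").items():
        print(f"  {name:<9}{change:+d}")
```

On these figures, going from the noconsist=3 run to the patched run,
Requests falls by 86709, Getattr falls by 96017 and Read rises by 8843.
Against the noconsist=0 run, Requests differs by only 3971.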
>>>
>>>
>>>
>>> On 2013-05-10 5:32 PM, Rick Macklem wrote:
>>>> Marc G. Fournier wrote:
>>>>> FYI … I just installed Solaris 11 onto the same hardware and ran
>>>>> the same test … so far, I'm seeing:
>>>>>
>>>>> Linux @ ~30s
>>>>> Solaris @ ~44s
>>>>>
>>>>> OpenBSD @ ~200s
>>>>> FreeBSD @ ~240s
>>>>>
>>>>> I've even tried FreeBSD 8.3 just to see if maybe it's a 'newish'
>>>>> issue … same as 9.x … I could see Linux 'cutting corners', but
>>>>> Oracle/Solaris too … ?
>>>>>
>>>> The three client implementations (BSD, Linux, Solaris) were
>>>> developed independently and, as such, will all implement somewhat
>>>> different caching algorithms (the RFCs specify what goes on the
>>>> wire, but say little w.r.t. client side caching).
>>>>
>>>> I have attached a patch that might be useful for determining if
>>>> the client side buffer cache consistency algorithm in FreeBSD is
>>>> causing the slow startup of jboss. Do not run this patch on a
>>>> production system, since it pretty well disables all buffer cache
>>>> coherency (i.e. if another client modifies a file, the patched
>>>> client won't notice and will continue to cache stale file data).
>>>>
>>>> If the patch does speed up startup of jboss significantly, you can
>>>> use the sysctl:
>>>> vfs.nfs.noconsist
>>>> to check for which coherency check is involved by decreasing the
>>>> value for the sysctl by 1 and then trying a startup again. (When
>>>> vfs.nfs.noconsist=0, normal cache coherency will be applied.)
>>>>
>>>> I have no idea if buffer cache coherency is a factor, but trying
>>>> the attached patch might determine if it is.
>>>>
>>>> Note that you have never posted updated "nfsstat -c" values.
>>>> (Remember that what you posted indicated 88 RPCs, which seemed
>>>> bogus.) Finding out if FreeBSD does a lot more of certain RPCs
>>>> than Linux/Solaris might help isolate what is going on.
>>>>
>>>> rick
>>>>
>>>>> On 2013-05-03, at 04:50, Mark Felder <feld@feld.me> wrote:
>>>>>
>>>>>> On Thu, 02 May 2013 18:43:17 -0500, Marc G. Fournier
>>>>>> <scrappy@hub.org> wrote:
>>>>>>
>>>>>>> Hadn't thought to do so with Linux, but …
>>>>>>> Linux ……. 20732ms, 20117ms, 20935ms, 20130ms, 20560ms
>>>>>>> FreeBSD .. 28996ms, 24794ms, 24702ms, 23311ms, 24153ms
>>>>>> Please make sure both platforms are using similar atime settings.
>>>>>> I think most distros use ext4 with diratime by default. I'd just
>>>>>> do noatime on both platforms to be safe.
>>>>>> _______________________________________________
>>>>>> freebsd-fs@freebsd.org mailing list
>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>>>>> To unsubscribe, send any mail to
>>>>>> "freebsd-fs-unsubscribe@freebsd.org"
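
The procedure Rick describes in the quoted text (lower vfs.nfs.noconsist
by 1 per run, with nfsstat -z before startup and nfsstat -c after) could
be laid out roughly as follows. This is a dry-run Python sketch that only
prints the commands and does not execute them. STARTUP_CMD is a
hypothetical placeholder for the jboss startup, and the starting value of
3 is simply the highest setting reported in this thread, not something
taken from the patch itself.

```python
# Dry-run sketch: list the commands for one test pass per sysctl value.
# STARTUP_CMD is a hypothetical placeholder, not a command from the thread.
STARTUP_CMD = "./start-jboss.sh"


def plan(max_level=3):
    """Return one list of commands per vfs.nfs.noconsist value, high to low.

    max_level=3 is an assumption based on the highest value tried above.
    """
    passes = []
    for level in range(max_level, -1, -1):
        passes.append([
            f"sysctl vfs.nfs.noconsist={level}",  # select coherency level
            "nfsstat -z",                         # zero the counters
            STARTUP_CMD,                          # timed startup under test
            "nfsstat -c",                         # collect client counters
        ])
    return passes


# Print the plan, one blank line between passes.
for commands in plan():
    print("\n".join(commands))
    print()
```

Keeping the nfsstat -c output from each pass alongside the measured
startup time makes it possible to see which step down, if any, changes
the RPC mix.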