Date: Tue, 4 Apr 2000 14:16:41 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: Andrew Gallatin <gallatin@cs.duke.edu>
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: reducing the number of NFSv3 commit ops
Message-ID: <20000404141641.P20770@fw.wintelcom.net>
In-Reply-To: <14570.10864.359054.10598@grasshopper.cs.duke.edu>; from gallatin@cs.duke.edu on Tue, Apr 04, 2000 at 04:35:57PM -0400
References: <14570.10864.359054.10598@grasshopper.cs.duke.edu>

* Andrew Gallatin <gallatin@cs.duke.edu> [000404 14:03] wrote:
>
> Currently FreeBSD issues a very large number of NFSv3 commit RPCs when
> writing a sequential file.  They average out to about one every 64k or
> so.  Solaris, on the other hand, issues only a handful.
>
> At least when running against a Solaris NFS server, these
> frequent commits really kill our write bandwidth.
>
> The commits are initiated out of the bufdaemon:
>
> nfs_commit(e06866c0,360000,0,10000,c8aa5e00) at nfs_commit+0x52a
> nfs_doio(d3088158,c8aa5e00,0,d3088158,40084040) at nfs_doio+0x371
> nfs_strategy(ddef1ec0) at nfs_strategy+0x68
> nfs_writebp(d3088158,1,ddee5920,ddef1ef8,c0180e42) at nfs_writebp+0xdc
> nfs_bwrite(ddef1eec,c02a15c0,e06866c0,d3088158,ddef1f28) at nfs_bwrite+0x16
> bawrite(d3088158,d30faff0,0,40084040,d30fbae8) at bawrite+0x32
> cluster_wbuild(e06866c0,2000,1b8,10,d30fc328) at cluster_wbuild+0x493
> vfs_bio_awrite(d30fc328,3f,c0181f8c,c016aef5,0) at vfs_bio_awrite+0x1a4
> flushbufqueues(0,8000,c024be00,0,b0206) at flushbufqueues+0x116
> buf_daemon(0) at buf_daemon+0x8f
> fork_trampoline() at fork_trampoline+0x8
>
> The "problem" is that flushbufqueues calls vfs_bio_awrite on the bufs
> that need committing.  We then go through the overhead of clustering up
> 64k worth of data & pass it down.  It eventually ends up in nfs_doio()
> which finally realizes that the bufs just need to be committed & calls
> nfs_commit() on them.  This is repeated for every 64k of data.
>
> I have an idea on how to reduce these commits & a proof of concept
> implementation of it.  My idea is to have nfs_doio() call a function
> (which I've called nfs_megacommit()) to consolidate all the
> B_NEEDCOMMIT bufs from a particular file into one large commit.  This
> nfs_megacommit() function is basically a cut-n-paste of the top half
> of nfs_flush().
>
> I just tried it this morning & it appears to work.
> Over a 1Gb/s (Alteon, Jumbo frames) link, my write bandwidth increases
> from 5-8MB/sec to 17-18MB/sec when talking to a Solaris (2.7, i86) NFS
> server & writing a 375MB file.  The server's nfsstat looks like this.
>
> Before:
>
> Version 3: (54262 calls)
> null        getattr     setattr     lookup      access      readlink
> 0 0%        0 0%        1 0%        1 0%        3 0%        0 0%
> read        write       create      mkdir       symlink     mknod
> 0 0%        48325 89%   0 0%        0 0%        0 0%        0 0%
> remove      rmdir       rename      link        readdir     readdirplus
> 0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
> fsstat      fsinfo      pathconf    commit
> 0 0%        0 0%        0 0%        5932 10%
>
>
> After:
>
> Version 3: (48078 calls)
> null        getattr     setattr     lookup      access      readlink
> 0 0%        0 0%        0 0%        1 0%        1 0%        0 0%
> read        write       create      mkdir       symlink     mknod
> 0 0%        48027 99%   1 0%        0 0%        0 0%        0 0%
> remove      rmdir       rename      link        readdir     readdirplus
> 0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
> fsstat      fsinfo      pathconf    commit
> 0 0%        0 0%        0 0%        48 0%
>
>
> Can anybody tell me if doing something like this is fundamentally
> broken?  Is it worth pursuing?

http://www.freebsd.org/~alfred/nfs_supercommit_broken.diff

Only grab as many adjacent blocks as possible: you don't want to scan
the entire file's buffer list for each commit, and you don't want
to interfere with other clients' caching by forcing server commits on
their behalf.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
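
To make the "only grab adjacent blocks" suggestion above concrete, here is
a minimal user-space sketch in C.  It is neither the diff linked above nor
kernel code: struct fake_buf, struct commit_range and gather_adjacent() are
hypothetical names invented for illustration, and the needcommit field only
stands in for the B_NEEDCOMMIT flag.  The sketch assumes the buffers are
already sorted by file offset and extends the range forward only, stopping
at the first gap or unflagged buffer instead of walking the whole list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel buf; NOT the real struct buf. */
struct fake_buf {
	uint64_t off;        /* byte offset of this block in the file */
	uint32_t len;        /* block length in bytes */
	int      needcommit; /* stands in for the B_NEEDCOMMIT flag */
};

/* One commit request covering the byte range [off, off + len). */
struct commit_range {
	uint64_t off;
	uint64_t len;
	size_t   nbufs;      /* number of buffers folded into the range */
};

/*
 * Starting at bufs[start], grow a commit range forward for as long as
 * the next buffer is flagged and begins exactly where the range ends.
 * The scan stops at the first gap or unflagged buffer, so it never
 * walks the rest of the file's buffer list.  bufs[] must be sorted by
 * offset.  Returns a range with nbufs == 0 if bufs[start] is unflagged.
 */
static struct commit_range
gather_adjacent(const struct fake_buf *bufs, size_t n, size_t start)
{
	struct commit_range r = { 0, 0, 0 };
	size_t i;

	assert(bufs != NULL && start < n);
	if (!bufs[start].needcommit)
		return (r);

	r.off = bufs[start].off;
	r.len = bufs[start].len;
	r.nbufs = 1;
	for (i = start + 1; i < n; i++) {
		if (!bufs[i].needcommit || bufs[i].off != r.off + r.len)
			break;
		r.len += bufs[i].len;
		r.nbufs++;
	}
	return (r);
}
```

With three contiguous flagged 8k buffers followed by a hole, a call with
start = 0 yields one range of 24k covering three buffers rather than three
separate commits.  A real implementation would also have to deal with
buffer locking and with buffers adjacent on the low side of the starting
block, which this sketch ignores.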