Date: Tue, 1 May 2012 07:37:31 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Wojciech Puchar <wojtek@wojtek.tensor.gdynia.pl>
Cc: freebsd-hackers@freebsd.org
Subject: Re: NFS - slow
Message-ID: <482299836.184445.1335872251190.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <alpine.BSF.2.00.1205010700240.5909@wojtek.tensor.gdynia.pl>
Wojciech Puchar wrote:
> i tried nfsv4, tested under FreeBSD over localhost and it is roughly
> the same. am i doing something wrong?
>
Probably not. NFSv4 writes are done exactly the same as NFSv3. (It changes
other stuff, like locking, adding support for ACLs, etc.)

I do have a patch that allows the client to do more extensive caching to
local disk in the client (called Packrats), but that isn't ready for prime
time yet.

NFSv4.1 optionally supports pNFS, where reading and writing can be done to
Data Servers (DS) separate from the NFS server (called the Metadata Server
or MDS). I'm working on the client side of this, but it is also a
work-in-progress and no work on an NFSv4.1 server for FreeBSD has been done
yet, as far as I know.

If you have increased MAXBSIZE in both the client and server machines and
use the new (experimental in 8.x) client and server, they will use a larger
rsize, wsize for NFSv3 as well as NFSv4. (Capturing packets and looking at
them in wireshark will tell you what the actual rsize, wsize is. A patch to
nfsstat to get the actual mount options in use is another of my "to do"
items. If anyone else wants to work on this, I'd be happy to help them.)

> On Mon, 30 Apr 2012, Peter Jeremy wrote:
>
> > On 2012-Apr-27 22:05:42 +0200, Wojciech Puchar
> > <wojtek@wojtek.tensor.gdynia.pl> wrote:
> >> is there any way to speed up NFS server?
> > ...
> >> - write works terribly. it performs sync on every write IMHO,
> >
> > You don't mention which NFS server or NFS version you are using but
> > for "traditional" NFS, this is by design. The NFS server is stateless
> > and NFS server failures are transparent (other than time-wise) to the
> > client. This means that once the server acknowledges a write, it
> > guarantees the client will be able to later retrieve that data, even
> > if the server crashes. This implies that the server needs to do a
> > synchronous write to disk before it can return the acknowledgement
> > back to the client.
> >
> > --
> > Peter Jeremy
>
Btw, for NFSv3 and 4, the story is slightly different than the above.
A client can do writes with a flag that is either FILESYNC or UNSTABLE.

For FILESYNC, the server must do exactly what the above says. That is, the
data and any required metadata changes must be on stable storage before
the server replies to the RPC.

For UNSTABLE, the server can simply save the data in memory and reply OK to
the RPC. For this case, the client needs to do a separate Commit RPC later
and the server must store the data on stable storage at that time. (For this
case, the client needs to keep the data written UNSTABLE in its cache and be
prepared to re-write it, if the server reboots before the Commit RPC is
done.)
- When any app. does an fsync(2), the client needs to do a Commit RPC if it
  has been doing UNSTABLE writes.

Most clients, including FreeBSD, do writes with UNSTABLE. However, one
limitation on the FreeBSD client is that it currently only keeps track of
one contiguous modified byte range in a buffer cache block. When an app. in
the client does non-contiguous writes to the same buffer cache block, the
client must write the old modified byte range to the server with FILESYNC
before it copies the newly written data into the buffer cache block. This
happens frequently for builds during the loader phase. (jhb and I have
looked at this. I have an experimental patch that makes the modified byte
range a list, but it requires changes to struct buf. I think it is worth
pursuing. It is a client side patch, since that is where things can be
improved, if clients avoid doing FILESYNC or frequent Commit RPCs.)

rick

> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to
> "freebsd-hackers-unsubscribe@freebsd.org"
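
To make the single-range limitation described above a bit more concrete,
here is a rough toy model in Python. It is only a sketch and not FreeBSD
code: the class names, the replay() helper and the byte offsets are all made
up for illustration, and it assumes that a write which touches or overlaps
the existing dirty range can simply be merged into it. It counts how many
times a block that tracks one dirty range is forced into a FILESYNC write,
compared with a block that keeps a list of dirty ranges.

```python
class SingleRangeBlock:
    """Toy buffer cache block that remembers only ONE dirty byte range."""

    def __init__(self):
        self.dirty = None          # (start, end), half-open, or None if clean
        self.filesync_writes = 0   # forced synchronous flushes (hypothetical counter)

    def write(self, off, length):
        start, end = off, off + length
        if self.dirty is None:
            self.dirty = (start, end)
            return
        d_start, d_end = self.dirty
        if start > d_end or end < d_start:
            # Non-contiguous write: the old range cannot be remembered, so
            # the model pushes it to the server with FILESYNC first.
            self.filesync_writes += 1
            self.dirty = (start, end)
        else:
            # Touching or overlapping: grow the one range (assumption).
            self.dirty = (min(start, d_start), max(end, d_end))


class RangeListBlock:
    """Toy block that keeps a LIST of dirty ranges (the patch idea)."""

    def __init__(self):
        self.dirty = []            # sorted, non-overlapping (start, end) pairs
        self.filesync_writes = 0   # never incremented in this model

    def write(self, off, length):
        start, end = off, off + length
        kept = []
        for s, e in self.dirty:
            if e < start or s > end:
                kept.append((s, e))                      # disjoint: keep as is
            else:
                start, end = min(start, s), max(end, e)  # absorb into new range
        kept.append((start, end))
        self.dirty = sorted(kept)


def replay(block, writes):
    """Apply a sequence of (offset, length) writes to one block."""
    for off, length in writes:
        block.write(off, length)
    return block


if __name__ == "__main__":
    # Made-up access pattern that jumps around inside one block.
    pattern = [(0, 100), (4096, 100), (200, 50), (100, 100)]
    one = replay(SingleRangeBlock(), pattern)
    many = replay(RangeListBlock(), pattern)
    print("single range:", one.filesync_writes, "forced FILESYNC writes")
    print("range list:  ", many.filesync_writes, "forced FILESYNC writes,",
          "dirty ranges", many.dirty)
```

In this model everything left in the dirty list could later go out as
UNSTABLE writes followed by one Commit RPC, which is the case the list
approach is trying to preserve. The model ignores block boundaries, the
Commit RPC itself and server reboots.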