Date: Fri, 6 Apr 2018 00:44:34 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Bruce Evans <brde@optusnet.com.au>, Kaya Saman <kayasaman@gmail.com>
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject: Re: Linux NFS client and FreeBSD server strangeness
Message-ID: <YQBPR0101MB1042F229495A61D98F81A5DDDDBA0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <20180405134730.V1123@besplex.bde.org>
References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> <bc78744a-c668-cbbd-f6fd-0064e7f0dc46@sentex.net> <C3024CF7-3928-4DBF-9855-82451BDACE01@inparadise.se> <4da08f8b-2c28-cf18-77d2-6b498004d435@gmail.com> <2937ffcc-6b47-91af-8745-2117006660db@sentex.net> <CAPj0R5LvJv59ZD9HyUBR25g98kffY=994OQgontwH602Tb9xQQ@mail.gmail.com> <20180405134730.V1123@besplex.bde.org>
Bruce Evans wrote:
>On Wed, 4 Apr 2018, Kaya Saman wrote:
>> If I recall correctly the "sync" option is default. Though it might be
>> different depending on the Linux distro in use?
>>
>> I use this: vers=3,defaults,auto,tcp,rsize=8192,wsize=8192
>>
>> though I could get rid of the tcp as that's also a default. The rsize
>> and wsize options are for running Jumbo Frames, i.e., a larger MTU than
>> 1500; in my case 9000 for 1Gbps links.
>
>These rsize and wsize options are pessimizations. They override the
>default sizes which are usually much larger for tcp.
Yes, for TCP the FreeBSD client uses the largest size supported by the
server, up to 128K (because MAXPHYS is set to that and, as such, that is
the largest size safely supported by the buffer cache).
I chose to make it this large by default for a couple of reasons:
1 - Solaris used 256K by default (and a maximum of 1Mbyte) back when it
    was Sun and their engineers were pretty good at this stuff.
    (I believe they argued that fewer RPCs implied lower server load for
    a given # of bytes. Usually the NFS engineering types have been
    concerned with server load and, therefore, the server's capacity and
    not the performance of a single client doing a single file write.)
2 - I don't do ZFS, but some thought that 128K would be a better I/O
    read/write size for ZFS.
Personally, since all I have for testing is 100Mbits/sec networking, I
always get "wire speed" and don't see any difference for different
rsize/wsize over TCP, so long as it is at least 16K.

One case where a large rsize/wsize plus a larger readahead setting should
get better performance is when the network connection is a "long, fat
pipe" such as a high bandwidth WAN connection. (Basically, you need to
push a lot of bits down the TCP pipe before you wait for an RPC reply, to
try and keep the long, fat pipe filled.) In theory, NFSv4 was meant for
the Internet. Does anyone use it on WAN links? Probably yes, but not
typically.

I have no idea what Linux uses, except that packet traces often show page
size (4K) I/O sizes, but not always.

For UDP, I think the FreeBSD default is 16K for NFSv3 (UDP is not allowed
for NFSv4, since congestion control at the transport level is required by
the RFCs). Congestion control and reliability are why I always use TCP
and, again, for 100Mbit/sec networking, I see wire speed. Both Linux and
Solaris use TCP by default for NFSv3 mounts, which is mainly why it is
the default for FreeBSD too.

>The defaults are not documented in the man page, and the current
>settings are almost equally impossible to see (e.g., mount -v doesn't
>show them). The defaults are not quite impossible to see in the source
>code of course, but the source code for them is especially convoluted.
For FreeBSD, "nfsstat -m" on the client shows what is actually being
used. (I think Linux has a similar option, but I can't remember for
sure?)

[lots of good stuff snipped]

rick
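
To make that concrete, here is a rough sketch of letting the client
negotiate the transfer size and then checking what was actually agreed
on (the server name "fileserver" and the export path are made up for
illustration):

    # Linux client: mount without forcing rsize/wsize, so the client
    # negotiates the largest size the server supports
    mount -t nfs -o vers=3,tcp fileserver:/export /mnt/export

    # FreeBSD client: show the rsize/wsize (and other options) actually
    # in use for each NFS mount
    nfsstat -m

    # Linux client: the negotiated rsize/wsize also appear in /proc/mounts
    grep nfs /proc/mounts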
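
For the "long, fat pipe" case, a sketch on the FreeBSD side would be to
leave rsize/wsize at the negotiated maximum and raise the read-ahead
count, so more read RPCs are in flight before the client waits for a
reply (the hostname and path are again made up, and I believe mount_nfs
accepts readahead values in the range 0-16):

    # FreeBSD client on a high-latency, high-bandwidth WAN link
    mount -t nfs -o nfsv4,tcp,readahead=8 bigserver:/export /mnt/backups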