Date: Mon, 8 Mar 2004 00:25:42 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: noackjr@alumni.rice.edu
Cc: Harald Schmalzbauer <h@schmalzbauer.de>
Subject: Re: patch for Linux NFS client
Message-ID: <200403080825.i288Pg6B017066@apollo.backplane.com>
References: <40446EF2.5020901@alumni.rice.edu>
    I think this one slipped through the net last October.  I'll revisit
    it again this week (for DFly).  There were no counter-indications,
    really; the discussion simply got off the baseline a bit into other
    NFS parameters that turned out to be non-issues.

    The issue seems to be Linux's desire to use absurd default data block
    sizes for NFS, combined with a bad delayed-ack handling algorithm that
    can't handle the edge case.  I'm sure they did this to satisfy some
    silly bulk benchmark but, generally speaking, using huge data block
    sizes can play merry hell with NFSD's socket interlocks.

    Even so, it might be a good idea for us to use a significantly larger
    soreserve value, or to increase the buffer limit when a large data
    block size is negotiated.  Instead of adding a slop of 2048 (aka
    32768 + 404 + 2048 = 35220 bytes) it might be better to set the
    soreserve value to 65535 by default.

    Making it programmable is a good idea in any case, though I would make
    the sysctl parameter the total rather than the slop (so the sysop can
    decide whether to exceed 65535, which can tickle window scaling bugs
    on client or server), and use the current settings as a hard minimum.

    Generally speaking the TCP buffer ought to be large enough to buffer
    at least two full-sized NFS data packets to reduce NFSD interlock
    stalls when combined with read-ahead.

						-Matt

:I happened upon the DragonFlyBSD diary [1] and saw an entry about NFS
:performance improvements.  After some digging I came across a patch from
:David Rhodus to increase NFS performance between Linux clients and
:(Free|DragonFly)BSD servers [2].  The patch doesn't appear to have been
:committed to DragonFlyBSD, so there may be problems with it.  This issue
:was diagnosed and reported to hackers@ last September by Richard Sharpe
:[3].  In any case, I've seen the Linux/FreeBSD NFS issue pop up [4,5],
:so maybe this will help a bit.  Attached is a version of the patch I
:modified for FreeBSD (to correct line numbers and whitespace).
:Credit should go to David Rhodus (and Richard Sharpe).
:
:I'm not running NFS at present (and don't have any Linux machines
:anyway), SO THIS PATCH HAS NOT BEEN TESTED IN ANY WAY.  If it works,
:great.  Also, it probably deserves mention in the man page.
:
:I'm sure there are other fixes that are applicable to FreeBSD, but I
:don't have the know-how at this point to determine which.
:
:[1] DragonFlyBSD diary:
:http://www.dragonflybsd.org/status/diary.cgi
:
:[2] Patch by David Rhodus posted to the DragonFlyBSD Digest:
:http://www.shiningsilence.com/dbsdlog/archives/000063.html
:
:[3] Post by Richard Sharpe to hackers@ last September:
:http://lists.freebsd.org/pipermail/freebsd-hackers/2003-September/003269.html
:
:[4] Linux/FreeBSD NFS issue:
:http://lists.freebsd.org/pipermail/freebsd-current/2004-February/021546.html
:http://66.102.7.104/search?q=cache:http://www.richardsharpe.com/ethereal-stuff.html#Time%20Sequence%20Graphs
:
:[5] PRs concerning NFS with Linux (among the many NFS PRs):
:http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/56461
:http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/56500
:
:Jon Noack
:
:--------------060807040300030002050904
:Content-Type: text/plain;
: name="nfs_syscalls.c.diff"
:Content-Transfer-Encoding: 7bit
:Content-Disposition: inline;
: filename="nfs_syscalls.c.diff"
:
:--- sys/nfsserver/nfs_syscalls.c.orig Fri Nov 7 16:57:09 2003
:+++ sys/nfsserver/nfs_syscalls.c Tue Mar 2 03:11:48 2004
:@@ -100,6 +100,9 @@
: &nfsrvw_procrastinate, 0, "");
: SYSCTL_INT(_vfs_nfsrv, OID_AUTO, gatherdelay_v3, CTLFLAG_RW,
: &nfsrvw_procrastinate_v3, 0, "");
:+static int sacksize = 2048;
:+SYSCTL_INT(_vfs_nfsrv, OID_AUTO, sacksize, CTLFLAG_RW,
:+ &sacksize, 0, "");
:..
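
The sizing policy suggested in the reply above can be sketched roughly as
follows.  This is an illustration only, not code from the patch or from
either kernel.  The constants come straight from the arithmetic in the
message (32768 + 404 + 2048, and the proposed 65535 default); the function
names are hypothetical, and treating "the current settings" as one full
packet (data block plus the 404-byte allowance) is an assumption.

```c
#include <assert.h>

/* Figures taken from the message: 32768 + 404 + 2048 = 35220. */
enum {
    NFS_DATA_MAX     = 32768, /* large data block size discussed */
    NFS_HDR_MAX      = 404,   /* per-packet allowance from the message */
    NFS_SLOP         = 2048,  /* slop added by the quoted patch */
    SO_TOTAL_DEFAULT = 65535  /* default total proposed in the reply */
};

/* Size of one full NFS data packet for a given data block size. */
int nfs_packet_size(int data_size)
{
    assert(data_size > 0);
    return data_size + NFS_HDR_MAX;
}

/* Reservation under the quoted patch: one packet plus a fixed slop. */
int nfs_patched_size(int data_size)
{
    return nfs_packet_size(data_size) + NFS_SLOP;
}

/*
 * Hypothetical policy from the reply: the tunable is the total
 * reservation, clamped so it never drops below a hard minimum
 * (assumed here to be one full packet).
 */
int nfs_soreserve_size(int requested_total, int data_size)
{
    int hard_min = nfs_packet_size(data_size);

    return requested_total < hard_min ? hard_min : requested_total;
}

/* Space needed to buffer two full-sized packets. */
int nfs_two_packet_size(int data_size)
{
    return 2 * nfs_packet_size(data_size);
}
```

Under these assumptions, nfs_patched_size(32768) gives the 35220 bytes
quoted in the message, and a tunable set below one full packet is raised
to the minimum.  Two full packets at a 32768-byte block size come to
66344 bytes, slightly more than 65535, so meeting the two-packet guideline
at that block size would mean choosing to exceed 65535.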