From owner-cvs-all@FreeBSD.ORG  Fri Oct 12 20:46:03 2007
Return-Path: <owner-cvs-all@FreeBSD.ORG>
Delivered-To: cvs-all@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6C61916A419;
	Fri, 12 Oct 2007 20:46:03 +0000 (UTC)
	(envelope-from kris@FreeBSD.org)
Received: from weak.local (hub.freebsd.org [IPv6:2001:4f8:fff6::36])
	by mx1.freebsd.org (Postfix) with ESMTP id 8924913C461;
	Fri, 12 Oct 2007 20:46:00 +0000 (UTC)
	(envelope-from kris@FreeBSD.org)
Message-ID: <470FDD06.4090904@FreeBSD.org>
Date: Fri, 12 Oct 2007 22:45:58 +0200
From: Kris Kennaway <kris@FreeBSD.org>
User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728)
MIME-Version: 1.0
To: Mohan Srinivasan <mohans@FreeBSD.org>
References: <200710121912.l9CJCLeI085992@repoman.freebsd.org>
In-Reply-To: <200710121912.l9CJCLeI085992@repoman.freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org,
	cvs-src@FreeBSD.org
Subject: Re: cvs commit: src/sys/nfsclient nfs.h nfs_socket.c nfs_subs.c 
 nfsmount.h

Mohan Srinivasan wrote:
> mohans      2007-10-12 19:12:21 UTC
> 
>   FreeBSD src repository
> 
>   Modified files:
>     sys/nfsclient        nfs.h nfs_socket.c nfs_subs.c nfsmount.h 
>   Log:
>   NFS MP scaling changes.
>   - Eliminate the hideous nfs_sndlock that serialized NFS/TCP request senders
>     thru the sndlock.
>   - Institute a new nfs_connectlock that serializes NFS/TCP reconnects. Add
>     logic to wait for pending request senders to finish sending before
>     reconnecting. Dial down the sb_timeo for NFS/TCP sockets to 1 sec.
>   - Break out the nfs xid manipulation under a new nfs xid lock, rather than
>     overloading the nfs request lock for this purpose.
>   - Fix some of the locking in nfs_request.
>   Many thanks to Kris Kennaway for his help with this and for initiating the
>   MP scaling analysis and work. Kris also tested this patch thoroughly.
>   Approved by: re@ (Ken Smith)
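
To illustrate the xid item in the log above, here is a minimal user-space
sketch in Python. It is not the kernel code from this commit: the class
names (XidAllocator, RequestTable), the 32-bit wraparound, and skipping
xid 0 are all assumptions made for the illustration. The point is only
that xid allocation takes its own small lock, so it no longer contends
with whatever else the request lock protects.

```python
import threading


class XidAllocator:
    """Hands out transaction IDs under a dedicated lock (hypothetical name)."""

    def __init__(self, start=1):
        self._xid_lock = threading.Lock()
        self._next_xid = start

    def next_xid(self):
        # Only the xid counter is protected here; nothing else is touched.
        with self._xid_lock:
            xid = self._next_xid
            # Assumption: xids are 32 bits wide and 0 is never handed out.
            self._next_xid = ((self._next_xid + 1) & 0xFFFFFFFF) or 1
            return xid


class RequestTable:
    """Pending requests, guarded by a separate request lock (hypothetical)."""

    def __init__(self):
        self._request_lock = threading.Lock()
        self._pending = {}
        self._xids = XidAllocator()

    def submit(self, payload):
        # The xid is obtained before the request lock is taken, so the two
        # locks are never held at the same time.
        xid = self._xids.next_xid()
        with self._request_lock:
            self._pending[xid] = payload
        return xid

    def complete(self, xid):
        # Remove and return the payload for a finished request, if present.
        with self._request_lock:
            return self._pending.pop(xid, None)
```

With this split, senders that only need a fresh xid wait on a lock that is
held for a few instructions, rather than on the lock that also covers the
request table.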

For the benefit of others: this change improved throughput by about 10%
at high I/O loads on a dual-core client, and by a factor of 10 on an
8-core client (most of that gain came from removing the home-brew
nfs_sndlock, which Mohan correctly describes :-).

Mohan's previous commit, which increased the NFS server socket buffer
size, is also very important for NFS performance.  Without it I was
only getting 1-2MB/sec throughput over 10Gb Ethernet with UDP mounts,
because the minuscule 32KB socket buffer was constantly filling up and
forcing retransmits.  With the new default of 256KB I still see full
buffers on 10GbE, so you may need to increase it further to eliminate
the problem.  256KB might be OK at GigE speeds, although I was still
seeing some buffer-full events there, so maybe we should consider
raising the default sockbuf size to 512KB or so if this is widespread.
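
A rough back-of-the-envelope sketch of why these buffers fill so quickly.
It assumes the worst case, where data arrives at full line rate and the
receiver drains nothing, ignores protocol overhead, and takes 1KB to mean
1024 bytes. The function name fill_time_us is made up for the example.
Under those assumptions a 32KB buffer absorbs only about 26 microseconds
of traffic at 10 Gb/s, and 256KB about 210 microseconds.

```python
def fill_time_us(buffer_bytes, link_bits_per_sec):
    """Microseconds for a link running at line rate to fill an undrained buffer."""
    return buffer_bytes * 8 / link_bits_per_sec * 1e6


KB = 1024  # assumption: binary kilobytes

if __name__ == "__main__":
    for label, rate in (("GigE", 1e9), ("10GbE", 1e10)):
        for size_kb in (32, 256, 512):
            t = fill_time_us(size_kb * KB, rate)
            print(f"{label:>6} {size_kb:>4}KB buffer: {t:9.1f} us of line-rate data")
```

In practice the server drains the buffer while data arrives, so the real
margin is larger, but the ratio between the link speeds still holds: the
same buffer covers ten times less wall-clock time at 10GbE than at GigE.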

As a side note, there is a bug in either the NFS client or the server
that corrupts I/O when there is packet loss with UDP mounts (the
default).  TCP mounts avoid this because packet loss is handled at the
TCP layer.

Kris