From: "Kip Macy"
Date: Sun, 18 Nov 2007 15:28:11 -0800
To: "Bjorn Gronvall"
In-Reply-To: <20071118211131.7164edd8@ibook.sics.se>
References:
 <20071118211131.7164edd8@ibook.sics.se>
Cc: freebsd-current@freebsd.org
Subject: Re: Improving NFS write performance by a factor of 2.

Could you do me a favor and submit this in the form of a PR and assign
it to me? I'm not the most appropriate person for this, but the main
NFS developer is no longer working on FreeBSD and I don't want to see
this dropped.

-Kip

On Nov 18, 2007 12:11 PM, Bjorn Gronvall wrote:
> Hi,
>
> I'm not sure if people care about NFS write performance any longer but
> if you do, please read on.
>
> A problem with the current NFS server is that it does not cluster
> writes, which in turn leads to really poor sequential-write
> performance.
>
> By enabling write clustering, NFS write performance goes from
> 26.6 Mbyte/s to 54.3 Mbyte/s, an increase by a factor of 2. This is on
> a SATA disk with write caching enabled (hw.ata.wc=1).
>
> If write caching is disabled, performance still goes up from
> 1.6 Mbyte/s to 5.8 Mbyte/s (a factor of 3.6).
>
> The attached patch (relative to current) makes the following changes:
>
> 1/ Rearrange the code so that the same code can be used to detect both
>    sequential read and write access.
>
> 2/ Merge in updates from vfs_vnops.c::sequential_heuristic.
>
> 3/ Use double hashing in order to avoid hash-clustering in the nfsheur
>    table. This change also makes it possible to reduce "try" from 32
>    to 8.
>
> 4/ Pack the nfsheur table more efficiently.
>
> 5/ Tolerate reordered RPCs to some small amount (initially suggested
>    by Ellard and Seltzer).
>
> 6/ Back off from sequential access rather than immediately switching to
>    random access (Ellard and Seltzer).
>
> 7/ To avoid starvation of the buffer pool, call bwillwrite.
>    The call is issued after the VOP_WRITE in order to avoid additional
>    reordering of write operations.
>
> 8/ Add sysctl variables vfs.nfsrv.cluster_writes and cluster_reads to
>    enable or disable clustering. vfs.nfsrv.reordered_io counts the
>    number of reordered RPCs.
>
> 9/ In nfsrv_commit, check for write errors and report them back to the
>    client. Also check whether the RPC argument count is zero, which
>    means that we must flush to the end of the file according to the
>    RFC.
>
> 10/ Two earlier commits broke the write gathering support:
>
>     nfs_syscalls.c:1.71
>
>       This change removed NQNFS stuff but left the NQNFS variable
>       notstarted. This resulted in NFS write gathering effectively
>       being permanently disabled (regardless of whether NFSv2 or
>       NFSv3 is used).
>
>     nfs_syscalls.c:1.103
>
>       This change disabled write gathering (again) for NFSv3, although
>       this should be controlled by
>       vfs.nfs.nfsrvw_procrastinate_v3 != 0.
>
> Write gathering may still be useful with NFSv3 to put reordered write
> RPCs into order, perhaps also for other reasons. This is now possible
> again.
>
> The attached patch is for current, but you will observe similar
> improvements with earlier FreeBSD versions. If you would like to have
> the same patch but for FreeBSD 5.x, 6.x or 7.0, please drop me a line.
>
> Cheers,
> /b
>
> --
>     _     _                                       ,_______________.
> Bjorn Gronvall (Björn Grönvall)                   /_______________/|
> Swedish Institute of Computer Science             |               ||
> PO Box 1263, S-164 29 Kista, Sweden               | Schroedingers ||
> Email: bg@sics.se, Phone +46 -8 633 15 25         |      Cat      |/
> Cellular +46 -70 768 06 35, Fax +46 -8 751 72 30  '---------------'
>
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
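
For readers unfamiliar with the double hashing mentioned in item 3 of the quoted patch description, here is a rough user-space sketch of the idea. It is not the code from the attached patch: the table size, the two hash functions, the eviction rule, and every name below (heur_slot, heur_lookup, HEUR_SLOTS, HEUR_TRIES) are assumptions made for illustration. Only the probe limit of 8 comes from the message.

```c
#include <stddef.h>
#include <stdint.h>

#define HEUR_SLOTS 31  /* hypothetical table size; prime, so any step visits every slot */
#define HEUR_TRIES 8   /* probe limit, matching the reduced "try" value in the message */

struct heur_slot {
    uintptr_t key;      /* identifies a file; 0 is reserved to mean "free slot" */
    int       seqcount; /* per-file sequentiality score */
};

static struct heur_slot heur_table[HEUR_SLOTS];

/* Primary hash: selects the first slot to probe. */
static size_t heur_h1(uintptr_t key)
{
    return key % HEUR_SLOTS;
}

/* Secondary hash: selects the probe step, always in 1..HEUR_SLOTS-1. */
static size_t heur_h2(uintptr_t key)
{
    return 1 + (key / HEUR_SLOTS) % (HEUR_SLOTS - 1);
}

/*
 * Find the slot for key, probing at most HEUR_TRIES slots. Keys that
 * collide on the primary hash usually get different steps, so their probe
 * sequences diverge instead of piling up in adjacent slots. If the key is
 * not found, a free slot or the probed slot with the lowest score is
 * (re)assigned to it.
 */
struct heur_slot *heur_lookup(uintptr_t key)
{
    size_t idx = heur_h1(key);
    size_t step = heur_h2(key);
    struct heur_slot *victim = &heur_table[idx];

    for (int i = 0; i < HEUR_TRIES; i++) {
        struct heur_slot *s = &heur_table[idx];

        if (s->key == key)
            return s;               /* existing entry */
        if (s->key == 0) {
            victim = s;             /* free slot: take it and stop probing */
            break;
        }
        if (s->seqcount < victim->seqcount)
            victim = s;             /* remember the least sequential entry */
        idx = (idx + step) % HEUR_SLOTS;
    }
    victim->key = key;
    victim->seqcount = 0;
    return victim;
}
```

With linear probing every colliding key walks the same run of neighbouring slots, which is the hash-clustering the message refers to. Giving each key its own step spreads the probes out, which is presumably why a much smaller probe limit becomes workable.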
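
Items 5 and 6 of the quoted description (tolerating slightly reordered RPCs, and backing off from sequential mode instead of dropping straight to random access) can be sketched in the same spirit. Again, this is only an illustration under stated assumptions: the 64 KB tolerance window, the score cap, the halving rule, and the names seq_state and seq_update are hypothetical and are not taken from the patch or from Ellard and Seltzer.

```c
#include <stdint.h>

#define SEQ_MAX     127          /* hypothetical cap on the score */
#define REORDER_WIN (64 * 1024)  /* hypothetical reordering tolerance, in bytes */

struct seq_state {
    uint64_t next_off;  /* offset expected for the next sequential request */
    int      seqcount;  /* higher means "more sequential" */
    unsigned reordered; /* requests that arrived slightly out of order */
};

/* Update the state for one request and return the new score. */
int seq_update(struct seq_state *st, uint64_t off, uint32_t len)
{
    uint64_t dist = off > st->next_off ? off - st->next_off
                                       : st->next_off - off;

    if (off == st->next_off) {
        /* Exactly sequential: raise the score. */
        if (st->seqcount < SEQ_MAX)
            st->seqcount++;
        st->next_off = off + len;
    } else if (dist <= REORDER_WIN) {
        /* Near miss: count it as a reordered RPC and keep the score. */
        st->reordered++;
        if (off + len > st->next_off)
            st->next_off = off + len;
    } else {
        /* Far away: back off by halving rather than resetting to zero. */
        st->seqcount /= 2;
        st->next_off = off + len;
    }
    return st->seqcount;
}
```

The point of the middle branch is that a client issuing requests A, C, B still looks sequential to the server, so clustering stays enabled. The last branch means a single stray request costs only part of the accumulated score.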