From owner-freebsd-fs@FreeBSD.ORG  Fri Nov 29 06:00:40 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 5486E97A
 for <freebsd-fs@freebsd.org>; Fri, 29 Nov 2013 06:00:40 +0000 (UTC)
Received: from chez.mckusick.com (chez.mckusick.com
 [IPv6:2001:5a8:4:7e72:4a5b:39ff:fe12:452])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 113C21F87
 for <freebsd-fs@freebsd.org>; Fri, 29 Nov 2013 06:00:40 +0000 (UTC)
Received: from chez.mckusick.com (localhost [127.0.0.1])
 by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id rAT60aff046648;
 Thu, 28 Nov 2013 22:00:36 -0800 (PST)
 (envelope-from mckusick@chez.mckusick.com)
Message-Id: <201311290600.rAT60aff046648@chez.mckusick.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Subject: Re: RFC: NFS client patch to reduce sychronous writes 
In-reply-to: <20131128071821.GH59496@kib.kiev.ua> 
Date: Thu, 28 Nov 2013 22:00:36 -0800
From: Kirk McKusick <mckusick@mckusick.com>
Cc: FreeBSD FS <freebsd-fs@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.16
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 29 Nov 2013 06:00:40 -0000

> Date: Thu, 28 Nov 2013 09:18:21 +0200
> From: Konstantin Belousov <kostikbel@gmail.com>
> To: Kirk McKusick <mckusick@mckusick.com>
> Cc: Rick Macklem <rmacklem@uoguelph.ca>, FreeBSD FS <freebsd-fs@freebsd.org>
> Subject: Re: RFC: NFS client patch to reduce sychronous writes
> 
> On Wed, Nov 27, 2013 at 03:20:14PM -0800, Kirk McKusick wrote:
>> The ``fix'' of bzero'ing every buffer cache page was made to UFS/FFS
>> for this problem and it killed write performance of the filesystem
>> by nearly half. We corrected this by only doing the bzero when the
>> file is mmap'ed which helped things considerably (since most files
>> being written are not also mmap'ed).
> 
> I am not sure that I follow.
> 
> For UFS, leaving any part of the buffer with undefined garbage would
> cause the garbage to appear on the next mmap(2), since page-in is
> implemented as translation of the file offsets into disk offsets and
> then reading disk blocks. The read always fetches a full page. UFS cannot
> know if the file would be mapped sometime in the future, or after the
> reboot.
> 
> In fact, UFS is quite liberal about zeroing buffers on write. It is easy
> to see almost all places where it is done, by searching for the BA_CLRBUF
> flag for UFS_BALLOC(). UFS does perform the optimization of _trying_ to
> not clear a newly allocated buffer on write if the uio covers the whole
> buffer range. Still, on error it falls back to clearing, which is
> performed by the vfs_bio_clrbuf() call in ffs_write().

You are entirely correct in your analysis. The original "fix" was to always
clear every buffer, even when it was being completely filled (which is the
most common case). I changed the completely-filled case to first try the
copyin and to zero the buffer only when the copyin fails. Making that change
nearly doubled the speed of bulk writes.
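
For readers following along, here is a minimal user-space sketch of the two
strategies for a write that covers the whole buffer. It is not the actual
ffs_write() code: sim_copyin(), write_block_always_clear(),
write_block_copy_first() and the fail_after parameter are hypothetical names
made up for illustration, and memset() stands in for the clearing that the
kernel does with vfs_bio_clrbuf().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for copyin(9). Copies len bytes from src to dst.
 * If fail_after < len, it copies only fail_after bytes and returns EFAULT,
 * simulating a user address that becomes invalid partway through.
 */
static int
sim_copyin(const void *src, void *dst, size_t len, size_t fail_after)
{
	if (fail_after < len) {
		memcpy(dst, src, fail_after);
		return (EFAULT);
	}
	memcpy(dst, src, len);
	return (0);
}

/* Original "fix": always zero the buffer, then copy over it. */
static int
write_block_always_clear(void *buf, size_t bsize, const void *src,
    size_t fail_after)
{
	memset(buf, 0, bsize);		/* paid on every write */
	return (sim_copyin(src, buf, bsize, fail_after));
}

/* Revised approach: copy first, zero only if the copy fails. */
static int
write_block_copy_first(void *buf, size_t bsize, const void *src,
    size_t fail_after)
{
	int error;

	error = sim_copyin(src, buf, bsize, fail_after);
	if (error != 0)
		memset(buf, 0, bsize);	/* rare path: hide stale contents */
	return (error);
}
```

In the common case the copy succeeds and the second function touches each
byte of the buffer once instead of twice. In the failure case both leave a
buffer with no stale contents: the second one is entirely zeroed, while the
first still holds whatever part of the copy succeeded.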

	~Kirk