From owner-freebsd-fs@freebsd.org Mon Jan 4 01:37:37 2016
Date: Sun, 3 Jan 2016 20:37:13 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "Mikhail T."
Cc: Karli Sjöberg, freebsd-fs@FreeBSD.org
Subject: Re: NFS reads vs. writes

Mikhail T. wrote:
> On 03.01.2016 02:16, Karli Sjöberg wrote:
> >
> > The difference between "mount" and "mount -o async" should tell you if
> > you'd benefit from a separate log device in the pool.
> >
> This is not a ZFS problem. The same filesystem is being read in both
> cases. The same data is being read from and written to the same
> filesystems. For some reason, it is much faster to read via NFS than to
> write to it, however.
>
This issue isn't new. It showed up when Sun introduced NFS in 1985.
NFSv3 did change things a little, by allowing UNSTABLE writes.
Here's what an NFSv3 or NFSv4 client does when writing:
- It issues some # of UNSTABLE writes. The server need only have these in
  server RAM before replying NFS_OK.
- Then the client does a Commit. At this point the NFS server is required to
  store all the data written in the above writes, plus the related metadata,
  on stable storage before replying NFS_OK.
--> This is where "sync" vs "async" is a big issue. If you use "sync=disabled"
    (I'm not a ZFS guy, but I think that is what the ZFS option looks like)
    you *break* the NFS protocol (ie. violate the RFC) and put your data at
    some risk, but you will typically get better (often much better) write
    performance.
    OR
    You put a ZIL on a dedicated device with fast write performance, so the
    data can go there to satisfy the stable storage requirement. (I know
    nothing about them personally, but SSDs have dramatically different write
    performance, so an SSD to be used for a ZIL must be carefully selected to
    ensure good write performance.)

How many writes are in "some #" is up to the client. For FreeBSD clients, the
"wcommitsize" mount option can be used to adjust this. The default tuning of
this changed significantly recently, but you didn't mention how recent your
system(s) are, so manual tuning of it may be useful. (See "man mount_nfs" for
more on this.)

Also, the NFS server was recently tweaked so that it could handle 128K
rsize/wsize, but the FreeBSD client is limited to MAXBSIZE, and this has not
been increased beyond 64K. To do so, you have to change the value of MAXBSIZE
in the kernel sources and rebuild your kernel. (The problem is that increasing
MAXBSIZE makes the kernel use more KVM for the buffer cache, and if a system
isn't doing significant client-side NFS, this is wasted.)
Someday I should see if MAXBSIZE can be made a TUNABLE, but I haven't done
that.
--> As such, unless you use a Linux NFS client, the reads/writes will be 64K,
    whereas 128K would work better for ZFS.
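For concreteness, the knobs discussed above look roughly like this. This is
only a sketch: "tank/export", "/dev/ada2", "server:/export", and the
wcommitsize value are placeholders you must adapt, and option 1 carries the
data-loss risk described above.

```shell
# Option 1 (risky): let ZFS acknowledge synchronous writes without waiting
# for stable storage. Fast, but violates the NFS stable-storage guarantee.
zfs set sync=disabled tank/export       # "tank/export" is a placeholder dataset

# Option 2 (safe): add a dedicated log (SLOG) device with good write
# performance, so Commits can be satisfied quickly. "/dev/ada2" is a placeholder.
zpool add tank log /dev/ada2

# On a FreeBSD client, tune how much data may be written UNSTABLE before a
# Commit is forced, via wcommitsize (bytes; see mount_nfs(8)).
mount -t nfs -o nfsv3,wcommitsize=1048576 server:/export /mnt
```

Whether a larger or smaller wcommitsize helps depends on the workload and the
server's stable-storage latency, so benchmark before settling on a value.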
Some NAS hardware vendors solve this problem by using non-volatile RAM, but
that isn't available in generic hardware.

> And finally, just to put the matter to rest, both ZFS-pools already have
> a separate zil-device (on an SSD).
>
If this SSD is dedicated to the ZIL and is one known to have good write
performance, it should help, but in your case the SSD seems to be the
bottleneck.

rick

> -mi
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"