Date: Thu, 16 Dec 2021 14:58:23 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Konstantin Belousov <kostikbel@gmail.com>, Rick Macklem <rmacklem@freebsd.org>
Cc: "src-committers@freebsd.org" <src-committers@freebsd.org>, "dev-commits-src-all@freebsd.org" <dev-commits-src-all@freebsd.org>, "dev-commits-src-main@freebsd.org" <dev-commits-src-main@freebsd.org>
Subject: Re: git: 867c27c23a5c - main - nfscl: Change IO_APPEND writes to direct I/O
Message-ID: <YQXPR0101MB09680896DE973A2A414FC83BDD779@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <Ybq/1Iz4/Yu9Ibil@kib.kiev.ua>
References: <202112151639.1BFGdS2v011996@gitrepo.freebsd.org> <Ybq/1Iz4/Yu9Ibil@kib.kiev.ua>
Kostik wrote:
>On Wed, Dec 15, 2021 at 04:39:28PM +0000, Rick Macklem wrote:
>> The branch main has been updated by rmacklem:
>>
>> URL: https://cgit.FreeBSD.org/src/commit/?id=867c27c23a5c469b27611cf53cc2390b5a193fa5
>>
>> commit 867c27c23a5c469b27611cf53cc2390b5a193fa5
>> Author: Rick Macklem <rmacklem@FreeBSD.org>
>> AuthorDate: 2021-12-15 16:35:48 +0000
>> Commit: Rick Macklem <rmacklem@FreeBSD.org>
>> CommitDate: 2021-12-15 16:35:48 +0000
>>
>> nfscl: Change IO_APPEND writes to direct I/O
>>
>> IO_APPEND writes have always been very slow over NFS, due to
>> the need to acquire an up to date file size after flushing
>> all writes to the NFS server.
>>
>> This patch switches the IO_APPEND writes to use direct I/O,
>> bypassing the buffer cache. As such, flushing of writes
>> normally only occurs when the open(..O_APPEND..) is done.
>> It does imply that all writes must be done synchronously
>> and must be committed to stable storage on the file server
>> (NFSWRITE_FILESYNC).
>>
>> For a simple test program that does 10,000 IO_APPEND writes
>> in a loop, performance improved significantly with this patch.
>>
>> For a UFS exported file system, the test ran 12x faster.
>> This drops to 3x faster when the open(2)/close(2) are done
>> for each loop iteration.
>> For a ZFS exported file system, the test ran 40% faster.
>>
>> The much smaller improvement may have been because the ZFS
>> file system I tested against does not have a ZIL log and
>> does have "sync" enabled.
>>
>> Note that IO_APPEND write performance is still much slower
>> than when done on local file systems.
>>
>> Although this is a simple patch, it does result in a
>> significant semantics change, so I have given it a
>> large MFC time.
>
>How is the buffer cache coherency is handled then?
>Imagine that other process either reads from this file, or even have it
>mapped. What does ensure that reads and page cache see the data written
>by direct path?
Well, for the buffer cache case, there is code near the beginning of
ncl_write() (the NFS VOP_WRITE()) that calls ncl_vinvalbuf() for the
IO_APPEND case. As such, any data in the buffer cache gets invalidated
whenever an append write occurs.

But, now that I look at it, it does not do anything w.r.t. mmap'd files.
(The direct I/O stuff has been there for a long time, but it isn't enabled
by default, so it probably doesn't get tested much. Also, it has a sysctl
that allows mmap for direct I/O, which is enabled by default. It causes
getpage/putpage to fail if it is not enabled.)

So, it looks like code to invalidate pages needs to be added along with
the ncl_vinvalbuf() call?
--> I'll come up with a patch and then get you to review it.

Thanks for pointing this out, rick
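
The mmap concern raised above can be probed from user space. What follows is only a minimal sketch, not code from the FreeBSD tree: the helper name append_visible_through_mmap() and the path passed to it are hypothetical, and it assumes both the mapping and the O_APPEND write happen on the same client. It maps the first page of a file, appends through a separate O_APPEND descriptor, and reports whether the mapping shows the new bytes.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the FreeBSD sources).
 * Returns 1 if data appended via an O_APPEND descriptor is visible
 * through a shared mapping of the same file, 0 if the mapping looks
 * stale, and -1 if any system call fails.
 */
int
append_visible_through_mmap(const char *path)
{
	static const char initial[] = "first record\n";
	static const char extra[] = "appended record\n";
	const size_t ilen = sizeof(initial) - 1;
	const size_t elen = sizeof(extra) - 1;
	long pagesz = sysconf(_SC_PAGESIZE);
	char *map = MAP_FAILED;
	int rfd = -1, afd = -1, result = -1;

	if (pagesz <= 0)
		return (-1);
	/* Both records must fit in the single page that gets mapped. */
	assert(ilen + elen <= (size_t)pagesz);

	/* Create the file with some initial contents. */
	rfd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (rfd < 0)
		return (-1);
	if (write(rfd, initial, ilen) != (ssize_t)ilen)
		goto out;

	/* Reader side: shared, read-only mapping of the first page. */
	map = mmap(NULL, (size_t)pagesz, PROT_READ, MAP_SHARED, rfd, 0);
	if (map == MAP_FAILED)
		goto out;
	/* Read through the mapping so the page is faulted in first. */
	if (memcmp(map, initial, ilen) != 0)
		goto out;

	/* Writer side: a separate descriptor opened with O_APPEND. */
	afd = open(path, O_WRONLY | O_APPEND);
	if (afd < 0)
		goto out;
	if (write(afd, extra, elen) != (ssize_t)elen)
		goto out;

	/* Does the mapping show the bytes past the old end of file? */
	result = (memcmp(map + ilen, extra, elen) == 0) ? 1 : 0;
out:
	if (map != MAP_FAILED)
		munmap(map, (size_t)pagesz);
	if (afd >= 0)
		close(afd);
	close(rfd);
	unlink(path);
	return (result);
}
```

On a local file system the helper would be expected to return 1, since the mapping and write(2) share the same cached pages. Pointing the path at a file on an NFS mount is the interesting case: a return of 0 there would be consistent with the missing page invalidation discussed above, though this sketch has not been run against such a mount.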
