Date: Sun, 21 Aug 2022 22:19:56 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject: Re: SEEK_DATA/SEEK_HOLE with vnode locked
Message-ID: <YT4PR01MB9736D37382DF089F784C3C82DD6E9@YT4PR01MB9736.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <YwJXuz5DsuOmyA6t@kib.kiev.ua>
References: <YQBPR0101MB97420AD41791E544519A0A2DDD659@YQBPR0101MB9742.CANPRD01.PROD.OUTLOOK.COM> <YvQ7MYXPl0AugojS@kib.kiev.ua> <YT4PR01MB9736B24FDE64C945C2C9EC8EDD6F9@YT4PR01MB9736.CANPRD01.PROD.OUTLOOK.COM> <YwJXuz5DsuOmyA6t@kib.kiev.ua>

Konstantin Belousov <kostikbel@gmail.com> wrote:
> On Sun, Aug 21, 2022 at 12:02:48AM +0000, Rick Macklem wrote:
> > Just to summarize this...
> > I was able to do a VOP_SEEK() which would be called with a
> > LK_SHARED locked vnode and it seemed to work fine.
> >
> > However, ReadPlus (which is like Read, but allows for
> > holes to be represented as <offset, length> in the reply
> > instead of a stream of 0 bytes) seems to be a performance
> > dud.
> >
> > I was surprised how poorly it performed compared to ordinary
> > Read. Typically it would take 60% longer to read a file. I tried
> > sparse and non-sparse files of various sizes and they always
> > took longer. (If I disabled SEEK_DATA/SEEK_HOLE in the server
> > code, so it never actually did holes, it worked comparably to
> > regular Read, so somehow the overhead of doing SEEK_DATA/SEEK_HOLE
> > was a big performance hit. It was using LK_SHARED locks, so
> > it wasn't serializing the reads, but I don't really know why it
> > performed so poorly?)
> What filesystem did you use on the server?
The 60% slower was for tests like this with UFS:
- I created a file with a 1Gbyte hole, followed by 1Gbyte of data.
- Then I read the file with "time dd if=<file> of=/dev/null bs=10M"
  after remounting over NFS (to avoid NFS client caching).
Here's the elapsed time for 4 runs for a UFS exported fs:
Read                      ReadPlus
20.4, 4.3, 4.6, 4.3       18.7, 7.6, 7.7, 7.3
(The first run was right after booting, so there was nothing
cached within UFS.)
--> So, as you can see, it took about 60% longer via ReadPlus.

Now, what about the same test on an exported ZFS fs:
Read                      ReadPlus
6.4, 5.7, 5.6, 5.4        110.8, 113.3, 110.7, 110.9
--> Yep, only about 20 times (or 2000% longer).

For a kernel build over NFS, it took about 70% longer
when on a ZFS exported fs (I can't remember the UFS
number, but it was significantly longer.)

So, yes, ZFS is a lot worse, but UFS is bad enough that
I can't imagine anyone using ReadPlus instead of ordinary
Read?

LANs have gobs of bandwidth these days. WANs might
benefit from the lack of long streams of 0 bytes, but some
(like my little DSL modem for my internet connection) will
compress them out anyhow, I think?

> >
> > Anyhow, unless the performance issue gets resolved, there is
> > no reason to commit the code to FreeBSD's main.
> > (NFSv4.2 operations, like ReadPlus, are all optional and are not
> > required for an RFC-conformant implementation.)
>
> Why not commit? It might make sense to add it, but guard under some
> knob.
Committing it with a "never use this, performance is terrible" note
doesn't make a lot of sense to me, unless the ZFS performance issue
were somehow resolved.

I am now actually concerned about copy_file_range(2), which uses
SEEK_HOLE/SEEK_DATA. There is a patch under review that at least
increases the blocksize for ZFS, but the effect of disabling the use of
SEEK_HOLE/SEEK_DATA in copy_file_range(2) also needs to be
explored.
--> Retaining holes as unallocated regions is nice, but at the very
    least, it could compare va_size with va_bytes to decide if there
    are holes worth looking for.

rick
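
As a rough userland illustration of the va_size vs. va_bytes idea above
(a sketch only, not the kernel code and not the patch under review):
st_size stands in for va_size and st_blocks * 512 for va_bytes. The
function names are made up for this example. The check is only a
heuristic: a compressed file (e.g. on ZFS) can have fewer allocated
bytes than its size without containing any holes, and metadata blocks
can hide small holes.

```python
import errno
import os


def may_have_holes(size_bytes, allocated_bytes):
    """Heuristic: fewer allocated bytes than logical size suggests holes."""
    return allocated_bytes < size_bytes


def worth_seeking(path):
    """Userland analogue of comparing va_size with va_bytes (assumption)."""
    st = os.stat(path)
    # st_blocks is reported in 512-byte units.
    return may_have_holes(st.st_size, st.st_blocks * 512)


def data_extents(fd, size):
    """Yield (offset, length) for each data region via SEEK_DATA/SEEK_HOLE."""
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:  # no data at or after offset
                break
            raise
        # There is always an implicit hole at end of file.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        yield (start, end - start)
        offset = end
```

A copy loop could call worth_seeking() once and only walk
data_extents() when it returns True, falling back to a plain
sequential read otherwise, so dense files never pay for the seeks.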