Date: Sun, 30 Jun 2019 00:45:58 +0000 From: Rick Macklem <rmacklem@uoguelph.ca> To: Alan Somers <asomers@freebsd.org> Cc: "freebsd-fs@freebsd.org" <freebsd-fs@FreeBSD.org>, Sean Fagan <sef@ixsystems.com> Subject: Re: RFC: What should a copy_file_range(2) syscall do by default? Message-ID: <YTXPR01MB02857724AE08766B45D0A608DDFE0@YTXPR01MB0285.CANPRD01.PROD.OUTLOOK.COM> In-Reply-To: <CAOtMX2j0GPhMQ5E6nuU1f0Qiwr_G7ASAG_btf05fb3yFZp_XRw@mail.gmail.com> References: <YTXPR01MB0285B40A9D9A6BD1DC144A64DDE60@YTXPR01MB0285.CANPRD01.PROD.OUTLOOK.COM>, <CAOtMX2j0GPhMQ5E6nuU1f0Qiwr_G7ASAG_btf05fb3yFZp_XRw@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
Well, asomers@ prefers the current patch, despite its complexity. sef@ expressed concerns w.r.t. the complexity and it being done in the kernel. However he did not state a preference for any other specific variant, such as the ones I suggested. As such, I am sticking with the current patch unless I hear otherwise from others. Thanks for your comments, rick ps: I felt a top post was reasonable here, as it summarized separate posts in the thread. ________________________________________ From: Alan Somers <asomers@freebsd.org> Sent: Saturday, June 22, 2019 1:28:03 PM To: Rick Macklem Cc: freebsd-fs@freebsd.org; Sean Fagan Subject: Re: RFC: What should a copy_file_range(2) syscall do by default? On Sat, Jun 22, 2019 at 10:02 AM Rick Macklem <rmacklem@uoguelph.ca> wrote: > > Hi, > > sef@ made this comment on phabricator. I don't believe phabricator is the= correct > place for "big picture" discussions, so I'm posting it here (I'm assuming= sef@ doesn't > mind, since the phabricator comments are public). > sef@ wrote: > >This much work in the kernel for what //should// be user-space makes me = twitchy... >but there is lots of precedent for it, so I obviously have to g= et with the times. > > > > I've done a quick review of the code; it seems most of the complexity = is in the hole->detection. I'm also annoyed that linux used size_t for the= amount to copy, when >off_t would have been more appropriate. But not muc= h to do about that now. > > > > Having a default implementation means that user-space can't fall back = if it's not >supported, and do it better (e.g., parallel I/O). Should we a= lso have a pathconf for >the feature? > > > > WRT your question on -fs, I have no objections to this working cross-f= ilesystem, >although I think I might ask to have a flag to fail in that cas= e. > > Well, all I am interested in is a system call/VOP call so the NFSv4.2 cli= ent can do > a file copy locally on the NFS server instead of doing Reads/Writes acros= s the wire. > The current code has gotten fairly complex, so I'll try and ask "how comp= lex" this > syscall/VOP call should be? > > The range of variants I can think of are: > 0) - Don't do it at all. > 1) - The syscall could just do a VOP_COPY_FILE_RANGE() and return whateve= r error > it returns. > --> This implies an error return for all file systems for now, wi= th support for > NFSv4.2mounts being added later (FreeBSD13 hopefully). This option would require applications or the C library to fallback to a copy loop. While doable, nothing in userland would be able to range-lock the file, making the copy loop non-atomic. So the in-kernel copy is superior. > 2) - The syscall could fall back on a simple copy loop, but not try to de= al with holes. > --> The Linux man page mentions using copy_file_range(2) in a loop= with > lseek(SEEK_DATA)/lseek(SEEK_HOLE) for sparse files. This sug= gests that > the Linux fallback code doesn't try to handle holes. Same problem as 1. Or if you do the copy loop in-kernel it would waste CPU time and expand sparse files, which isn't good either. > 3) - The current patch which tries to handle holes and copy the entire by= te range > in one call. Definitely the best option, despite its complexity. I would argue that the complexity calls for a robust test suite, rather than abandoning the feature. -Alan
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?YTXPR01MB02857724AE08766B45D0A608DDFE0>