Date: Sat, 22 Jun 2019 16:01:57 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "freebsd-fs@freebsd.org" <freebsd-fs@FreeBSD.org>
Cc: Sean Fagan <sef@ixsystems.com>, Alan Somers <asomers@freebsd.org>
Subject: RFC: What should a copy_file_range(2) syscall do by default?
Message-ID: <YTXPR01MB0285B40A9D9A6BD1DC144A64DDE60@YTXPR01MB0285.CANPRD01.PROD.OUTLOOK.COM>

Hi,

sef@ made this comment on phabricator. I don't believe phabricator is the
correct place for "big picture" discussions, so I'm posting it here (I'm
assuming sef@ doesn't mind, since the phabricator comments are public).

sef@ wrote:
> This much work in the kernel for what //should// be user-space makes me
> twitchy... but there is lots of precedent for it, so I obviously have to
> get with the times.
>
> I've done a quick review of the code; it seems most of the complexity is
> in the hole-detection. I'm also annoyed that linux used size_t for the
> amount to copy, when off_t would have been more appropriate. But not much
> to do about that now.
>
> Having a default implementation means that user-space can't fall back if
> it's not supported, and do it better (e.g., parallel I/O). Should we also
> have a pathconf for the feature?
>
> WRT your question on -fs, I have no objections to this working
> cross-filesystem, although I think I might ask to have a flag to fail in
> that case.

Well, all I am interested in is a system call/VOP call so the NFSv4.2 client
can do a file copy locally on the NFS server instead of doing Reads/Writes
across the wire.

The current code has gotten fairly complex, so I'll try and ask "how complex"
this syscall/VOP call should be?

The range of variants I can think of are:
0) - Don't do it at all.
1) - The syscall could just do a VOP_COPY_FILE_RANGE() and return whatever
     error it returns.
     --> This implies an error return for all file systems for now, with
         support for NFSv4.2 mounts being added later (FreeBSD13 hopefully).
2) - The syscall could fall back on a simple copy loop, but not try to deal
     with holes.
     --> The Linux man page mentions using copy_file_range(2) in a loop with
         lseek(SEEK_DATA)/lseek(SEEK_HOLE) for sparse files. This suggests
         that the Linux fallback code doesn't try to handle holes.
3) - The current patch which tries to handle holes and copy the entire byte
     range in one call.
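
To make the difference between variants 2 and 3 concrete, here is a rough
user-space sketch. It is not the patch under review and not kernel code. The
function names copy_range_simple and copy_range_sparse are made up for
illustration, Python is used only for brevity, and plain read/write stands in
for whatever the kernel would do per data region. It assumes a platform that
provides lseek(SEEK_DATA)/lseek(SEEK_HOLE).

```python
import errno
import os


def copy_range_simple(fd_in, fd_out, length, bufsize=64 * 1024):
    """Roughly variant 2: a plain read/write loop from the current offsets.

    Holes in the source are read back as zero bytes and written out, so a
    sparse source becomes a fully allocated copy.
    """
    copied = 0
    while copied < length:
        buf = os.read(fd_in, min(bufsize, length - copied))
        if not buf:  # EOF before `length` bytes: short copy
            break
        view = memoryview(buf)
        while len(view) > 0:  # os.write may write fewer bytes than asked
            written = os.write(fd_out, view)
            view = view[written:]
        copied += len(buf)
    return copied


def copy_range_sparse(fd_in, fd_out, length, bufsize=64 * 1024):
    """Hole-aware loop: copy data regions only and seek over the holes."""
    start = os.lseek(fd_in, 0, os.SEEK_CUR)
    out_start = os.lseek(fd_out, 0, os.SEEK_CUR)
    end = start + length
    pos = start
    while pos < end:
        try:
            data = os.lseek(fd_in, pos, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:  # nothing but a hole up to EOF
                break
            raise
        if data >= end:
            break
        hole = min(os.lseek(fd_in, data, os.SEEK_HOLE), end)
        # Position both descriptors at the start of this data region.
        os.lseek(fd_in, data, os.SEEK_SET)
        os.lseek(fd_out, out_start + (data - start), os.SEEK_SET)
        copy_range_simple(fd_in, fd_out, hole - data, bufsize)
        pos = hole
    # Bytes of the source range that exist (data plus holes).
    covered = max(0, min(end, os.fstat(fd_in).st_size) - start)
    # A trailing hole is never written, so extend the output if needed.
    if covered and os.fstat(fd_out).st_size < out_start + covered:
        os.ftruncate(fd_out, out_start + covered)
    return covered
```

Both functions produce the same bytes in the destination. The second one just
avoids reading and writing the zeros, which is where the extra complexity in
a hole-aware implementation comes from.
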

As sef@ mentions, there is also the question of handling copying across
multiple file systems. I asked about this before and I only got the one
response, which was "do it". I have seen a discussion of adding cross-mount
to the syscall for Linux, but I don't know if/when the Linux one might
support that. (They have not created a "flag" option for this, as far as
I've seen.) It happens without additional complexity for #2 and #3 above.

Linux discussions have talked about improved performance for local file
systems based on reduced # of system calls, but I have not seen any data to
show what, if any, performance improvement has been observed. (The slow
hardware I have to test on won't be useful for performance evaluation.)

So, what do others think w.r.t. the above?

rick