Date:      Sat, 22 Jun 2019 16:01:57 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        "freebsd-fs@freebsd.org" <freebsd-fs@FreeBSD.org>
Cc:        Sean Fagan <sef@ixsystems.com>, Alan Somers <asomers@freebsd.org>
Subject:   RFC: What should a copy_file_range(2) syscall do by default?
Message-ID:  <YTXPR01MB0285B40A9D9A6BD1DC144A64DDE60@YTXPR01MB0285.CANPRD01.PROD.OUTLOOK.COM>

Hi,

sef@ made this comment on phabricator. I don't believe phabricator is the correct
place for "big picture" discussions, so I'm posting it here (I'm assuming sef@ doesn't
mind, since the phabricator comments are public).
sef@ wrote:
>This much work in the kernel for what //should// be user-space makes me twitchy...
>but there is lots of precedent for it, so I obviously have to get with the times.
>
>  I've done a quick review of the code; it seems most of the complexity is in the hole
>detection.  I'm also annoyed that linux used size_t for the amount to copy, when
>off_t would have been more appropriate.  But not much to do about that now.
>
>  Having a default implementation means that user-space can't fall back if it's not
>supported, and do it better (e.g., parallel I/O).  Should we also have a pathconf for
>the feature?
>
>  WRT your question on -fs, I have no objections to this working cross-filesystem,
>although I think I might ask to have a flag to fail in that case.

Well, all I am interested in is a system call/VOP call so the NFSv4.2 client can do
a file copy locally on the NFS server instead of doing Reads/Writes across the wire.
The current code has gotten fairly complex, so I'll try and ask "how complex" this
syscall/VOP call should be?

The range of variants I can think of are:
0) - Don't do it at all.
1) - The syscall could just do a VOP_COPY_FILE_RANGE() and return whatever error
        it returns.
        --> This implies an error return for all file systems for now, with support for
              NFSv4.2 mounts being added later (FreeBSD13 hopefully).
2) - The syscall could fall back on a simple copy loop, but not try to deal with holes.
       --> The Linux man page mentions using copy_file_range(2) in a loop with
             lseek(SEEK_DATA)/lseek(SEEK_HOLE) for sparse files. This suggests that
             the Linux fallback code doesn't try to handle holes.
3) - The current patch which tries to handle holes and copy the entire byte range
       in one call.
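
To make the difference between #2 and #3 concrete, here is a rough user-space
sketch, not the kernel patch. copy_simple() and copy_sparse() are hypothetical
names made up for illustration. copy_simple() is a plain copy loop that ignores
holes (roughly the #2 fallback), and copy_sparse() is the kind of caller-side
lseek(SEEK_DATA)/lseek(SEEK_HOLE) loop the Linux man page mentions, with
copy_simple() standing in for the copy_file_range(2) call. It assumes the
output file is new and empty and that the file system supports
SEEK_DATA/SEEK_HOLE.

```c
#define _GNU_SOURCE		/* exposes SEEK_DATA/SEEK_HOLE on glibc */
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: plain copy loop with no hole handling.
 * Copies up to len bytes from infd at inoff to outfd at outoff and
 * stops early at EOF. Returns bytes copied, or -1 with errno set.
 */
static ssize_t
copy_simple(int infd, off_t inoff, int outfd, off_t outoff, size_t len)
{
	char buf[65536];
	size_t done = 0;

	while (done < len) {
		size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
		ssize_t nr = pread(infd, buf, want, inoff + (off_t)done);

		if (nr < 0) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (nr == 0)
			break;			/* EOF on the input file */
		for (ssize_t off = 0; off < nr; ) {
			ssize_t nw = pwrite(outfd, buf + off, (size_t)(nr - off),
			    outoff + (off_t)done + off);

			if (nw < 0) {
				if (errno == EINTR)
					continue;
				return (-1);
			}
			off += nw;
		}
		done += (size_t)nr;
	}
	return ((ssize_t)done);
}

/*
 * Hypothetical caller-side loop for sparse files: copy only the data
 * regions and leave the holes unwritten in the (new, empty) output file.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int
copy_sparse(int infd, int outfd)
{
	off_t end = lseek(infd, 0, SEEK_END);
	off_t data = 0, hole;

	if (end < 0)
		return (-1);
	while (data < end) {
		data = lseek(infd, data, SEEK_DATA);
		if (data < 0) {
			if (errno == ENXIO)
				break;		/* only a trailing hole remains */
			return (-1);
		}
		hole = lseek(infd, data, SEEK_HOLE);
		if (hole < 0)
			return (-1);
		if (copy_simple(infd, data, outfd, data, (size_t)(hole - data)) < 0)
			return (-1);
		data = hole;
	}
	/* Set the output size so that a trailing hole is preserved. */
	return (ftruncate(outfd, end));
}
```

With #2 the hole loop lives in the caller, as in copy_sparse(); with #3 the
equivalent logic would sit inside the syscall and the caller would make one call.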

As sef@ mentions, there is also the question of handling copying across multiple
file systems. I asked about this before and I only got the one response, which was
"do it". I have seen a discussion of adding cross-mount to the syscall for Linux, but
I don't know if/when the Linux one might support that. (They have not created
a "flag" option for this, as far as I've seen.)
It happens without additional complexity for #2 and #3 above.
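
Without such a flag, a caller that wants the "fail if cross-file system"
behaviour could do the check itself before copying. A minimal sketch, assuming
that comparing st_dev from fstat(2) is an acceptable test for "same file
system"; same_filesystem() is a hypothetical name, not part of any patch:

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if both descriptors refer to files on
 * the same file system (same st_dev), 0 if they do not, and -1 if
 * fstat(2) fails on either descriptor.
 */
static int
same_filesystem(int infd, int outfd)
{
	struct stat in_sb, out_sb;

	if (fstat(infd, &in_sb) < 0 || fstat(outfd, &out_sb) < 0)
		return (-1);
	return (in_sb.st_dev == out_sb.st_dev);
}
```

A caller would refuse the copy (or pick a different strategy) when this returns 0.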

Linux discussions have talked about improved performance for local file systems
based on reduced # of system calls, but I have not seen any data to show what,
if any, performance improvement has been observed. (The slow hardware I have
to test on won't be useful for performance evaluation.)

So, what do others think w.r.t. the above? rick




