Date: Thu, 7 Aug 2025 11:22:41 -0600
From: Alan Somers <asomers@freebsd.org>
To: Rick Macklem <rick.macklem@gmail.com>
Cc: Alexander Motin <mav@freebsd.org>, FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject: Re: RFC: Does ZFS block cloning do this?
Message-ID: <CAOtMX2ha6pR7=zZKs9PttetRORA6OT1ywNFLVXVGvQ1hUH1OgA@mail.gmail.com>
In-Reply-To: <CAM5tNy7JfRry0%2BPkz-xFQuiXGfq0hTxVjBW4SZWc8Fy1=PJhqQ@mail.gmail.com>
References: <CAM5tNy7V7Btem%2ByWNK7oyn9qsk6TrQwuGo1kxqhCstLM4_uh9g@mail.gmail.com> <CAOtMX2jGcQY_AywWv1tVBbAk%2BrOheya%2BHRQBMRDc7ELGrA7qNA@mail.gmail.com> <CAM5tNy6PJbTnjf24L0Y9j5NicBTZHDBKp%2BaF-VhOLCsaY5Qbnw@mail.gmail.com> <8925b735-8398-4e0f-95f7-8d1115413013@FreeBSD.org> <CAM5tNy5HcgovNrB52zZ4W2p6Ur7VBX9ZZkX74Y4rkef%2B2evt0Q@mail.gmail.com> <CAM5tNy7JfRry0%2BPkz-xFQuiXGfq0hTxVjBW4SZWc8Fy1=PJhqQ@mail.gmail.com>
[-- Attachment #1 --]
On Thu, Aug 7, 2025 at 8:32 AM Rick Macklem <rick.macklem@gmail.com> wrote:

> On Wed, Aug 6, 2025 at 9:46 AM Rick Macklem <rick.macklem@gmail.com> wrote:
> >
> > On Wed, Aug 6, 2025 at 9:28 AM Alexander Motin <mav@freebsd.org> wrote:
> > >
> > > Hi Rick,
> > >
> > > On 8/6/25 11:54, Rick Macklem wrote:
> > > > The difference for NFSv4.2 is that CLONE cannot return with partial completion.
> > > > (It assumes that a CLONE of any size will complete quickly enough for an RPC.
> > > > Although there is no fixed limit, most assume an RPC reply should happen in
> > > > 1-2sec at most. For COPY, the server can return with only part of the
> > > > copy done.)
> > > > It also includes alignment restrictions for the byte offsets.
> > > >
> > > > There is also the alignment restriction on CLONE. There doesn't seem to be
> > > > an alignment restriction on zfs_clone_range(), but maybe it is buried inside it?
> > > > I think adding yet another pathconf name to get the alignment requirement and
> > > > whether or not the file system supports it would work without any VOP change.
> > >
> > > The semantics you describe look similar to Linux FICLONE/FICLONERANGE
> > > calls, that got adopted there before copy_file_range(). IIRC those
> > > effectively mean -- clone the file or its range as requested or fail. I
> > > am not sure why some people prefer those calls, explicitly not allowing
> > > fallback to copy, but there are some, for example Veeam backup fails if
> > > ZFS rejects the cloning request for any reason. For Linux, ZFS has
> > > separate code (see zpl_remap_file_range() and respective VFS calls)
> > > wrapping around block cloning to implement these semantics. FreeBSD does
> > > not have the equivalent at this point, but it would be trivial to add,
> > > if we really need those VOPs.
> > For NFSv4.2 (which I suspect was modelled after what Linux does) the
> > difference is the ability to complete the entire "copy" within 1-2sec under
> > normal circumstances.
> > --> The NFSv4.2 CLONE operation requires this.
> > whereas for the NFSv4.2 COPY
> > --> It is allowed to return after a partial completion to adhere to the 1-2sec
> >     rule. This probably does not affect ZFS, but it is needed for
> >     the "in general" UFS case.
> >
> > There may be no difference needed for zfs_copy_file_range(), so long as it
> > never returns after a partial completion. If it does return after
> > partial completion, a flag would indicate "must complete it".
> >
> > As for FreeBSD syscalls, I don't see a need for a new one.
> > I'll leave that up to others.
> > pathconf(2) could be used to determine if cloning is supported.
> >
> > Thanks for all the comments. It looks like a new "kernel only" flag for
> > VOP_COPY_FILE_RANGE() and a new name for VOP_PATHCONF()
> > should be all that is needed.
> So, this seems almost too easy?
>
> What I am thinking of (and should be easy to do in the next few days
> for 15.0) is:
> - Define a new pathconf variable _PC_CLONE_BLKSIZE which returns
>   the blksize for cloning or 0 if cloning is not supported.
> - Define a new flag for copy_file_range() called COPY_FILE_RANGE_CLONE
>   which, if set, would require that the entire copy be completed via cloning
>   (no partial copy allowed) or return ENOSYS if the file system does not
>   support this.
>   Expose this flag to userland in case any application really needs cloning.
> The code changes outside of NFS are trivial.
>
> So, how does this sound? ric

Yes, I think that would work.
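
For readers following the thread, here is a rough userland sketch of how the proposal quoted above might be used. It assumes that _PC_CLONE_BLKSIZE and COPY_FILE_RANGE_CLONE end up in the system headers under exactly those names and with the semantics described in the proposal (a block size of 0 means cloning is unsupported; the flag means "clone everything or fail"). The helpers clone_range_aligned() and clone_or_fail() are hypothetical names invented for this illustration, and the alignment check is a simplification that ignores any end-of-file exception a protocol such as NFSv4.2 may allow for the length.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: true when both offsets and the length are
 * multiples of the clone block size.
 * ASSUMPTION: blksize == 0 means "cloning not supported", as in the
 * proposal; a negative value (pathconf error) is treated the same way.
 */
static bool
clone_range_aligned(off_t inoff, off_t outoff, size_t len, long blksize)
{
    if (blksize <= 0 || inoff < 0 || outoff < 0)
        return (false);
    return (inoff % blksize == 0 && outoff % blksize == 0 &&
        len % (size_t)blksize == 0);
}

/*
 * Hypothetical helper: clone len bytes from the start of infd to the
 * start of outfd, never falling back to a byte copy.
 * Returns the byte count on success, or -1 with errno set.
 */
static ssize_t
clone_or_fail(int infd, int outfd, size_t len)
{
#if defined(_PC_CLONE_BLKSIZE) && defined(COPY_FILE_RANGE_CLONE)
    /* Only compiled where the proposed names actually exist. */
    off_t inoff = 0, outoff = 0;
    long blksize = fpathconf(outfd, _PC_CLONE_BLKSIZE);

    if (blksize <= 0) {
        errno = ENOSYS;     /* cloning not supported here */
        return (-1);
    }
    if (!clone_range_aligned(inoff, outoff, len, blksize)) {
        errno = EINVAL;     /* caller must align the request */
        return (-1);
    }
    return (copy_file_range(infd, &inoff, outfd, &outoff, len,
        COPY_FILE_RANGE_CLONE));
#else
    /* Headers without the proposed names: report "not supported". */
    (void)infd; (void)outfd; (void)len;
    errno = ENOSYS;
    return (-1);
#endif
}
```

A caller that can tolerate an ordinary copy would, on failure with ENOSYS, retry copy_file_range() with flags set to 0; a caller that truly needs cloning (the Veeam-style case mentioned in the thread) would treat the failure as final.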
Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAOtMX2ha6pR7=zZKs9PttetRORA6OT1ywNFLVXVGvQ1hUH1OgA>
