From nobody Wed Aug 6 16:20:27 2025
From: Alan Somers
Date: Wed, 6 Aug 2025 10:20:27 -0600
Subject: Re: RFC: Does ZFS block cloning do this?
To: Rick Macklem
Cc: FreeBSD CURRENT
Sender: owner-freebsd-current@FreeBSD.org
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Wed, Aug 6, 2025 at 9:54 AM Rick Macklem <rick.macklem@gmail.com> wrote:

> On Wed, Aug 6, 2025 at 8:32 AM Alan Somers <asomers@freebsd.org> wrote:
> >
> > On Wed, Aug 6, 2025 at 9:18 AM Rick Macklem <rick.macklem@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> NFSv4.2 has a CLONE operation. It is described as doing:
> >>    The CLONE operation is used to clone file content from a source file
> >>    specified by the SAVED_FH value into a destination file specified by
> >>    CURRENT_FH without actually copying the data, e.g., by using a
> >>    copy-on-write mechanism.
> >> (It takes arguments for 2 files, with byte offsets and a length.)
> >> The offsets must be aligned to a value returned by the NFSv4.2 server.
> >> 12.2.1.  Attribute 77: clone_blksize
> >>
> >>    The clone_blksize attribute indicates the granularity of a CLONE
> >>    operation.
> >>
> >> Does ZFS block cloning do this?
> >>
> >> I am asking now, because although it might be too late,
> >> if the answer is "yes", I'd like to get VOP calls into 15.0
> >> for it. (Hopefully with the VOP calls in place, the rest could
> >> go in sometime later, when I find the time to do it.)
> >>
> >> Thanks in advance for any comments, rick
> >
> >
> > Yes, it does that right now, if the feature@block_cloning pool
> > attribute is enabled.  It works with VOP_COPY_FILE_RANGE.  Does NFS
> > really need a new VOP?
> Either a new VOP or maybe a new flag argument for VOP_COPY_FILE_RANGE().
> Linux defined a flag argument for their copy_file_range(), but they have
> never defined any flags. Of course, that doesn't mean there cannot be a
> "kernel internal" flag.
>
> So maybe adding a new VOP can be avoided. That would be nice, given the
> timing of the 15.0 release and other churn going on.
>
> The difference for NFSv4.2 is that CLONE cannot return with partial
> completion.
> (It assumes that a CLONE of any size will complete quickly enough for an
> RPC. Although there is no fixed limit, most assume an RPC reply should
> happen in 1-2sec at most. For COPY, the server can return with only part
> of the copy done.)
> It also includes alignment restrictions for the byte offsets.
>
> There is also the alignment restriction on CLONE. There doesn't seem to be
> an alignment restriction on zfs_clone_range(), but maybe it is buried
> inside it?
> I think adding yet another pathconf name to get the alignment requirement
> and whether or not the file system supports it would work without any VOP
> change.
>
> rick
>

zfs_clone_range doesn't have any alignment restrictions.  But if the
argument isn't aligned to a record boundary, ZFS will actually copy a
partial record, rather than clone it.  Regarding the copy-to-completion
requirement, could that be implemented within nfs by looping over
VOP_COPY_FILE_RANGE?
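
For illustration only, here is a rough userland sketch of the looping idea, using copy_file_range(2) as a stand-in for VOP_COPY_FILE_RANGE. It is not the NFS server code. The helper names copy_range_fully() and range_is_aligned() are hypothetical, and blksize is an assumed clone granularity (for example a ZFS recordsize such as 131072) whose lookup is not shown.

```c
#define _GNU_SOURCE		/* exposes copy_file_range() on Linux/glibc */
#define _FILE_OFFSET_BITS 64
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: true if both offsets are multiples of blksize.
 * blksize is an assumed clone granularity (e.g. a ZFS recordsize or the
 * NFSv4.2 clone_blksize value); querying it is outside this sketch.
 */
static bool
range_is_aligned(off_t inoff, off_t outoff, off_t blksize)
{
	return (blksize > 0 && inoff % blksize == 0 && outoff % blksize == 0);
}

/*
 * Hypothetical helper: call copy_file_range(2) repeatedly until len bytes
 * have been copied, the source hits EOF, or an error occurs.
 * Returns the number of bytes copied, or -1 with errno set.
 */
static ssize_t
copy_range_fully(int infd, off_t inoff, int outfd, off_t outoff, size_t len)
{
	size_t done = 0;

	while (done < len) {
		/* The kernel advances inoff and outoff by the bytes copied. */
		ssize_t n = copy_file_range(infd, &inoff, outfd, &outoff,
		    len - done, 0);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return (-1);
		}
		if (n == 0)
			break;			/* EOF on the source */
		done += (size_t)n;
	}
	return ((ssize_t)done);
}
```

A caller could check range_is_aligned() first and only treat the request as a true clone when it returns true, since an unaligned range may be partly copied rather than cloned. Note that the loop only addresses partial completion. It does nothing to bound how long the whole operation takes.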