From: Rick Macklem
Date: Thu, 7 Aug 2025 07:32:10 -0700
Subject: Re: RFC: Does ZFS block cloning do this?
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
To: Alexander Motin
Cc: Alan Somers, FreeBSD CURRENT

On Wed, Aug 6, 2025 at 9:46 AM Rick Macklem wrote:
>
> On Wed, Aug 6, 2025 at 9:28 AM Alexander Motin wrote:
> >
> > Hi Rick,
> >
> > On 8/6/25 11:54, Rick Macklem wrote:
> > > The difference for NFSv4.2 is that CLONE cannot return with partial completion.
> > > (It assumes that a CLONE of any size will complete quickly enough for an RPC.
> > > Although there is no fixed limit, most assume an RPC reply should happen in
> > > 1-2sec at most. For COPY, the server can return with only part of the
> > > copy done.)
> > > It also includes alignment restrictions for the byte offsets.
> > >
> > > There is also the alignment restriction on CLONE. There doesn't seem to be
> > > an alignment restriction on zfs_clone_range(), but maybe it is buried inside it?
> > > I think adding yet another pathconf name to get the alignment requirement and
> > > whether or not the file system supports it would work without any VOP change.
> >
> > The semantics you describe looks similar to Linux FICLONE/FICLONERANGE
> > calls, that got adopted there before copy_file_range(). IIRC those
> > effectively mean -- clone the file or its range as requested or fail. I
> > am not sure why some people prefer those calls, explicitly not allowing
> > fallback to copy, but there are some, for example Veeam backup fails if
> > ZFS rejects the cloning request for any reason. For Linux ZFS has a
> > separate code (see zpl_remap_file_range() and respective VFS calls)
> > wrapping around block cloning to implement this semantics. FreeBSD does
> > not have the equivalent at this point, but it would be trivial to add,
> > if we really need those VOPs.
> For NFSv4.2 (which I suspect was modelled after what Linux does) the
> difference is the ability to complete the entire "copy" within 1-2sec under
> normal circumstances.
> --> The NFSv4.2 CLONE operation requires this.
> whereas for the NFSv4.2 COPY
> --> It is allowed to return after a partial completion to adhere to the 1-2sec
>       rule. This probably does not affect ZFS, but it is needed for
>       the "in general" UFS case.
>
> There may be no difference needed for zfs_copy_file_range(), so long as it
> never returns after a partial completion. If it does return after
> partial completion, a flag would indicate "must complete it".
>
> As for FreeBSD syscalls, I don't see a need for a new one.
> I'll leave that up to others.
> pathconf(2) could be used to determine if cloning is supported.
>
> Thanks for all the comments. It looks like a new "kernel only" flag for
> VOP_COPY_FILE_RANGE() and a new name for VOP_PATHCONF()
> should be all that is needed.
So, this seems almost too easy?
What I am thinking of (and should be easy to do in the next few days
for 15.0) is:
- Define a new pathconf variable _PC_CLONE_BLKSIZE which returns the
  blksize for cloning or 0 if cloning is not supported.
- Define a new flag for copy_file_range() called COPY_FILE_RANGE_CLONE
  which, if set, would require that the entire copy be completed via cloning
  (no partial copy allowed) or return ENOSYS if the file system does not
  support this.
Expose this flag to userland in case any application really needs cloning.

The code changes outside of NFS are trivial.

So, how does this sound?

rick

>
> rick
> >
> > --
> > Alexander Motin
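
For illustration only, here is a rough userland sketch of how the two proposed pieces might fit together. _PC_CLONE_BLKSIZE and COPY_FILE_RANGE_CLONE are the names proposed above and are guarded with #ifdef, so the sketch still compiles where they do not exist. The helper names (clone_blksize, clone_range_aligned, clone_or_fail) are made up for this example, and checking only the two byte offsets against the source file's block size is an assumption, not settled semantics.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the clone block size for fd, or 0 if
 * cloning is not supported.  _PC_CLONE_BLKSIZE is the pathconf name
 * proposed in this thread; where it is not defined, report 0.
 */
static long
clone_blksize(int fd)
{
#ifdef _PC_CLONE_BLKSIZE
	long bs = fpathconf(fd, _PC_CLONE_BLKSIZE);

	/* Treat errors (-1) the same as "cloning not supported". */
	return (bs > 0 ? bs : 0);
#else
	(void)fd;
	return (0);
#endif
}

/*
 * Hypothetical helper: true if both byte offsets are multiples of
 * blksize.  A blksize of 0 means "no cloning", so nothing is aligned.
 */
static bool
clone_range_aligned(off_t inoff, off_t outoff, long blksize)
{
	if (blksize <= 0)
		return (false);
	return (inoff % blksize == 0 && outoff % blksize == 0);
}

/*
 * Hypothetical wrapper: clone len bytes or fail, never fall back to a
 * partial copy.  Sets errno to ENOSYS when cloning is unavailable and
 * to EINVAL when the offsets are not aligned.
 */
static ssize_t
clone_or_fail(int infd, off_t inoff, int outfd, off_t outoff, size_t len)
{
	long bs = clone_blksize(infd);

	if (bs == 0) {
		errno = ENOSYS;
		return (-1);
	}
	if (!clone_range_aligned(inoff, outoff, bs)) {
		errno = EINVAL;
		return (-1);
	}
#ifdef COPY_FILE_RANGE_CLONE
	/* Proposed flag: complete the whole range via cloning or fail. */
	return (copy_file_range(infd, &inoff, outfd, &outoff, len,
	    COPY_FILE_RANGE_CLONE));
#else
	(void)outfd;
	(void)len;
	errno = ENOSYS;
	return (-1);
#endif
}
```

A caller that wants "clone or nothing" would call clone_or_fail() and treat ENOSYS as "this file system cannot clone", while a caller that is happy with an ordinary copy would keep calling copy_file_range() with flags set to 0.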