Subject: Re: Why does rangelock_enqueue() hang for hours?
From: Bakul Shah <bakul@iitbombay.org>
In-Reply-To: <CAM5tNy6n5QJpRpkCJFHK=_5dWT=EZqkjztAfZHdqREKYSby55g@mail.gmail.com>
Date: Tue, 21 Oct 2025 07:49:55 -0700
Cc: Peter 'PMc' Much <pmc@citylink.dinoex.sub.org>,
 freebsd-fs@freebsd.org
Message-Id: <BC3B5D56-7DAD-4090-A25F-91640CFEE28D@iitbombay.org>
References: <aPeEVSamcdKSoF5N@disp.intra.daemon.contact>
 <CAM5tNy6n5QJpRpkCJFHK=_5dWT=EZqkjztAfZHdqREKYSby55g@mail.gmail.com>
To: Rick Macklem <rick.macklem@gmail.com>

I didn't read this thread before commenting on the forum where Peter
first raised this issue. Adding the relevant part of my comment here:
+---
Running git blame on cat.c shows that the copy_file_range(2) call was
added on 2023-07-08 in commit 8113cc8276.
git log 8113cc8276 says
  cat: use copy_file_range(2) with fallback to previous behavior

  This allows to use special filesystem features like server-side
  copying on NFS 4.2 or block cloning on OpenZFS 2.2.

Maybe it should check that these conditions are met? That is, both
files should be remote or both files should be local for it to be
really worth it. In any case, IMHO this should not be the default
behavior. Still, it should not hang....
+---
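
To make the "copy_file_range(2) with fallback" idea concrete, here is a
minimal sketch. It is not the code from cat.c. The names copy_fd() and
copy_rw() and the chunk parameter are hypothetical, introduced only for
illustration. The sketch assumes that limiting the length passed to each
copy_file_range(2) call shortens the time any single call spends in the
kernel; whether that is enough to let a concurrent reader through the
range lock on 14.3 is untested.

```c
#define _GNU_SOURCE	/* exposes copy_file_range(2) on glibc; not needed on FreeBSD */
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Plain read(2)/write(2) loop, i.e. the "previous behavior". 0 on success, -1 on error. */
static int
copy_rw(int in, int out)
{
	char buf[64 * 1024];
	ssize_t n;

	while ((n = read(in, buf, sizeof(buf))) > 0) {
		for (ssize_t off = 0; off < n; ) {
			ssize_t w = write(out, buf + off, (size_t)(n - off));
			if (w <= 0)
				return (-1);
			off += w;
		}
	}
	return (n < 0 ? -1 : 0);
}

/*
 * Copy everything from "in" to "out", asking the kernel for at most
 * "chunk" bytes per copy_file_range(2) call. "chunk" is a hypothetical
 * tunable; cat(1) has no such option.
 */
static int
copy_fd(int in, int out, size_t chunk)
{
	ssize_t n;
	int copied = 0;

	assert(chunk > 0);
	while ((n = copy_file_range(in, NULL, out, NULL, chunk, 0)) > 0)
		copied = 1;
	if (n == 0)
		return (0);		/* clean end of file */
	if (!copied)
		return (copy_rw(in, out));	/* nothing moved yet: fall back */
	return (-1);			/* failed part-way through */
}
```

A caller would use something like copy_fd(rfd, STDOUT_FILENO, 1 << 20).
The fallback is taken only when the very first call fails, for example
when one descriptor is a pipe, so no data can be copied twice.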


> On Oct 21, 2025, at 6:28 AM, Rick Macklem <rick.macklem@gmail.com> wrote:
>
> On Tue, Oct 21, 2025 at 6:09 AM Peter 'PMc' Much
> <pmc@citylink.dinoex.sub.org> wrote:
>>
>> This is 14.3-RELEASE.
>>
>> I am copying a file from an NFSv4 share to a local filesystem. This
>> takes a couple of hours.
>>
>> In the meantime I want to read that partially copied file. This is
>> not possible. The reading process locks up in rangelock_enqueue(),
>> unkillable(!), and only after the first slow copy has completed will
>> it do its job.
>>
>> Even if I do the first copy to stdout with redirect to file, the
>> same problem happens. I.e.:
>>
>> $ cat /nfsshare/File > /localfs/File &
>> $ cat /localfs/File  --> HANGS unkillable
> This is caused by "cat" using copy_file_range(2), where the
> system call is taking a long time.
>
> The version done below makes "cat" not use copy_file_range(2).
> (copy_file_range(2) is interruptible, but that stops the file copy.)
> It also has a "return after 1sec" option.
> Maybe that option should be exposed to userland and used by
> "cat", "cp" and friends, at least when enabled by a command line
> option. (I'll admit looking at a file while it is being copied is a
> bit odd?)
> The whole idea behind range-lock is to prevent a read/write syscall
> from seeing a partial write. It just happens that the "write" takes a
> long time in this case.
>
> Do others have thoughts on this? rick
>
>>
>> Only if I introduce another process, the tie is avoided:
>>
>> $ cat /nfsshare/File | cat > /localfs/File &
>> $ cat /localfs/File  --> WORKS
>>
>> I very much doubt that this is how it should be.
>>
>> Also, if I try to get some information about the supposed operation
>> of this "rangelock" feature, search engines point me to a
>> "rangelock(9)" manpage on man.freebsd.org, but that page doesn't
>> seem to exist. :(
>>
>