On Mon, Nov 10, 2025 at 12:58 AM Bakul Shah wrote:
>
> On Nov 9, 2025, at 12:52 AM, Rick Macklem wrote:
> >
> > On Sat, Nov 8, 2025 at 11:14 PM Ronald Klop wrote:
> >>
> >> Why is this locking needed?
> >> AFAIK Unix has advisory locking, so if you read a file somebody else is writing the result is your own problem. It is up to the applications to adhere to the locking.
> >> Is this a lock different than file locking from user space?
> > Yes. A rangelock is used for a byte range during a read(2) or
> > write(2) to ensure that they are serialized. This is a POSIX
> > requirement. (See this post by kib@ in the original email
> > discussion. https://lists.freebsd.org/archives/freebsd-fs/2025-October/004704.html)
> >
> > Since there is no POSIX standard for copy_file_range(), it could
> > be argued that range locking isn't required for copy_file_range(),
> > but that makes it inconsistent with read(2)/write(2) behaviour.
> > (I, personally, am more comfortable with a return after N sec
> > than removing the range locking, but that's just my opinion.)
>
> Traditionally reads/writes on Unix were atomic but that is not the
> case for NFS, right? That is, while I am reading a file over NFS
> someone else can modify it from another host (if they have write
> permission). That is, AFAIK, the POSIX atomicity requirement for
> read / write is broken by NFS except for another reader/writer on
> the same host.
>
> Another issue is that a kernel lock that is held for a very very
> long time is asking for trouble. Ideally one spends as little time
> as possible in the supervisor state and any optimization hacks
> that push logic into the kernel should strive to not hold locks
> for very long so that things don't grind to a complete halt.
>
> That is, copy_file_range() use in cat(1) seems excessive. The only
> reason for its use seems to be for improving performance.
> Why not break it up in smaller chunks?
Btw, the time limit I proposed does break it up into smaller chunks.
The difference is that it specifies a chunk size that can be copied
in K seconds instead of a chunk size of N Kbytes.
(The problem with using N Kbytes is there is no way to know what N
should be.)

rick

> That way you still get the benefit
> of reducing syscall overhead (which pales in comparison to any
> network reads in any case) + the same skipping over holes. Small
> reads/writes is what we did before this syscall raised its head!
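
For illustration only, the "N Kbytes" variant discussed above could look
roughly like the userland sketch below. It is not the cat(1) code and not
the proposed kernel change. The names copy_in_chunks and CHUNK_BYTES are
made up for this example, the 1 MiB default is arbitrary (as noted above,
there is no obviously right N), and the fallback to read/write when
copy_file_range is unavailable or rejected is an assumption of the sketch.
Python's os.copy_file_range is used here only where the platform exposes it.

```python
import errno
import os

# Hypothetical default; the thread notes there is no way to know what N should be.
CHUNK_BYTES = 1 << 20

# Errors for which this sketch gives up on copy_file_range and uses read/write.
_FALLBACK_ERRNOS = (errno.ENOSYS, errno.EXDEV, errno.EINVAL, errno.EOPNOTSUPP)


def copy_in_chunks(src_fd, dst_fd, chunk=CHUNK_BYTES):
    """Copy src_fd to dst_fd until EOF, asking for at most `chunk` bytes per call.

    Returns the total number of bytes copied.
    """
    total = 0
    use_cfr = hasattr(os, "copy_file_range")
    while True:
        if use_cfr:
            try:
                # May return fewer bytes than requested; 0 means end of file.
                n = os.copy_file_range(src_fd, dst_fd, chunk)
            except OSError as e:
                if e.errno not in _FALLBACK_ERRNOS:
                    raise
                use_cfr = False  # continue from the current file offsets
                continue
        else:
            buf = os.read(src_fd, chunk)
            n = len(buf)
            written = 0
            while written < n:  # os.write may also be short
                written += os.write(dst_fd, buf[written:])
        if n == 0:
            return total
        total += n
```

Because the loop already tolerates short returns, the same caller would
also work unchanged if the kernel returned early after K seconds instead
of after N Kbytes; only the size of each step would differ.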