List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
Sender: owner-freebsd-current@FreeBSD.org
References: <2100145914.14642.1762672441817@localhost>
From: Rick Macklem
Date: Mon, 10 Nov 2025 02:10:05 -0800
Subject: Re: RFC: Should copy_file_range(2) return after a few seconds?
To: Bakul Shah
Cc: Ronald Klop, "Peter 'PMc' Much", FreeBSD CURRENT

On Mon, Nov 10, 2025 at 12:58 AM Bakul Shah wrote:
>
> On Nov 9, 2025, at 12:52 AM, Rick Macklem wrote:
> >
> > On Sat, Nov 8, 2025 at 11:14 PM Ronald Klop wrote:
> >>
> >> Why is this locking needed?
> >> AFAIK Unix has advisory locking, so if you read a file somebody else is writing the result is your own problem. It is up to the applications to adhere to the locking.
> >> Is this a lock different than file locking from user space?
> > Yes. A rangelock is used for a byte range during a read(2) or
> > write(2) to ensure that they are serialized. This is a POSIX
> > requirement. (See this post by kib@ in the original email
> > discussion. https://lists.freebsd.org/archives/freebsd-fs/2025-October/004704.html)
> >
> > Since there is no POSIX standard for copy_file_range(), it could
> > be argued that range locking isn't required for copy_file_range(),
> > but that makes it inconsistent with read(2)/write(2) behaviour.
> > (I, personally, am more comfortable with a return after N sec
> > than removing the range locking, but that's just my opinion.)
>
> Traditionally reads/writes on Unix were atomic but that is not the
> case for NFS, right? That is, while I am reading a file over NFS
> someone else can modify it from another host (if they have write
> permission). That is, AFAIK, the POSIX atomicity requirement for
> read / write is broken by NFS except for another reader/writer on
> the same host.
>
> Another issue is that a kernel lock that is held for a very very
> long time is asking for trouble.
> Ideally one spends as little time
> as possible in the supervisor state and any optimization hacks
> that push logic into the kernel should strive to not hold locks
> for very long so that things don't grind to a complete halt.
>
> That is, copy_file_range() use in cat(1) seems excessive. The only
> reason for its use seems to be for improving performance. Why not
> break it up in smaller chunks?
Btw, the time limit I proposed does break it up into smaller chunks.
The difference is that it specifies a chunk size that can be copied
in K seconds instead of a chunk size of N Kbytes. (The problem with
using N Kbytes is there is no way to know what N should be.)

rick

> That way you still get the benefit
> of reducing syscall overhead (which pales in comparison to any
> network reads in any case) + the same skipping over holes. Small
> reads/writes is what we did before this syscall raised its head!
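
A minimal sketch of the caller side of the "N Kbytes" approach discussed above, written in Python for brevity. It assumes `os.copy_file_range` as Python exposes it on Linux. The names `copy_all`, `_copy_some`, `max_bytes`, and `_FALLBACK` are hypothetical and exist only for illustration. The fallback to plain read/write is likewise an assumption about how a caller might cope when the call is unavailable. Nothing here reflects how cat(1) or the FreeBSD kernel is implemented.

```python
import errno
import os

# Errors after which this sketch falls back to plain read/write
# (an assumption for illustration, not a rule taken from the thread).
_FALLBACK = {errno.ENOSYS, errno.EXDEV, errno.EINVAL, errno.EOPNOTSUPP}


def _copy_some(fd_in, fd_out, count):
    """Copy up to `count` bytes at the current file offsets.

    Returns the number of bytes copied; 0 means end of input.
    """
    if hasattr(os, "copy_file_range"):
        try:
            return os.copy_file_range(fd_in, fd_out, count)
        except OSError as e:
            if e.errno not in _FALLBACK:
                raise
    # Fallback path: one bounded read followed by a full write.
    data = os.read(fd_in, min(count, 1 << 20))
    view = memoryview(data)
    while view:
        view = view[os.write(fd_out, view):]
    return len(data)


def copy_all(src, dst, max_bytes=1 << 30):
    """Copy `src` to `dst`, asking for at most `max_bytes` per call.

    Returns (bytes_copied, number_of_calls). A short return is never
    treated as an error: the loop only stops when a call returns 0.
    """
    fd_in = os.open(src, os.O_RDONLY)
    try:
        fd_out = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            total = calls = 0
            while True:
                n = _copy_some(fd_in, fd_out, max_bytes)
                calls += 1
                if n == 0:      # end of input reached
                    return total, calls
                total += n
        finally:
            os.close(fd_out)
    finally:
        os.close(fd_in)
```

Because the loop treats every short return as "call again", it would behave the same whether a call comes back early after N bytes or after K seconds. Only the number of calls would change, which is why the caller does not need to know a good value for N.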