From: Alan Somers
Date: Thu, 4 Apr 2024 12:14:45 -0600
Subject: SEEK_HOLE at EOF
To: FreeBSD Hackers
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

tldr; there are two problems:
1) tmpfs handles SEEK_HOLE differently than other file systems
2) everything else handles SEEK_HOLE at EOF poorly, IMHO

Details:

According to lseek(2), SEEK_HOLE should return the start of the next hole greater
than or equal to the supplied offset.  Also, each file has a zero-sized
virtual hole at the very end of the file.  So I would expect that
calling SEEK_HOLE at EOF would return the file's size.  However, the man
page also says that SEEK_HOLE will return ENXIO when the offset points
to EOF.  Those two statements seem contradictory to me.  The first
behavior seems more logical.  I would expect SEEK_HOLE to work the same
way both at EOF and at any other file offset.

What does the spec say?  There is no POSIX standard for this.  It was
invented by Solaris.  Illumos's man page does not clearly say what
should happen at EOF.  Linux's man page is clear.  It lists ENXIO for
the case where "whence is SEEK_DATA or SEEK_HOLE, and offset is beyond
the end of the file".  That would seem to indicate behavior 1: SEEK_HOLE
should return the file's size at EOF.  Only beyond EOF should it return
ENXIO.

But what do other implementations do?  Contrary to its man page, Linux
behaves mostly like FreeBSD.  SEEK_HOLE returns ENXIO at EOF on most
file systems.  I tested a number of file systems on both FreeBSD and
Linux.  Most of them return ENXIO.  The only two outliers are FreeBSD's
tmpfs and Linux's NFS client.

        FreeBSD   Linux
======= ========= =========
UFS     ENXIO
ZFS     ENXIO
tmpfs   file size ENXIO
msdosfs ENXIO     ENXIO
ext2fs  ENXIO     ENXIO
xfs               ENXIO
tarfs   ENXIO
nfs     ENXIO     file size

So what should we change?  Clearly, it's bad for tmpfs to be
inconsistent.  My preference would be for everything to behave like
tmpfs, but it's currently losing the popularity contest.  Anybody else
have thoughts?

-Alan
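
For anyone who wants to repeat the comparison on another file system,
here is a minimal sketch of the EOF check.  It is not the program used
to produce the results above.  The helper name probe_seek_hole_at_eof
and the eof_result constants are made up for illustration, and the
fallback SEEK_HOLE value of 4 is an assumption that matches the current
FreeBSD and Linux headers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc only exposes SEEK_HOLE when this is set */
#endif
#include <sys/types.h>
#include <sys/stat.h>

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Assumption: fallback for builds where the feature-test macro above did
 * not take effect.  4 is the value in both FreeBSD and Linux headers.
 */
#ifndef SEEK_HOLE
#define SEEK_HOLE 4
#endif

/* Hypothetical result codes, named for this example only. */
enum eof_result {
	EOF_RETURNS_SIZE,	/* lseek() returned the file's size */
	EOF_RETURNS_ENXIO,	/* lseek() failed with ENXIO */
	EOF_OTHER		/* open/fstat failed, or some other outcome */
};

/*
 * Call lseek(fd, st_size, SEEK_HOLE) on the file at "path" and classify
 * what the file system does when the offset is exactly EOF.
 */
static enum eof_result
probe_seek_hole_at_eof(const char *path)
{
	struct stat sb;
	enum eof_result res = EOF_OTHER;
	off_t off;
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (EOF_OTHER);
	if (fstat(fd, &sb) == 0) {
		errno = 0;
		off = lseek(fd, sb.st_size, SEEK_HOLE);
		if (off == sb.st_size)
			res = EOF_RETURNS_SIZE;
		else if (off == (off_t)-1 && errno == ENXIO)
			res = EOF_RETURNS_ENXIO;
	}
	close(fd);
	return (res);
}
```

Calling the helper from a small main() with one path per mounted file
system and printing the result should be enough to fill in one column
of a table like the one above.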