Date:      Thu, 4 Apr 2024 16:44:54 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Rick Macklem <rick.macklem@gmail.com>
Cc:        Alan Somers <asomers@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: SEEK_HOLE at EOF
Message-ID:  <CANCZdfozg-yJjiyAVwig8bxLUzV1vAXUkipNHxXmc=0GhozuyQ@mail.gmail.com>
In-Reply-To: <CAM5tNy5btZGYz3Ya-8qFObycdmyWZEnuAOHquW6FNWjcL8_DuA@mail.gmail.com>
References:  <CAOtMX2gaHkH7gRT1OWTNpZEcr13%2BiozicmUDZ1hEapT6oiXiuQ@mail.gmail.com> <CAM5tNy7o%2BEpuFFfZ_4fEMmzDLydC6PkhgtcDjQ5mgufb5_7TVg@mail.gmail.com> <CAOtMX2giiOx5vTkUujU29JsbY8O2EqcMRDvtOQYQNbCfZZPLjg@mail.gmail.com> <CAM5tNy5btZGYz3Ya-8qFObycdmyWZEnuAOHquW6FNWjcL8_DuA@mail.gmail.com>

--00000000000017e07c06154d17b1
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Thu, Apr 4, 2024 at 3:39 PM Rick Macklem <rick.macklem@gmail.com> wrote:

> On Thu, Apr 4, 2024 at 1:59 PM Alan Somers <asomers@freebsd.org> wrote:
> >
> > On Thu, Apr 4, 2024 at 2:56 PM Rick Macklem <rick.macklem@gmail.com> wrote:
> > >
> > > On Thu, Apr 4, 2024 at 11:15 AM Alan Somers <asomers@freebsd.org> wrote:
> > > >
> > > > tldr; there are two problems:
> > > > 1) tmpfs handles SEEK_HOLE differently than other file systems
> > > > 2) everything else handles SEEK_HOLE at EOF poorly, IMHO
> > > >
> > > > Details:
> > > >
> > > > According to lseek(2), SEEK_HOLE should return the start of the next
> > > > hole greater than or equal to the supplied offset.  Also, each file
> > > > has a zero-sized virtual hole at the very end of the file.  So I would
> > > > expect that calling SEEK_HOLE at EOF would return the file's size.
> > > > However, the man page also says that SEEK_HOLE will return ENXIO when
> > > > the offset points to EOF.  Those two statements seem contradictory to
> > > > me.  The first behavior seems more logical.  I would expect SEEK_HOLE
> > > > to work the same way both at EOF and at any other file offset.
> > > >
> > > > What does the spec say?
> > > >
> > > > There is no POSIX standard for this.  It was invented by Solaris, and
> > > > Illumos's man page does not clearly say what should happen at EOF.
> > > > Linux's man page is clear.  It lists ENXIO for the case "whence is
> > > > SEEK_DATA or SEEK_HOLE, and offset is beyond the end of the file".
> > > > That would seem to indicate behavior 1: SEEK_HOLE should return the
> > > > file's size at EOF.  Only beyond EOF should it return ENXIO.
> > > Well, there is the Austin Group stuff (never ratified by POSIX as I
> > > understand it).
> > >
> > > Here's what it says about SEEK_HOLE and offset:
> > > If whence is SEEK_HOLE, the file offset shall be set to the smallest
> > > location of a byte within a hole and not less than offset, except that
> > > if offset falls within the last hole, then the file offset may be set
> > > to the file size instead. It shall be an error if offset is greater
> > > or equal to the size of the file.
> > >
> > > I'd suggest we follow this, since it is the closest to a standard that
> > > there is.
> >
> > That sounds like behavior 2: return ENXIO at EOF.  For reference, do
> > you have a link to that somewhere?
> 0000415: add SEEK_HOLE, SEEK_DATA to lseek - Austin Group Defect
> Tracker (austingroupbugs.net)
> If this doesn't give you a link (gmail never shows the raw url for me)
> just google
> "SEEK_HOLE austin group".
>

You have to join the mailing list to have access. It's easy to do. You can
then download the latest draft (which I think is the ballot draft, so will
be quite close to final, usually just 'typos' and such are corrected before
the published standard). This will be the next POSIX.1 standard, likely this
year.

So it's kinda hard to give an exact link :(.
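
For anyone who wants to see which of the two behaviors discussed above a
given file system implements, here is a minimal C sketch. It only relies on
lseek(2) with SEEK_HOLE. The helper names (probe_seek_hole, make_data_file,
describe) and the /tmp path are made up for illustration, and the fallback
value 4 for SEEK_HOLE is an assumption that matches the Linux and FreeBSD
headers.

```c
#define _GNU_SOURCE		/* glibc only exposes SEEK_HOLE with this set */
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_HOLE
#define SEEK_HOLE 4		/* assumption: value used by Linux and FreeBSD */
#endif

/*
 * Hypothetical helper: run lseek(fd, offset, SEEK_HOLE).
 * Returns 0 and stores the resulting offset in *out on success,
 * otherwise returns the errno value set by lseek().
 */
static int
probe_seek_hole(int fd, off_t offset, off_t *out)
{
	off_t r;

	errno = 0;
	r = lseek(fd, offset, SEEK_HOLE);
	if (r == (off_t)-1)
		return (errno);
	*out = r;
	return (0);
}

/*
 * Hypothetical helper: create an unlinked temporary file holding len
 * bytes of data (no holes written).  Returns the descriptor, or -1.
 */
static int
make_data_file(size_t len)
{
	char path[] = "/tmp/seekhole.XXXXXX";
	int fd;
	size_t i;

	fd = mkstemp(path);
	if (fd == -1)
		return (-1);
	(void)unlink(path);	/* the file disappears once fd is closed */
	for (i = 0; i < len; i++) {
		if (write(fd, "x", 1) != 1) {
			(void)close(fd);
			return (-1);
		}
	}
	return (fd);
}

/* Print what SEEK_HOLE does at one offset; no particular result is assumed. */
static void
describe(int fd, off_t offset)
{
	off_t out;
	int rc;

	rc = probe_seek_hole(fd, offset, &out);
	if (rc == 0)
		printf("SEEK_HOLE at %jd -> %jd\n", (intmax_t)offset,
		    (intmax_t)out);
	else
		printf("SEEK_HOLE at %jd -> error: %s\n", (intmax_t)offset,
		    strerror(rc));
}
```

With fd = make_data_file(16), calling describe(fd, 16) probes the EOF case:
a result of 16 corresponds to behavior 1, and ENXIO corresponds to behavior 2
(the Austin Group wording quoted above). describe(fd, 17) probes an offset
beyond EOF.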

Warner

--00000000000017e07c06154d17b1--