Date: Fri, 5 Apr 2024 08:13:18 -0600
From: alan somers <asomers@gmail.com>
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: Alan Somers <asomers@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: SEEK_HOLE at EOF
Message-ID: <CAOtMX2g2VxffUn0jGmc=BtcTP753-ake8nZgqCWXYUKN7JfqrA@mail.gmail.com>
In-Reply-To: <202404051354.435Ds1KX086243@critter.freebsd.dk>
References: <CAOtMX2gaHkH7gRT1OWTNpZEcr13%2BiozicmUDZ1hEapT6oiXiuQ@mail.gmail.com> <202404050543.4355hDcS009860@critter.freebsd.dk> <CAOtMX2hfxQNrk1iPtq6snYnt0EzK_ffXm5b1TnkTLCYKgW6j3A@mail.gmail.com> <202404051354.435Ds1KX086243@critter.freebsd.dk>
On Fri, Apr 5, 2024 at 7:54 AM Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
>
> --------
> Alan Somers writes:
> > On Thu, Apr 4, 2024 at 11:43 PM Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> >
> > > Just two minor quibbles:
> > >
> > > If the file position is EOF, then you /are/ "beyond the end of the file"
> > > because a read(2) would not be able to return any data.
> >
> > Do you distinguish between "at EOF" and "beyond EOF"?  And does it not
> > trouble you that calling SEEK_HOLE from the beginning of the "virtual
> > hole at EOF" will return ENXIO, even though calling SEEK_HOLE from the
> > beginning of any real hole will return the current offset?
>
> EOF is where the file ends and there's no "hole" there, because there is
> no more file on the other side of that "hole".
>
> When you stand on a cliff, the ocean is not "a hole in the landscape",
> it's where the landscape ends.

Except there is a hole at EOF, a virtual hole.  The draft spec
specifically says "all seekable files shall have a virtual hole
starting at the current size of the file".

> > > And returning ENXIO is more informative than returning the size of the
> > > file, since it atomically tells you that there are no more holes.
> >
> > Ahh, that's a good point.  It's the first point I've heard in favor of
> > this option.  Are you aware of any applications that need to know
> > that?
>
> No, but that should not get in the way of good syscall architecture :-)
>
> It might be useful for archivers which try to be smart about sparse files.

I imagine that most archivers would work like this:

    ofs = 0
    loop {
        // Find the next data region at or after ofs
        let start = lseek(fd, ofs, SEEK_DATA);
        if ENXIO {
            // No more data regions
            break
        }
        // End of this data region: the next hole at or after start
        let end = lseek(fd, start, SEEK_HOLE);
        assert!(!ENXIO)  // thanks to the virtual hole, we should never have ENXIO here
        copy(fd, start, end - start, ...)
        ofs = end
    }
    // Recreate any trailing hole by extending the output to the source size
    truncate(output_file, fd.fsize)

Since archivers really only care about data regions, not holes, I
don't think that they would usually call SEEK_HOLE at EOF.

>
> --
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.
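
For illustration only, here is a runnable Python sketch of the archiver loop in the message above. It is not taken from any real archiver. The names `data_regions` and `copy_sparse` are hypothetical; `os.lseek`, `os.SEEK_DATA`, `os.SEEK_HOLE`, `os.pread`, `os.pwrite` and `os.ftruncate` are standard-library calls on platforms that provide them. The sketch assumes a regular file that nobody modifies during the scan, and a kernel that reports ENXIO from SEEK_DATA once no data remains.

```python
import errno
import os


def data_regions(fd):
    """Yield (start, length) for each data region of the open file fd.

    Note: this moves the file position of fd as a side effect.
    """
    ofs = 0
    while True:
        try:
            # Next data at or after ofs.
            start = os.lseek(fd, ofs, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                return  # no more data regions
            raise
        # start lies inside data, so this search begins before EOF and
        # finds either a real hole or the virtual hole at the file size.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        yield start, end - start
        ofs = end


def copy_sparse(src_fd, dst_fd, chunk=1 << 16):
    """Copy only the data regions, then set the output to the source size."""
    for start, length in data_regions(src_fd):
        done = 0
        while done < length:
            buf = os.pread(src_fd, min(chunk, length - done), start + done)
            if not buf:
                break  # source shrank underneath us; stop this region
            view = memoryview(buf)
            pos = start + done
            while view:
                # pwrite may write fewer bytes than requested
                n = os.pwrite(dst_fd, view, pos)
                view = view[n:]
                pos += n
            done += len(buf)
    # Recreate any trailing hole by extending the output file.
    os.ftruncate(dst_fd, os.fstat(src_fd).st_size)
```

In this shape the loop only ever calls SEEK_HOLE from an offset inside a data region, never at EOF, which is the point made in the message. On a filesystem without hole support, the whole file is expected to show up as a single data region.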