Date: Thu, 4 Apr 2024 13:56:31 -0700
From: Rick Macklem <rick.macklem@gmail.com>
To: Alan Somers <asomers@freebsd.org>
Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: SEEK_HOLE at EOF
Message-ID: <CAM5tNy7o%2BEpuFFfZ_4fEMmzDLydC6PkhgtcDjQ5mgufb5_7TVg@mail.gmail.com>
In-Reply-To: <CAOtMX2gaHkH7gRT1OWTNpZEcr13%2BiozicmUDZ1hEapT6oiXiuQ@mail.gmail.com>
References: <CAOtMX2gaHkH7gRT1OWTNpZEcr13%2BiozicmUDZ1hEapT6oiXiuQ@mail.gmail.com>

On Thu, Apr 4, 2024 at 11:15 AM Alan Somers <asomers@freebsd.org> wrote:
>
> tldr; there are two problems:
> 1) tmpfs handles SEEK_HOLE differently than other file systems
> 2) everything else handles SEEK_HOLE at EOF poorly, IMHO
>
> Details:
>
> According to lseek(2), SEEK_HOLE should return the start of the next
> hole greater than or equal to the supplied offset.  Also, each file
> has a zero-sized virtual hole at the very end of the file.  So I would
> expect that calling SEEK_HOLE at EOF would return the file's size.
> However, the man page also says that SEEK_HOLE will return ENXIO when
> the offset points to EOF.  Those two statements seem contradictory to
> me.  The first behavior seems more logical.  I would expect SEEK_HOLE
> to work the same way both at EOF and at any other file offset.
>
> What does the spec say?
>
> There is no POSIX standard for this.  It was invented by Solaris.
> Illumos's man page does not clearly say what should happen at EOF.
> Linux's man page is clear: "whence is SEEK_DATA or SEEK_HOLE, and
> offset is beyond the end of the file".  That would seem to indicate
> behavior 1: SEEK_HOLE should return the file's size at EOF.  Only
> beyond EOF should it return ENXIO.

Well, there is the Austin Group stuff (never ratified by POSIX as I
understand it).
Here's what it says about SEEK_HOLE and offset:

If whence is SEEK_HOLE, the file offset shall be set to the smallest
location of a byte within a hole and not less than offset, except that
if offset falls within the last hole, then the file offset may be set
to the file size instead. It shall be an error if offset is greater or
equal to the size of the file.

I'd suggest we follow this, since it is the closest to a standard that
there is.

rick

>
> But what do other implementations do?
>
> Contrary to its man page, Linux behaves mostly like FreeBSD.  SEEK_HOLE
> returns ENXIO at EOF on most file systems.  I tested a number of file
> systems on both FreeBSD and Linux.  Most of them return ENXIO.  The
> only two outliers are FreeBSD's tmpfs and Linux's NFS client.
>
>             FreeBSD      Linux
> =======     =========    =====
> UFS         ENXIO
> ZFS         ENXIO
> tmpfs       file size    ENXIO
> msdosfs     ENXIO        ENXIO
> ext2fs      ENXIO        ENXIO
> xfs                      ENXIO
> tarfs       ENXIO
> nfs         ENXIO        file size
>
> So what should we change?  Clearly, it's bad for tmpfs to be
> inconsistent.  My preference would be for everything to behave like
> tmpfs, but it's currently losing the popularity contest.  Anybody else
> have thoughts?
>
> -Alan
>
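
For anyone who wants to reproduce the table on another file system, here is a minimal sketch in Python rather than C, purely for brevity. The helper names (seek_hole_at, austin_seek_hole, probe) are made up for illustration and are not from the thread. austin_seek_hole() is only a toy model of the Austin Group wording quoted above. It assumes holes are given as half-open (start, end) byte ranges, treats the zero-sized virtual hole at EOF as the fallback, and ignores the optional "may be set to the file size" latitude for the last hole.

```python
import errno
import os
import tempfile


def seek_hole_at(path, offset):
    """Return lseek(fd, offset, SEEK_HOLE) for path, or the string "ENXIO"."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, offset, os.SEEK_HOLE)
    except OSError as e:
        if e.errno == errno.ENXIO:
            return "ENXIO"
        raise
    finally:
        os.close(fd)


def austin_seek_hole(size, holes, offset):
    """Toy model of the quoted Austin Group wording (an assumption, not kernel code).

    holes: list of (start, end) half-open byte ranges inside [0, size).
    """
    if offset >= size:
        # "It shall be an error if offset is greater or equal to the size"
        raise OSError(errno.ENXIO, os.strerror(errno.ENXIO))
    for start, end in sorted(holes):
        if offset < end:
            # Smallest byte within a hole that is not less than offset.
            return max(start, offset)
    # No real hole at or after offset: fall back to the virtual hole at EOF.
    return size


def probe(directory=None):
    """Write a small hole-free file in directory and call SEEK_HOLE at its EOF."""
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        f.write(b"x" * 4096)
        f.flush()
        size = os.fstat(f.fileno()).st_size
        return size, seek_hole_at(f.name, size)


if __name__ == "__main__":
    size, result = probe()
    print(f"file size {size}, SEEK_HOLE at EOF -> {result}")
```

Passing a directory on the file system of interest to probe() should show which of the two behaviors that file system has: the file size, or ENXIO. Under the Austin Group wording as modelled here, the EOF case is always ENXIO.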