Date: Mon, 1 Feb 2016 11:22:18 -0800
From: Maxim Sobolev <sobomax@FreeBSD.org>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: freebsd-fs@freebsd.org, Kirk McKusick <mckusick@mckusick.com>
Subject: Re: Inconsistency between lseek(SEEK_HOLE) and lseek(SEEK_DATA)
Message-ID: <CAH7qZftsv_0ersqexJ0fTnSQexe4WvpMLnF6X9bj_wX6q9Ewfw@mail.gmail.com>
In-Reply-To: <20160201182257.GN91220@kib.kiev.ua>
References: <CAH7qZfuZNZ%2BJDPC4D1sjXj2tFxZKBiYVyTp-Y3UUUoq9er%2BWYQ@mail.gmail.com>
 <20160201165648.GM91220@kib.kiev.ua>
 <CAH7qZfvcpBo%2BvDho4GeNYWh6N83sebUi-DSG9--T%2BnxQiLhJ1A@mail.gmail.com>
 <20160201182257.GN91220@kib.kiev.ua>
Well, it still seems to be quite obscure. At the very least, the
lseek(2) manual page needs to reflect that. Right now it says:

ERRORS
[...]
     [ENXIO]    For SEEK_DATA, there are no more data regions past the
                supplied offset.  For SEEK_HOLE, there are no more holes
                past the supplied offset.

Which is not true: SEEK_HOLE returns st_size when there are no more
holes past the supplied offset, not ENXIO. It is also interesting that
an empty file is somehow a special case as well: both SEEK_HOLE and
SEEK_DATA return -1 on those. Anybody who programs to that document
would probably get as confused as I was.

However, having said that, our cousin Linux behaves the same, i.e. it
returns EOF+1 on SEEK_HOLE and -1 on SEEK_DATA, and does the same for
empty files, so at least we are consistent with that.

-Max

On Mon, Feb 1, 2016 at 10:22 AM, Konstantin Belousov <kostikbel@gmail.com>
wrote:

> On Mon, Feb 01, 2016 at 09:17:49AM -0800, Maxim Sobolev wrote:
> > Here it is:
> >
> > The expected outcome is return code 0, the failure condition is in the
> > lseek() returning 4 (i.e. sizeof(int)), not -1.
> >
> > ------
> > #include <sys/stat.h>
> > #include <sys/types.h>
> > #include <fcntl.h>
> > #include <stdint.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <unistd.h>
> >
> > int main(void)
> > {
> >     char tempname[] = "/tmp/temp.XXXXXX";
> >     char *fname;
> >     int fd;
> >     off_t hole;
> >
> >     fname = mktemp(tempname);
> >     if (fname == NULL) {
> >         exit (1);
> >     }
> >     fd = open(fname, O_WRONLY | O_CREAT | O_TRUNC, DEFFILEMODE);
> >     if (fd == -1) {
> >         exit (1);
> >     }
> >     if (write(fd, &fd, sizeof(fd)) <= 0) {
> >         exit (1);
> >     }
> >     hole = lseek(fd, 0, SEEK_HOLE);
> >     close(fd);
> >     unlink(fname);
> >     if (hole >= 0) {
> >         fprintf(stderr, "lseek() returned %jd, not -1\n",
> >           (intmax_t)hole);
> >         exit (1);
> >     }
> >     exit (0);
> > }
> > ------
> I tested your program on both UFS and ZFS, and the behaviour is
> identical, lseek(SEEK_HOLE) points to the end of file.
> In fact, when I did the UFS implementation, I most likely considered
> this case and tested ZFS compatibility, because the case is handled
> explicitly. Look at the lines 2193-2197 in
> kern/vfs_vnops.c:vn_bmap_seekhole(), esp. the comment.
>
> For me, the results of the test are reasonable. There is no data
> after EOF, and the idea of an 'implicit hole' after EOF is one which
> is quite intuitive.
>
> > On Mon, Feb 1, 2016 at 8:56 AM, Konstantin Belousov
> > <kostikbel@gmail.com> wrote:
> >
> > > On Mon, Feb 01, 2016 at 07:57:40AM -0800, Maxim Sobolev wrote:
> > > > Hi,
> > > >
> > > > I've noticed that lseek() behaves inconsistently with regard to the
> > > > SEEK_HOLE and SEEK_DATA operations. SEEK_HOLE on a data-only file
> > > > returns st_size (i.e. EOF + 1), while SEEK_DATA on a hole-only file
> > > > returns -1 and sets errno to ENXIO. The latter seems to be the
> > > > documented way to indicate that the file has no more data sections
> > > > past this point.
> > > >
> > > > My first idea was that somehow most files have a hole attached to
> > > > their end to fill up the FS block, but that does not seem to be the
> > > > case. Trying to SEEK_HOLE past the end of any of those data-only
> > > > files produces an error (i.e. lseek(fd, st_size, SEEK_HOLE) == -1).
> > > >
> > > > In short, for some reason I cannot get a proper ENXIO from
> > > > SEEK_HOLE. What is currently returned implies that there is a 1-byte
> > > > hole attached to each file past its EOF, and that does not smell
> > > > right.
> > > >
> > > > All tests are done on UFS, fairly recent 11-current.
> > > >
> > >
> > > There are no 'hole-only' files on UFS, the last byte in the UFS file
> > > must be populated, either by an allocated fragment if the last byte
> > > is in the direct blocks range, or by the full block if in the
> > > indirect range.
> > >
> > > Please show an exact minimal test case which reproduces what you
> > > consider the bug, with the comment about the expected outcome in the
> > > failing location.
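
The semantics worked out in this thread can be illustrated with a small
helper that walks the data regions of a file. This is a minimal sketch, not
code from the thread: next_data_extent() and data_only_extent_end() are
hypothetical names, and it assumes the behaviour described above, i.e.
SEEK_DATA fails with ENXIO once no data remains, while SEEK_HOLE reports the
implicit hole at st_size instead of failing.

```c
#define _GNU_SOURCE	/* glibc needs this for SEEK_DATA/SEEK_HOLE; harmless on FreeBSD */
#include <sys/types.h>
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: find the next data extent at or after pos.
 * Returns 1 and stores the extent as [*start, *end) if one exists,
 * 0 if there is no data at or past pos (ENXIO, which also covers an
 * empty file), and -1 on any other error.
 */
int
next_data_extent(int fd, off_t pos, off_t *start, off_t *end)
{
	off_t data, hole;

	assert(start != NULL && end != NULL);

	data = lseek(fd, pos, SEEK_DATA);
	if (data == -1)
		return (errno == ENXIO ? 0 : -1);

	/*
	 * data lies inside a data region, so (absent a concurrent
	 * truncation) SEEK_HOLE should not fail with ENXIO here: in the
	 * worst case it reports the implicit hole at EOF, i.e. st_size.
	 */
	hole = lseek(fd, data, SEEK_HOLE);
	if (hole == -1)
		return (-1);

	*start = data;
	*end = hole;
	return (1);
}

/*
 * Mirrors the test case from the thread: create a temporary file that
 * holds a single int and no holes, then return the end offset of its
 * first data extent, or -1 on failure.
 */
off_t
data_only_extent_end(void)
{
	char path[] = "/tmp/temp.XXXXXX";
	off_t start, end;
	int fd, rc, val = 0;

	fd = mkstemp(path);
	if (fd == -1)
		return (-1);
	unlink(path);
	if (write(fd, &val, sizeof(val)) != (ssize_t)sizeof(val)) {
		close(fd);
		return (-1);
	}
	rc = next_data_extent(fd, 0, &start, &end);
	close(fd);
	return (rc == 1 ? end : -1);
}
```

A caller would typically loop, passing the previous *end back in as pos
until next_data_extent() returns 0. With the semantics discussed above, the
data-only file is expected to yield a single extent ending at st_size, so
the "hole" at EOF acts as a terminator rather than as an error.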