From: Konstantin Belousov
To: Maxim Sobolev
Cc: freebsd-fs@freebsd.org, Kirk McKusick
Date: Mon, 1 Feb 2016 20:22:57 +0200
Subject: Re: Inconsistency between lseek(SEEK_HOLE) and lseek(SEEK_DATA)
Message-ID: <20160201182257.GN91220@kib.kiev.ua>
References: <20160201165648.GM91220@kib.kiev.ua>

On Mon, Feb 01, 2016 at 09:17:49AM -0800, Maxim Sobolev wrote:
> Here it is:
>
> The expected outcome is return code 0, the failure condition is in the
> lseek() returning 4 (i.e. sizeof(int)), not -1.
>
> ------
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <stdint.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
>
> int main(void)
> {
>     char tempname[] = "/tmp/temp.XXXXXX";
>     char *fname;
>     int fd;
>     off_t hole;
>
>     fname = mktemp(tempname);
>     if (fname == NULL) {
>         exit (1);
>     }
>     fd = open(fname, O_WRONLY | O_CREAT | O_TRUNC, DEFFILEMODE);
>     if (fd == -1) {
>         exit (1);
>     }
>     if (write(fd, &fd, sizeof(fd)) <= 0) {
>         exit (1);
>     }
>     hole = lseek(fd, 0, SEEK_HOLE);
>     close(fd);
>     unlink(fname);
>     if (hole >= 0) {
>         fprintf(stderr, "lseek() returned %jd, not -1\n",
>             (intmax_t)hole);
>         exit (1);
>     }
>     exit (0);
> }
> ------

I tested your program on both UFS and ZFS, and the behaviour is
identical: lseek(SEEK_HOLE) points to the end of file. In fact, when I
did the UFS implementation, I most likely considered this case and
tested ZFS compatibility, because the case is handled explicitly. Look
at lines 2193-2197 in kern/vfs_vnops.c:vn_bmap_seekhole(), esp. the
comment.

For me, the results of the test are reasonable. There is no data after
EOF, and the idea of an 'implicit hole' after EOF is quite intuitive.

>
>
> On Mon, Feb 1, 2016 at 8:56 AM, Konstantin Belousov wrote:
>
> > On Mon, Feb 01, 2016 at 07:57:40AM -0800, Maxim Sobolev wrote:
> > > Hi,
> > >
> > > I've noticed that lseek() behaved inconsistently with regards to
> > > SEEK_HOLE and SEEK_DATA operations. The SEEK_HOLE on a data-only file
> > > returns st_size (i.e. EOF + 1), while the SEEK_DATA on a hole-only
> > > file returns -1 and sets errno to ENXIO. The latter seems to be a
> > > documented way to indicate that the file has no more data sections
> > > past this point.
> > >
> > > My first idea was that somehow most files has a hole attached to its
> > > end to fill up the FS block, but that does not seem to be a case.
> > > Trying to SEEK_HOLE past the end of any of those data-only files
> > > produces an error (i.e. lseek(fd, st_size, SEEK_HOLE) == -1).
> > >
> > > In short, for some reason I cannot get proper ENXIO from the
> > > SEEK_HOLE. What currently returned implies that there is 1-byte hole
> > > attached to each file past its EOF and that does not smell right.
> > >
> > > All tests are done on UFS, fairly recent 11-current.
> > >
> >
> > There is no 'hole-only' files on UFS, the last byte in the UFS file must
> > be populated, either by allocated fragment if the last byte is in the
> > direct blocks range, or by the full block if in the indirect range.
> >
> > Please show an exact minimal test case which reproduces what you
> > consider the bug, with the comment about the expected outcome in the
> > failing location.
> >
>
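
For a caller who wants to tell a real hole from the implicit one at EOF
discussed above, a minimal sketch could compare the SEEK_HOLE result
against st_size. The helper names has_real_hole() and
data_only_file_has_hole(), the temp-file template, and the _GNU_SOURCE
define are assumptions made for this illustration; none of them come
from the thread or from the kernel sources.

```c
/* Assumption: glibc needs this to expose SEEK_HOLE; it is ignored on FreeBSD. */
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not a system API.
 * Returns 1 if the file has a hole before EOF, 0 if the only "hole" is
 * the implicit one at EOF, and -1 on error (errno is set by the failing
 * call).  Note that lseek(SEEK_HOLE) moves the file offset.
 */
static int
has_real_hole(int fd)
{
    struct stat sb;
    off_t hole;

    assert(fd >= 0);
    if (fstat(fd, &sb) == -1)
        return (-1);
    if (sb.st_size == 0)
        return (0);     /* empty file: offset 0 is already EOF */
    hole = lseek(fd, 0, SEEK_HOLE);
    if (hole == -1)
        return (-1);
    /* hole == st_size means only the implicit hole at EOF was found. */
    return (hole < sb.st_size);
}

/*
 * Hypothetical demo mirroring the test case in this thread: write
 * sizeof(int) bytes to a fresh temporary file and classify the result.
 */
static int
data_only_file_has_hole(void)
{
    char path[] = "/tmp/seekhole.XXXXXX";
    int fd, rv;

    fd = mkstemp(path);
    if (fd == -1)
        return (-1);
    unlink(path);       /* the open descriptor keeps the file alive */
    if (write(fd, &fd, sizeof(fd)) != (ssize_t)sizeof(fd)) {
        close(fd);
        return (-1);
    }
    rv = has_real_hole(fd);
    close(fd);
    return (rv);
}
```

Under the behaviour described in the reply, data_only_file_has_hole()
should return 0 for the 4-byte file, since lseek() reports the hole at
offset st_size rather than failing with ENXIO.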