Date: Mon, 1 Feb 2016 21:40:14 +0200
From: Konstantin Belousov
To: Maxim Sobolev
Cc: freebsd-fs@freebsd.org, Kirk McKusick
Subject: Re: Inconsistency between lseek(SEEK_HOLE) and lseek(SEEK_DATA)
Message-ID: <20160201194014.GQ91220@kib.kiev.ua>
References: <20160201165648.GM91220@kib.kiev.ua> <20160201182257.GN91220@kib.kiev.ua>

On Mon, Feb 01, 2016 at 11:22:18AM -0800, Maxim Sobolev wrote:
> Well, it still seems to be quite obscure. At the very least, the lseek(2)
> manual page needs to reflect that. Right now it says:
>
> ERRORS
> [...]
>      [ENXIO]    For SEEK_DATA, there are no more data regions past the
>                 supplied offset.  For SEEK_HOLE, there are no more
>                 holes past the supplied offset.
>
> Which is not true, the SEEK_HOLE would return st_size when there are no
> more holes past the supplied offset, not ENXIO. It is also interesting that
> somehow empty file is a special case as well. Both SEEK_HOLE and SEEK_DATA
> return -1 on those. Anybody who programs to that document would probably
> get as confused as myself.
>
> However, having said that, our cousin Linux behaves the same - i.e. returns
> EOF+1 on SEEK_HOLE and -1 on SEEK_DATA, and does the same for empty files,
> so at least we are consistent with that.

Actually, since you referred to the man page for lseek(2), which seems to
be copied from the Solaris man page, note that it also says:

...
     The existence of a hole at the end of every data region allows for
     easy programming and implies that a virtual hole exists at the end
     of the file.
...

The text you quoted does not imply that the call must return ENXIO for
SEEK_HOLE at EOF. It only allows the call to do so, and the other language
in the page makes that behavior unreasonable.

Note that it is the Solaris implementation of SEEK_HOLE and SEEK_DATA, not
the Linux one, that serves as the reference for the behavior. We got it with
the ZFS import. Our UFS implementation, and whatever Linux does, are only
reimplementations without clean documentation, and were done by observing
ZFS behaviour.
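
For anyone who wants to check the cases discussed above on their own system, here is a minimal sketch in C. The helpers make_file() and seek_or_errno() are hypothetical names made up for this illustration and are not part of any system API; the /tmp location is likewise an assumption. On a file without holes one would expect SEEK_HOLE from offset 0 to report the file size (the virtual hole at EOF), and SEEK_DATA at EOF, or either whence on an empty file, to fail with ENXIO.

```c
#define _GNU_SOURCE     /* assumption: needed on glibc to expose SEEK_HOLE/SEEK_DATA */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create an unlinked temporary file that holds
 * `len` bytes of real data (no holes) and return its descriptor. */
static int make_file(size_t len)
{
    char path[] = "/tmp/seekholeXXXXXX";   /* assumed scratch location */
    char buf[512];
    int fd = mkstemp(path);

    assert(fd >= 0);
    unlink(path);                          /* file vanishes when fd is closed */
    memset(buf, 'x', sizeof(buf));
    while (len > 0) {
        size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
        ssize_t n = write(fd, buf, chunk);

        assert(n > 0);
        len -= (size_t)n;
    }
    return fd;
}

/* Hypothetical helper: lseek() that returns -errno on failure, so the
 * caller can tell ENXIO apart from other errors in a single value. */
static off_t seek_or_errno(int fd, off_t off, int whence)
{
    off_t r = lseek(fd, off, whence);

    return r == (off_t)-1 ? -(off_t)errno : r;
}
```

With a descriptor from make_file(4096), seek_or_errno(fd, 0, SEEK_HOLE) should return 4096 and seek_or_errno(fd, 4096, SEEK_DATA) should return -ENXIO; with make_file(0), both whence values should give -ENXIO. This only observes behaviour and does not settle what the man page ought to say.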