From: Maxim Sobolev
Date: Mon, 1 Feb 2016 21:17:00 -0800
Subject: Re: Inconsistency between lseek(SEEK_HOLE) and lseek(SEEK_DATA)
To: Konstantin Belousov
Cc: freebsd-fs@freebsd.org, Kirk McKusick

WRT the:

> There is no 'hole-only' files on UFS, the last byte in the UFS file must
> be populated, either by allocated fragment if the last byte is in the
> direct blocks range, or by the full block if in the indirect range.

Indeed, UFS resists putting a hole at the end of the file, yet it's
possible to arrange a hole-only situation by first truncating an empty file
to some size that is greater than the target hole size, so that you get a
hole of the desired size followed by a bit of data, and then truncating
the resulting file back to the offset where the data starts:

-----
fd = open(fname, O_WRONLY | O_CREAT | O_TRUNC, DEFFILEMODE);
if (fd == -1) {
    exit (1);
}
if (ftruncate(fd, 1024 * 128) < 0) {
    exit (1);
}
data = lseek(fd, 0, SEEK_DATA);
if (data >= 0 && ftruncate(fd, data) < 0) {
    exit (1);
}
-----
[sobomax@rtpdev ~/projects/freebsd11/usr.bin/lsholes]$ ./lsholes /tmp/temp.MgoPPo
Type    Start      End     Size
HOLE        0    98303    98304

Total HOLE: 98304 (100.00%)
Total DATA: 0 (0.00%)
[sobomax@rtpdev ~/projects/freebsd11/usr.bin/lsholes]$ ls -l /tmp/temp.MgoPPo
-rw-r--r-- 1 sobomax wheel 98304 Feb 1 21:06 /tmp/temp.MgoPPo
-----

I don't know if operating on that file would result in some data
corruption, but I also seem to have no issues creating hole-only files on
ZFS using my fallocate(2) syscall.

-Max

On Mon, Feb 1, 2016 at 12:14 PM, Maxim Sobolev wrote:

> Yeah, I've noticed that text now. It looks a lot like the sentence has
> been copied around and some part of it had lost in transition. In any case
> here is a small manpage patch to make a "virtual hole" more pronounced and
> also explain how it affects return value of the syscall.
>
> https://reviews.freebsd.org/D5162
>
> On Mon, Feb 1, 2016 at 11:40 AM, Konstantin Belousov wrote:
>
>> On Mon, Feb 01, 2016 at 11:22:18AM -0800, Maxim Sobolev wrote:
>> > Well, it still seems to be quite obscure. At the very least, the lseek(2)
>> > manual page needs to reflect that. Right now it says:
>> >
>> > ERRORS
>> > [...]
>> > [ENXIO] For SEEK_DATA, there are no more data regions past the
>> >         supplied offset. For SEEK_HOLE, there are no more
>> >         holes past the supplied offset.
>> >
>> > Which is not true, the SEEK_HOLE would return st_size when there are no
>> > more holes past the supplied offset, not ENXIO. It is also interesting that
>> > somehow empty file is a special case as well. Both SEEK_HOLE and SEEK_DATA
>> > return -1 on those. Anybody who programs to that document would probably
>> > get as confused as myself.
>> >
>> > However, having said that, our cousin Linux behaves the same - i.e. returns
>> > EOF+1 on SEEK_HOLE and -1 on SEEK_DATA, and does the same for empty files,
>> > so at least we are consistent with that.
>>
>> Actually, since you referred to the man page for lseek(2), which seems to
>> be copied from the Solaris man page:
>> ...
>> The existence of a hole at the end of every data region allows for easy
>> programming and implies that a virtual hole exists at the end of the
>> file.
>> ...
>>
>> And, the text you quoted, does not imply that the call must return ENXIO
>> at the EOF for hole. It only allows the call to do it, but other language
>> makes this unreasonable.
>>
>> Note that it is Solaris, not Linux, which implementation of the SEEK_HOLE
>> and SEEK_DATA is the arbitration sample for the behavior. We got it with
>> the ZFS import. Our UFS implementation, and whatever Linux does, are only
>> reimplementation without clean documentation, and were done by observing
>> ZFS behaviour.
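
For anyone programming against the behaviour discussed in this thread, here is a minimal sketch of a region counter that leans on the two return-value rules above: SEEK_HOLE lands on the virtual hole at st_size when no real hole follows, and SEEK_DATA fails with ENXIO once no data remains. This is not the lsholes utility shown earlier; `struct region_totals` and `count_regions()` are hypothetical names made up for illustration, and the loop is guarded by st_size so that the empty-file special case never reaches lseek().

```c
#define _GNU_SOURCE	/* assumption: needed on glibc for SEEK_DATA/SEEK_HOLE */
#include <sys/types.h>
#include <sys/stat.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical container for the byte totals; not part of any real API. */
struct region_totals {
	off_t data;	/* bytes reported as data */
	off_t hole;	/* bytes reported as holes */
};

/*
 * Hypothetical helper: walk fd from offset 0 and sum up data and hole bytes.
 * Returns 0 on success, -1 on error (errno is left as set by the failing call).
 */
int
count_regions(int fd, struct region_totals *t)
{
	struct stat sb;
	off_t pos, data, hole;

	assert(t != NULL);
	t->data = 0;
	t->hole = 0;
	if (fstat(fd, &sb) == -1)
		return (-1);

	/* An empty file never enters the loop, so it needs no special case. */
	for (pos = 0; pos < sb.st_size; pos = hole) {
		data = lseek(fd, pos, SEEK_DATA);
		if (data == -1) {
			if (errno != ENXIO)
				return (-1);
			/* ENXIO: no data at or after pos, the rest is a hole. */
			t->hole += sb.st_size - pos;
			break;
		}
		/* Anything between pos and the start of the data is a hole. */
		t->hole += data - pos;

		/*
		 * From inside a data region SEEK_HOLE should always succeed:
		 * at worst it returns st_size, i.e. the virtual hole at EOF.
		 */
		hole = lseek(fd, data, SEEK_HOLE);
		if (hole == -1)
			return (-1);
		t->data += hole - data;
	}
	return (0);
}
```

If the rules hold as described, running this against the hole-only file from the lsholes output above would be expected to take the ENXIO branch on the first iteration and report hole == st_size with data == 0. On a filesystem without hole support the result depends on the platform, so treat the totals as advisory.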