Date: Mon, 12 Feb 2018 09:04:57 -0800
From: John Baldwin <jhb@freebsd.org>
To: freebsd-current@freebsd.org
Cc: Garrett Wollman <wollman@hergotha.csail.mit.edu>, asomers@freebsd.org
Subject: Re: posix_fallocate on ZFS
Message-ID: <1868530.6C5Wu4I1lN@ralph.baldwin.cx>
In-Reply-To: <201802101846.w1AIkX4Y000167@hergotha.csail.mit.edu>
References: <CAOtMX2jZr_kvJgOZWeiB-AZ3-7-uUu+UQ3P0nKhGZ0eNRzwMOQ@mail.gmail.com>
 <1e2f43fd-85da-6629-62d1-6e96790278e5@digiware.nl>
 <201802101846.w1AIkX4Y000167@hergotha.csail.mit.edu>
On Saturday, February 10, 2018 01:46:33 PM Garrett Wollman wrote:
> In article
> <CAOtMX2jZr_kvJgOZWeiB-AZ3-7-uUu+UQ3P0nKhGZ0eNRzwMOQ@mail.gmail.com>,
> asomers@freebsd.org writes:
>
> >On Sat, Feb 10, 2018 at 10:28 AM, Willem Jan Withagen <wjw@digiware.nl>
> >wrote:
>
> >> Is there any expectation that this is going to fixed in any near future?
>
> >No. It's fundamentally impossible to support posix_fallocate on a COW
> >filesystem like ZFS. Ceph should be taught to ignore an EINVAL result,
> >since the system call is merely advisory.
>
> I don't think it's true that this is _fundamentally_ impossible. What
> the standard requires would in essence be a per-object refreservation.
> ZFS supports refreservation, obviously, but not on a per-object basis.
> Furthermore, there are mechanisms to preallocate blocks for things
> like dumps. So it *could* be done (as in, the concept is there), but
> it may not be practical. (And ultimately, there are ways in which the
> administrator might manage the system that would defeat the desired
> effect, but that's out of the standard's scope.) Given the semantic
> mismatch, though, I suspect it's unreasonable to expect anyone to
> prioritize implementation of such a feature.

I don't think posix_fallocate() can be compatible with COW. Suppose you
do reserve a fixed set of blocks. That ensures the first write has a
place to write, but not if you overwrite one of those blocks. You'd have
to reserve another block to maintain the reservation each time you wrote
to a block, or you'd have to have a way to mark a file as not COW. The
first case isn't really any better than not using posix_fallocate() in
the first place as you are still requiring writes to allocate blocks,
and the second seems a bit fraught with peril as well if the application
is expecting the non-COW'd file to be in sync with other files in the
system since presumably non-COW'd files couldn't be snapshotted, etc.

-- 
John Baldwin
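
To make the overwrite case above concrete, here is a minimal toy model in
Python. It is only a sketch and says nothing about how ZFS actually
accounts for space: the CowPool class, its methods, and the file names
"db" and "log" are all hypothetical. The one assumption it encodes is
that a copy-on-write write must obtain a new block before the old copy
of that offset can be released.

```python
import errno


class CowPool:
    """Hypothetical toy copy-on-write pool (not ZFS): every write needs a new block."""

    def __init__(self, total_blocks):
        self.free = total_blocks  # blocks not owned or reserved by anyone
        self.reserved = {}        # file name -> blocks reserved but not yet used
        self.live = {}            # file name -> set of offsets that hold data

    def fallocate(self, name, nblocks):
        """Set aside nblocks for one file, like a per-object reservation."""
        if nblocks > self.free:
            raise OSError(errno.ENOSPC, "no space to reserve")
        self.free -= nblocks
        self.reserved[name] = self.reserved.get(name, 0) + nblocks

    def write(self, name, offset):
        """Write one block; the new copy is allocated before the old one is freed."""
        if self.reserved.get(name, 0) > 0:
            self.reserved[name] -= 1      # consume the file's reservation
        elif self.free > 0:
            self.free -= 1                # fall back to the shared free pool
        else:
            raise OSError(errno.ENOSPC, "no block for the new copy")
        offsets = self.live.setdefault(name, set())
        if offset in offsets:
            self.free += 1                # overwrite: old copy is released afterwards
        else:
            offsets.add(offset)

    def try_write(self, name, offset):
        """Return True if the write succeeded, False if it hit ENOSPC."""
        try:
            self.write(name, offset)
            return True
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise
            return False


if __name__ == "__main__":
    pool = CowPool(total_blocks=8)
    pool.fallocate("db", 4)               # reserve four blocks for "db"
    for off in range(4):
        pool.write("log", off)            # another file fills the rest of the pool
    print("first writes:", [pool.try_write("db", off) for off in range(4)])
    print("overwrite of block 0:", pool.try_write("db", 0))
```

In this model the reservation covers the first write to each block of
"db", but once it is used up an overwrite has to compete for free space
like any other write, and fails if the pool is full. Topping the
reservation back up after every write would just move the allocation
into the write path again.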
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1868530.6C5Wu4I1lN>