Date: Tue, 13 Nov 2018 15:50:35 -0700
From: Warner Losh <imp@bsdimp.com>
To: Alan Somers <asomers@freebsd.org>
Cc: "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>, FreeBSD FS <freebsd-fs@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: Hole-punching, TRIM, etc
Message-ID: <CANCZdfp5UDcH-SLDVvvhkB0dTnhuP0tZ8YT0tUJkF8egAZgYuA@mail.gmail.com>
In-Reply-To: <CAOtMX2jgb_Pf9-MqirM=xihVpyRmAGZKx2VRnvA_1Fx6kMYXXg@mail.gmail.com>
References: <CAOtMX2jgb_Pf9-MqirM=xihVpyRmAGZKx2VRnvA_1Fx6kMYXXg@mail.gmail.com>
On Tue, Nov 13, 2018 at 3:10 PM Alan Somers <asomers@freebsd.org> wrote:

> Hole-punching has been discussed on these lists before[1].  It basically
> means to turn a dense file into a sparse file by deallocating storage for
> some of the blocks in the middle.  There's no standard API for it.  Linux
> uses fallocate(2); Solaris and OSX add a new opcode to fcntl(2).
>
> A related concept is telling a block device that some blocks are no longer
> used.  SATA calls this "TRIM", SCSI calls it "UNMAP", NVMe calls it
> "Deallocate", ZBC and ZAC call it "Reset Write Pointer".  They all do
> basically the same thing, and it's analogous to hole-punching for regular
> files.  They are also all inaccessible from FreeBSD's userland except by
> using pass(4), which is inconvenient and protocol-specific.
>
> Linux has a BLKDISCARD ioctl for issuing TRIM-like commands from userland,
> but it's totally undocumented and doesn't work on regular files.
>
> I propose adding support for all of these things using the fcntl(2) API.
> Using the same syntax that Solaris defined, you would be able to punch a
> hole in a regular file or TRIM blocks from an SSD.  ZFS already supports it
> (though FreeBSD's port never did, and the code was deleted in r303763).
> Here's what I would do:
>
> 1) Add the F_FREESP command to fcntl(2).
> 2) Add a .fo_space field for struct fileops
> 3) Add a devfs_space method that implements .fo_space
> 4) Add a .d_space field to struct cdevsw
> 5) Add a g_dev_space method for GEOM that implements .d_space using
>    BIO_DELETE.
> 6) Add a VOP_SPACE vop
> 7) Implement VOP_SPACE for tmpfs
> 8) Add aio_freesp(2), an asynchronous version of fcntl(F_FREESP).
>
> The greatest beneficiaries of this work would be type 2 hypervisors like
> QEMU and VirtualBox with guests that use TRIM, and userland filesystems
> such as fusefs-ext2 and fusefs-exfat.  High-performance storage systems
> using SPDK would also benefit.  The last item, aio_freesp(2), may seem
> unnecessary but it would really benefit my application.
>
> Questions, objections, flames?
>

So the fcntl would deallocate blocks from a filesystem only. The filesystem
may issue BIO_DELETE as a result, but that's up to the filesystem, correct?
On a raw device it would be translated into a BIO_DELETE command directly,
correct?

Warner
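
For illustration only, here is a minimal userland sketch of what a call in the Solaris-style syntax might look like. It assumes the Solaris convention that the range is described by a struct flock (l_whence, l_start, l_len, with l_len == 0 meaning "to end of file"). The helper names freesp_range() and free_space() are hypothetical and are not part of the proposal. Where F_FREESP is not defined (it is only being proposed for FreeBSD in this thread), the wrapper simply fails with ENOTSUP. Whether the request ends up as a filesystem deallocation or a BIO_DELETE would be decided in the kernel and is not visible at this level.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: describe the byte range starting at `start` and
 * `len` bytes long in a struct flock, the argument type Solaris uses
 * for F_FREESP.  By that convention l_len == 0 means "to end of file".
 */
static struct flock
freesp_range(off_t start, off_t len)
{
	struct flock fl;

	assert(start >= 0 && len >= 0);
	memset(&fl, 0, sizeof(fl));
	fl.l_whence = SEEK_SET;		/* offsets are absolute */
	fl.l_start = start;
	fl.l_len = len;
	return (fl);
}

/*
 * Hypothetical wrapper: ask the kernel to free the given range of fd.
 * The same call would be used for a regular file or a device node.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int
free_space(int fd, off_t start, off_t len)
{
	struct flock fl = freesp_range(start, len);

#ifdef F_FREESP
	return (fcntl(fd, F_FREESP, &fl));
#else
	/* Assumption: no F_FREESP on this platform, so report "unsupported". */
	(void)fd;
	(void)fl;
	errno = ENOTSUP;
	return (-1);
#endif
}
```

A caller would open the file or device read-write and invoke, for example, free_space(fd, 4096, 8192) to release 8 KiB starting at offset 4 KiB, checking errno on failure.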