Date:      Tue, 13 Nov 2018 15:58:26 -0700
From:      Alan Somers <asomers@freebsd.org>
To:        "Conrad E. Meyer" <cem@freebsd.org>
Cc:        freebsd-arch@freebsd.org, freebsd-fs <freebsd-fs@freebsd.org>,  FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject:   Re: Hole-punching, TRIM, etc
Message-ID:  <CAOtMX2g6TFMrunOgMoG5Wt%2BL3U6z_BAPF7SrM%2BTczQS9gwX1hQ@mail.gmail.com>
In-Reply-To: <CAG6CVpVcr=e=Dmg3JKD0BVQ9wiEWUujThAwy=PXyoyoRr_R7Og@mail.gmail.com>
References:  <CAOtMX2jgb_Pf9-MqirM=xihVpyRmAGZKx2VRnvA_1Fx6kMYXXg@mail.gmail.com> <CAG6CVpVcr=e=Dmg3JKD0BVQ9wiEWUujThAwy=PXyoyoRr_R7Og@mail.gmail.com>

On Tue, Nov 13, 2018 at 3:51 PM Conrad Meyer <cem@freebsd.org> wrote:

> Hi Alan,
>
> On Tue, Nov 13, 2018 at 2:10 PM Alan Somers <asomers@freebsd.org> wrote:
> >
> > Hole-punching has been discussed on these lists before[1].  It basically
> > means to turn a dense file into a sparse file by deallocating storage for
> > some of the blocks in the middle.  There's no standard API for it.  Linux
> > uses fallocate(2); Solaris and OSX add a new opcode to fcntl(2).
> >
> > A related concept is telling a block device that some blocks are no longer
> > used.  SATA calls this "TRIM", SCSI calls it "UNMAP", NVMe calls it
> > "Deallocate", ZBC and ZAC call it "Reset Write Pointer".  They all do
> > basically the same thing, and it's analogous to hole-punching for regular
> > files.  They are also all inaccessible from FreeBSD's userland except by
> > using pass(4), which is inconvenient and protocol-specific.
>
> Geom devices have the DIOCGDELETE ioctl, which translates into
> BIO_DELETE (which is TRIM, as I understand it).  It's available in
> libgeom as g_delete() and used by hastd, newfs_nandfs, and nandtool.
>

Ahh, I thought there must be such a thing, but I couldn't find it.
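
For anyone reading this in the archive, here is a minimal sketch of what
issuing a delete through that ioctl might look like.  discard_range() is a
hypothetical helper name, not an existing API.  The sketch assumes fd refers to
a disk device opened for writing and that the offset and length are
sector-aligned; the Linux branch uses the BLKDISCARD ioctl mentioned elsewhere
in this thread for comparison.  Discarded data is gone, so treat this as an
illustration only.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#if defined(__FreeBSD__)
#include <sys/disk.h>		/* DIOCGDELETE */
#elif defined(__linux__)
#include <linux/fs.h>		/* BLKDISCARD */
#endif

/*
 * Hypothetical helper: ask the disk device open on fd to discard the
 * byte range [offset, offset + length).
 * Returns 0 on success, -1 with errno set on failure.
 */
int
discard_range(int fd, off_t offset, off_t length)
{
	if (offset < 0 || length <= 0) {
		errno = EINVAL;
		return (-1);
	}
#if defined(__FreeBSD__)
	/* DIOCGDELETE takes off_t[2]: { byte offset, byte length }. */
	off_t range[2] = { offset, length };
	return (ioctl(fd, DIOCGDELETE, range));
#elif defined(__linux__)
	/* BLKDISCARD takes uint64_t[2]: { byte offset, byte length }. */
	uint64_t range[2] = { (uint64_t)offset, (uint64_t)length };
	return (ioctl(fd, BLKDISCARD, range));
#else
	/* No known discard ioctl on this platform. */
	(void)fd;
	errno = ENOTSUP;
	return (-1);
#endif
}
```

Neither ioctl is expected to work on a regular file, which is the gap the
F_FREESP proposal is meant to close.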


>
> > Linux has a BLKDISCARD ioctl for issuing TRIM-like commands from userland,
> > but it's totally undocumented and doesn't work on regular files.
> >
> > I propose adding support for all of these things using the fcntl(2) API.
> > Using the same syntax that Solaris defined, you would be able to punch a
> > hole in a regular file or TRIM blocks from an SSD.  ZFS already supports it
> > (though FreeBSD's port never did, and the code was deleted in r303763).
> > Here's what I would do:
> >
> > 1) Add the F_FREESP command to fcntl(2).
> > 2) Add a .fo_space field for struct fileops
> > 3) Add a devfs_space method that implements .fo_space
> > 4) Add a .d_space field to struct cdevsw
> > 5) Add a g_dev_space method for GEOM that implements .d_space using
> > BIO_DELETE.
> > 6) Add a VOP_SPACE vop
> > 7) Implement VOP_SPACE for tmpfs
> > 8) Add aio_freesp(2), an asynchronous version of fcntl(F_FREESP).
>
> Why not just add DIOCGDELETE support to various VOP_IOCTL
> implementations?  The file objects forward correctly through vn_ioctl
> to VOP_IOCTL for both regular files and devfs VCHR nodes.
>
> We can emulate the Linux API if we want to be compatible there, but I
> wouldn't bother with Solaris.
>

I prefer the Solaris API for two reasons: it doesn't require adding another
syscall, and Linux's fallocate(2) does a whole bunch of other things besides
hole-punching.
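
To make the comparison concrete, here is a rough sketch of the two calling
conventions side by side.  punch_hole() is a hypothetical wrapper name.  The
F_FREESP branch follows the Solaris convention of passing the range in a
struct flock and only compiles where the system headers define F_FREESP; the
Linux branch uses fallocate(2) with FALLOC_FL_PUNCH_HOLE.  Whether a given
filesystem honours either request is up to that filesystem.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc: expose fallocate() and FALLOC_FL_* */
#endif
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: deallocate [offset, offset + length) in the
 * regular file open on fd without changing the file's size.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
punch_hole(int fd, off_t offset, off_t length)
{
	if (offset < 0 || length <= 0) {
		errno = EINVAL;
		return (-1);
	}
#if defined(F_FREESP)
	/* Solaris style: the range travels in a struct flock. */
	struct flock fl;

	memset(&fl, 0, sizeof(fl));
	fl.l_whence = SEEK_SET;
	fl.l_start = offset;
	fl.l_len = length;
	return (fcntl(fd, F_FREESP, &fl));
#elif defined(__linux__) && defined(FALLOC_FL_PUNCH_HOLE)
	/* Linux requires PUNCH_HOLE to be combined with KEEP_SIZE. */
	return (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
	    offset, length));
#else
	/* Neither interface is available at compile time. */
	(void)fd;
	errno = ENOTSUP;
	return (-1);
#endif
}
```

The fcntl(2) form needs only a new command value, while the fallocate(2) form
selects hole-punching through mode flags on a syscall that also does
preallocation and other things.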

What about an asynchronous version?  ioctl(2) is still synchronous.  Do you
see any better way to hole-punch/TRIM asynchronously than with aio?


>
> Best,
> Conrad
>


