Date: Mon, 20 Aug 2018 12:40:56 -0700
From: Kirk McKusick <mckusick@mckusick.com>
To: FreeBSD Current <freebsd-current@FreeBSD.org>, FreeBSD Filesystems <freebsd-fs@FreeBSD.org>
Subject: CFT: TRIM Consolodation on UFS/FFS filesystems
Message-ID: <201808201940.w7KJeu29072094@chez.mckusick.com>

I have recently added TRIM consolidation support for the UFS/FFS filesystem. This feature consolidates large numbers of TRIM commands into a much smaller number of commands covering larger blocks of disk space. It is best described by the commit message:

  Author: mckusick
  Date: Sun Aug 19 16:56:42 2018
  New Revision: 338056
  URL: https://svnweb.freebsd.org/changeset/base/338056

  Log:
  Add consolidation of TRIM / BIO_DELETE commands to the UFS/FFS filesystem.

  When deleting files on filesystems that are stored on flash-memory (solid-state) disk drives, the filesystem notifies the underlying disk of the blocks that it is no longer using. The notification allows the drive to avoid saving these blocks when it needs to flash (zero out) one of its flash pages. These notifications of no-longer-being-used blocks are referred to as TRIM notifications. In FreeBSD these TRIM notifications are sent from the filesystem to the drive using the BIO_DELETE command.

  Until now, the filesystem would send a separate message to the drive for each block of the file that was deleted. Each gigabyte of file size resulted in over 3000 TRIM messages being sent to the drive. This burst of messages can overwhelm the drive's task queue, causing multi-second delays for read and write requests.

  This implementation collects runs of contiguous blocks in the file and then consolidates them into a single BIO_DELETE command to the drive. The BIO_DELETE command describes the run of blocks as a single large block being deleted. Each gigabyte of file size can result in as few as two BIO_DELETE commands and typically results in fewer than ten. Though these larger BIO_DELETE commands take longer to run, they do not clog the drive task queue, so read and write commands can intersperse effectively with them.

  Though this new feature has been thoroughly reviewed and tested, it is being added disabled by default so as to minimize the possibility of disrupting the upcoming 12.0 release.
  It can be enabled by running ``sysctl vfs.ffs.dotrimcons=1''. Users are encouraged to test it. If no problems arise, we will consider requesting that it be enabled by default for 12.0.

  Reviewed by: kib
  Tested by: Peter Holm
  Sponsored by: Netflix

This support is off by default, but I am hoping that I can get enough testing to ensure that it (a) works and (b) is helpful, so that it will be reasonable to have it turned on by default in 12.0. The cutoff for turning it on by default in 12.0 is September 19th, so I am requesting your testing feedback in the near term. Please let me know if you have managed to use it successfully (or not) and also if it made any performance difference (good or bad).

To enable TRIM consolidation, either use `sysctl vfs.ffs.dotrimcons=1' or just set the `dotrimcons' variable in sys/ufs/ffs/ffs_alloc.c to 1.

Setting the above sysctl is all you need to do to test TRIM consolidation. However, if you want to collect statistics on how effectively TRIM consolidation is working, the attached diff will allow you to easily get them. Compile your kernel and the mount command. Note that if you do not do a buildworld, you will need to copy /sys/sys/mount.h to /usr/include/sys/mount.h to get the patched mount command to compile. Then run `mount -v' (or `mount -v | grep /mnt' to get just the statistics for /mnt).

Removing a 30MB file without TRIM consolidation:

/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, reads: sync 7 async 0, fsid d43f795b6a7d34fb, TRIM: total 952 total blocks 7616)

While removing the same file with TRIM consolidation:

/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, reads: sync 7 async 0, fsid d43f795b6a7d34fb, TRIM: total 3 total blocks 7616)

It also tracks pending blocks and pending files. These numbers are only printed out when they are non-zero.
Here is an example running with soft updates right after a file has been rm'ed, but before its blocks have been released:

/dev/md0 on /mnt (ufs, local, soft-updates, writes: sync 2 async 251, reads: sync 5 async 0, fsid 303f795b1be0c459, pending blocks 7616, pending files 1)

Finally, it tracks inflight BIO_DELETEs and the total blocks represented by those inflight BIO_DELETEs. These numbers are also only printed out when they are non-zero. These statistics let you see how much of a backlog of BIO_DELETEs you have backed up at/in the disk drive, and you can track how quickly they drain.

	Kirk McKusick
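
For readers who want a concrete picture of the "collect runs of contiguous blocks" idea from the commit message, here is a minimal user-space sketch in C. It is not the code from r338056 and does not use any kernel interface: the names `trim_run` and `coalesce_runs` are hypothetical, and the sketch assumes the freed block numbers arrive in the order the file's blocks are released, so only physically adjacent neighbours are merged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: one contiguous range that a single delete request could cover. */
struct trim_run {
	uint64_t start;		/* first block number in the run */
	uint64_t count;		/* number of contiguous blocks in the run */
};

/*
 * Walk the freed block numbers in order and merge each block into the
 * current run when it directly follows the run's last block; otherwise
 * start a new run. Returns the number of runs written to out. If out
 * fills up (max_runs reached), the remaining blocks are not processed,
 * so callers should size out for the worst case of nblocks entries.
 */
size_t
coalesce_runs(const uint64_t *blocks, size_t nblocks,
    struct trim_run *out, size_t max_runs)
{
	size_t i, nruns = 0;

	assert(out != NULL);
	for (i = 0; i < nblocks; i++) {
		if (nruns > 0 &&
		    out[nruns - 1].start + out[nruns - 1].count == blocks[i]) {
			out[nruns - 1].count++;	/* extends the current run */
			continue;
		}
		if (nruns == max_runs)
			break;			/* no room for another run */
		out[nruns].start = blocks[i];
		out[nruns].count = 1;
		nruns++;
	}
	return (nruns);
}
```

Called on the block list 100, 101, 102, 103, 200, 201, 500, this produces three runs (100 for 4 blocks, 200 for 2 blocks, 500 for 1 block), so seven per-block notifications would collapse into three range notifications. The same effect is visible in the `mount -v' output above, where the TRIM total drops from 952 to 3 while the total block count stays at 7616.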