Date:      Sat, 30 Apr 2011 00:28:31 -0700
From:      Jeremy Chadwick <freebsd@jdc.parodius.com>
To:        Alexander Motin <mav@FreeBSD.org>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: TRIM clustering
Message-ID:  <20110430072831.GA65598@icarus.home.lan>
In-Reply-To: <4DBBB20A.5050102@FreeBSD.org>
References:  <4DBBB20A.5050102@FreeBSD.org>

On Sat, Apr 30, 2011 at 09:54:02AM +0300, Alexander Motin wrote:
> I've noticed that on file deletion from UFS with TRIM enabled, kernel
> issues BIO_DELETE for each 16K (block size?) separately -- thousands per
> second for single big file deletion. Fortunately ada driver will try to
> aggregate them for the device, but won't some clustering code worth to
> be there?

I'd like to know who decided it would be best to submit the TRIM command
automatically on every single block that is deemed free by UFS during
inode removal.  From what I've been reading, the performance hit from
doing this is quite severe.  Many SSDs take hundreds of milliseconds to
complete TRIM operations, which greatly impacts filesystem performance.
I appreciate the efforts to get TRIM into FreeBSD for UFS, but the
implementation -- if what Alexander says is accurate -- seems like a bad
choice.

Solutions as I see them:

a) Provide appropriate UFS framework to obtain a list of freed blocks (I
do not know much about UFS under the hood so I don't know how to
accomplish this), and let a userspace daemon issue the appropriate
commands to the underlying ATA/CAM layer, providing a list (or, more
importantly, ranges) of LBAs to TRIM.  The daemon could run at some
particular interval (controlled by the user of course), or wait until
some set of required criteria is met before actually doing it.

b) periodic(8) script (relying on appropriate ways of getting freed
blocks) which could run weekly.  Maybe the TRIM-issuing piece could be
implemented in both atacontrol(8) and camcontrol(8)?

c) Don't want it in userspace?  Okay, make it some kind of kernel
thread.  It still needs to be configurable, probably through sysctl.  It
should also provide some form of accounting details: how many LBAs it
has freed, as well as how many times TRIM itself has been run (these are
two separate metrics).

d) Look at how Linux and/or Windows 7 do this.  I believe Linux
doesn't do it automatically at all, but instead provides necessary
frameworks within libata and their SCSI layer to offer the capability.
There was a script circulating within the Linux community called "wiper.sh"
which required use of a very new version of hdparm(8) that would find
freed blocks on ext3 (I think?) and issue hdparm commands to induce
TRIM on sets of LBAs.  ext4 seems to offer some sort of "support" for
this but only when the filesystem is mounted with an option called
"discard" (and specifying that mount option is a manual process).
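
To make the "ranges" part of (a) and the two counters in (c) concrete,
here is a minimal sketch in plain C.  It is not FreeBSD code: struct
lba_range, struct trim_stats and coalesce_ranges() are hypothetical
names made up for illustration, and it assumes the freed blocks have
already been collected as (start LBA, count) pairs.  It only shows
sorting and merging adjacent extents so that one request can cover many
freed blocks; actually issuing TRIM is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not part of any FreeBSD API. */
struct lba_range {
	uint64_t start;		/* first LBA of the freed extent */
	uint64_t count;		/* number of LBAs in the extent */
};

struct trim_stats {
	uint64_t lbas_freed;	/* total LBAs covered by merged ranges */
	uint64_t trims_issued;	/* number of merged ranges (requests) */
};

/* qsort(3) comparator: order extents by starting LBA. */
static int
cmp_range(const void *a, const void *b)
{
	const struct lba_range *x = a, *y = b;

	if (x->start < y->start)
		return (-1);
	if (x->start > y->start)
		return (1);
	return (0);
}

/*
 * Sort the extents in r[0..n-1] and merge, in place, any that touch or
 * overlap.  Returns the number of merged ranges left at the front of
 * r[] and, if st is not NULL, adds them to the two counters.
 */
static size_t
coalesce_ranges(struct lba_range *r, size_t n, struct trim_stats *st)
{
	size_t i, out = 0;

	assert(r != NULL || n == 0);
	if (n == 0)
		return (0);

	qsort(r, n, sizeof(*r), cmp_range);
	for (i = 1; i < n; i++) {
		uint64_t end = r[out].start + r[out].count;

		if (r[i].start <= end) {
			/* Adjacent or overlapping: grow the current range. */
			uint64_t iend = r[i].start + r[i].count;

			if (iend > end)
				r[out].count = iend - r[out].start;
		} else {
			/* Gap: start a new range. */
			r[++out] = r[i];
		}
	}
	out++;

	if (st != NULL) {
		for (i = 0; i < out; i++)
			st->lbas_freed += r[i].count;
		st->trims_issued += out;
	}
	return (out);
}
```

For example, assuming 512-byte sectors, a 16K block is 32 LBAs; four
freed blocks at LBAs 64, 0, 32 and 200 would collapse into two ranges
(0 through 95, and 200 through 231) rather than four separate requests.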

Catches: whatever method is chosen needs to be able to handle the
situation where a device is added on-the-fly (e.g. hot-swap insertion of
a new disk), so probing the devices listed in kern.disks for TRIM
capability via the appropriate ioctls would be ideal.

Other notes: TRIM needs to be supported on swap as well, and in my
opinion this is just as important as it being in UFS.  I'm not sure how
one would implement that.

Sorry for the long-winded email, but when I see/read about things like
what mav@ has brought up, I become immediately concerned (as someone who
has many production systems using Intel X25-M and Intel 320-series SSDs
for /, swap, /var, /tmp, and /usr).

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
