Date: Sat, 30 Apr 2011 00:28:31 -0700
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Alexander Motin <mav@FreeBSD.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: TRIM clustering
Message-ID: <20110430072831.GA65598@icarus.home.lan>
In-Reply-To: <4DBBB20A.5050102@FreeBSD.org>
References: <4DBBB20A.5050102@FreeBSD.org>

On Sat, Apr 30, 2011 at 09:54:02AM +0300, Alexander Motin wrote:
> I've noticed that on file deletion from UFS with TRIM enabled, kernel
> issues BIO_DELETE for each 16K (block size?) separately -- thousands per
> second for single big file deletion. Fortunately ada driver will try to
> aggregate them for the device, but won't some clustering code worth to
> be there?

I'd like to know who decided it would be best to submit the TRIM
command automatically on every single block that is deemed free by UFS
during inode removal. From what I've been reading, the performance hit
from doing this is quite severe. Many SSDs take hundreds of
milliseconds to complete TRIM operations, which greatly impacts
filesystem performance.

I appreciate the efforts to get TRIM into FreeBSD for UFS, but the
implementation -- if what Alexander says is accurate -- seems like a
bad choice.

Solutions as I see them:

a) Provide an appropriate UFS framework to obtain a list of freed
   blocks (I do not know much about UFS under the hood, so I don't know
   how to accomplish this), and let a userspace daemon issue the
   appropriate commands to the underlying ATA/CAM layer, providing a
   list (more importantly, a range) of LBAs to initiate TRIM for. The
   daemon could run at some particular interval (controlled by the user
   of course), or wait until a set of required criteria is met before
   actually doing it.

b) A periodic(8) script (relying on appropriate ways of getting freed
   blocks) which could run weekly. Maybe the TRIM-issuing piece could
   be implemented in both atacontrol(8) and camcontrol(8)?

c) Don't want it in userspace? Okay, make it some kind of kernel
   thread. It still needs to be configurable, probably through sysctl.
   It should also provide some form of accounting details: how many
   LBAs it has freed, as well as how many times TRIM itself has been
   run (these are two separate metrics).

d) Look at how Linux and/or Windows 7 do this.

   I believe Linux doesn't do it automatically at all, but instead
   provides the necessary frameworks within libata and their SCSI layer
   to offer the capability. There was a script circulating within the
   Linux community called "wiper.sh", which required a very new version
   of hdparm(8), that would find freed blocks on ext3 (I think?) and
   issue hdparm commands to induce TRIM on sets of LBAs. ext4 seems to
   offer some sort of "support" for this, but only when the filesystem
   is mounted with an option called "discard" (and specifying that
   mount option is a manual process).

Catches: whichever method is chosen needs to be able to handle the
situation where a device is added on-the-fly (e.g. hot-swap insertion
of a new disk). For identifying TRIM capability, walking the disks
listed in kern.disks and querying each one through the appropriate
ioctls would be ideal.

Other notes: TRIM needs to be supported on swap as well, and in my
opinion this is just as important as it being in UFS. I'm not sure how
one would implement that.

Sorry for the long-winded email, but when I see/read about things like
what mav@ has brought up, I become immediately concerned (as someone
who has many production systems using Intel X25-M and Intel 320-series
SSDs for /, swap, /var, /tmp, and /usr).

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
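
Option (a) above asks for TRIM to be issued against ranges of LBAs
rather than one request per freed block. As a rough illustration of
that clustering step only, here is a minimal Python sketch. It is not
FreeBSD code: the function name, the 512-byte sector size, and the
sample offsets are all assumptions made for the example, and the 16K
block size is taken from the quoted report.

```python
from typing import Iterable, List, Tuple

BLOCK_SIZE = 16 * 1024   # bytes per freed block, per the quoted report
SECTOR_SIZE = 512        # bytes per LBA (assumed for this sketch)


def coalesce_freed_blocks(
    byte_offsets: Iterable[int],
    block_size: int = BLOCK_SIZE,
    sector_size: int = SECTOR_SIZE,
) -> List[Tuple[int, int]]:
    """Merge freed blocks into (start_lba, lba_count) ranges.

    Blocks that overlap or sit end-to-start are folded into one range,
    so a large contiguous file collapses to a single entry.
    """
    sectors_per_block = block_size // sector_size
    ranges: List[Tuple[int, int]] = []
    # Sorting and de-duplicating first lets one linear pass do the merge.
    for offset in sorted(set(byte_offsets)):
        start = offset // sector_size
        if ranges and start <= ranges[-1][0] + ranges[-1][1]:
            # Touches or overlaps the previous range: extend it.
            prev_start, prev_count = ranges[-1]
            new_end = max(prev_start + prev_count, start + sectors_per_block)
            ranges[-1] = (prev_start, new_end - prev_start)
        else:
            # Gap before this block: start a new range.
            ranges.append((start, sectors_per_block))
    return ranges


# Hypothetical input: three adjacent 16K blocks plus one isolated block.
freed = [0, 16384, 32768, 1048576]
ranges = coalesce_freed_blocks(freed)
```

With the sample input, the four freed blocks reduce to two ranges, so
two range entries would be handed to the device instead of four
separate requests. How the freed-block list is obtained from UFS, and
how the ranges are submitted, is left open here just as it is in the
message.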