From owner-freebsd-fs@FreeBSD.ORG Fri Dec 10 10:26:46 2010
Message-ID: <4D020062.3030704@freebsd.org>
Date: Fri, 10 Dec 2010 02:26:42 -0800
From: Julian Elischer
To: Christoph Hellwig
Cc: freebsd-fs@freebsd.org
Subject: Re: TRIM support for UFS?
In-Reply-To: <20101210083613.GA12835@infradead.org>

private response. BTW as you seem to be a Linux person how did you end up here?
(not a problem, just curious)

On 12/10/10 12:36 AM, Christoph Hellwig wrote:
> On Thu, Dec 09, 2010 at 09:19:52PM -0800, Julian Elischer wrote:
>> One of the things that has not been mentioned is that trim is not
>> really 'free'
>> (at least not for us) if you want things to remain trimmed after a reboot.
>> so if I were implementing it I'd want a couple of parameters.
>>
>> 1/ don't bother trimming free space under some size.

I didn't really explain this well, but I meant that this would be a
filesystem parameter. SCSI would allow the filesystem to determine it
automatically, but not all devices are SCSI (or SATA), e.g. the ones
we make at Fusion-IO (www.fusionio.com).

>> 2/ does it matter if the trimmed space comes back as garbage
>> after an unclean shutdown? (a hint to the driver, and no, I don't
>> know anyone that supports this yet)
>> (there are security implications to that one but cheap trim (that
>> may come back) is way cheaper than persistent trim to implement).
> Please take a look at the SCSI and ATA standards for thin provisioning
> and TRIM.

It's been many years since I wrote the SCSI code for MACH and BSD
(long since handed off to others) and I haven't really followed the
SCSI spec for a while, but I will follow up and look at these pages.

> For SCSI there are EVPD pages with a lot of information about the
> required trim granularity and alignment.  For SCSI an UNMAP or WRITE
> SAME with the unmap bit set always guarantees deterministic behaviour
> after a discard of used data, which is optional in ATA, but all SSDs
> I have access to claim to support it (but at least one didn't do it
> properly).

But do they support the option to be nondeterministic about it? :-)
When you trim a block range you have to store that information
somewhere, and that in turn can cause a cascade of problems when you
are trying to cope with unclean shutdown. (I work for fusionIO with
Jens Axboe and Nick Piggins and my job includes writing code to do
exactly that.)
It takes CPU and RAM, and doing so detracts from overall performance,
so if your drive has an option to make trims "non-persistent" in the
case of power failure, you can use that fact to increase performance
in other ways.

> Both SBC and ATA have a bit that requires any discarded
> space to be zeroed, which is very important for RAID or virtualization
> use cases.  I would suggest to let the BSD implementation mirror those
> standards - that's what we've done for Linux.  Another interesting
> angle is that the SCSI UNMAP command and the ATA TRIM command support
> ranges of blocks to discard, which is a feature that Windows 7 uses
> a lot on SSDs, given that the overhead of a single TRIM can be quite bad
> given that it's a non-queueable command.

Yes, that's a side effect of some of the hidden complexities of
'trim', I think.

> If you want to add ioctls for trimming on raw block devices, the free
> space on a filesystem, or punching holes please take a look at the
> existing Linux BLKDISCARD and FITRIM ioctls, as well as the fallocate
> extension to punch holes that's currently under discussion.

I'll leave it to Jens to work on the ioctls for Linux, but we have the
same driver for Windows, Linux, FreeBSD, OS-X, AIX, ESX, and Solaris
with different adaptation layers, so what we decide to export is
pretty flexible.

regards

Julian
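
A rough sketch of the filesystem-side policy discussed in this message:
skip free space under some size (parameter 1), and pack several ranges
into each discard command, since a single TRIM is expensive. It is
written in Python only for brevity. The names plan_trims, batch,
min_trim_blocks and max_ranges_per_cmd, and the (first_block,
block_count) extent layout, are assumptions made up for illustration;
none of this is FreeBSD, Linux, or Fusion-io code.

```python
from typing import List, Tuple

# Hypothetical extent representation: (first_block, block_count).
Extent = Tuple[int, int]


def plan_trims(free_extents: List[Extent], min_trim_blocks: int) -> List[Extent]:
    """Merge touching free extents, then drop any below the size threshold."""
    merged: List[Extent] = []
    for start, count in sorted(free_extents):
        if count <= 0:
            continue  # ignore empty extents
        if merged and start <= merged[-1][0] + merged[-1][1]:
            # Overlaps or touches the previous extent: extend that one.
            prev_start, prev_count = merged[-1]
            end = max(prev_start + prev_count, start + count)
            merged[-1] = (prev_start, end - prev_start)
        else:
            merged.append((start, count))
    # Parameter 1: don't bother trimming free space under some size.
    return [(s, c) for s, c in merged if c >= min_trim_blocks]


def batch(ranges: List[Extent], max_ranges_per_cmd: int) -> List[List[Extent]]:
    """Group ranges so that each discard command carries several of them."""
    return [ranges[i:i + max_ranges_per_cmd]
            for i in range(0, len(ranges), max_ranges_per_cmd)]
```

For example, plan_trims([(0, 8), (8, 8), (100, 4), (200, 64)], 16)
merges the first two extents into (0, 16), keeps (200, 64), and skips
the 4-block extent. Merging before filtering matters: two small
neighbouring extents can together cross the threshold.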