From: Adam McDougall
To: kpneal@pobox.com
Cc: freebsd-fs@freebsd.org
Date: Sat, 17 Nov 2012 17:58:51 -0500
Subject: Re: SSD recommendations for ZFS cache/log
Message-ID: <20121117225851.GJ1462@egr.msu.edu>
In-Reply-To: <20121117181803.GA26421@neutralgood.org>

On Sat, Nov 17, 2012 at 01:18:03PM -0500, kpneal@pobox.com wrote:

  On Fri, Nov 16, 2012 at 08:58:44AM -0500, Adam McDougall wrote:
  > On 11/16/12 00:41, Zaphod Beeblebrox wrote:
  > > On Thu, Nov 15, 2012 at 11:40 PM, wrote:
  > >>> +
  > >>> + The answer very much depends on the expected workload.
  > >>> + Deduplication takes up a significant amount of RAM and CPU
  > >>> + time and may slow down read and write disk access times.
  > >>> + Unless one is storing data that is very heavily
  > >>> + duplicated (such as virtual machine images, or user
  > >>> + backups) it is likely that deduplication will do more harm
  > >>> + than good. Another consideration is the inability to
  > >>
  > >> I advise against advice that is this firm. The statement that it will "do
  > >> more harm than good" really should be omitted. And I'm not sure it is
  > >> fair to say it takes a bunch of CPU. Lots of memory, yes, but lots of
  > >> CPU isn't so clear.
  > >
  > > I experimented by enabling DEDUP on a RAID-Z1 pool containing 4x 2T
  > > green drives. The system had 8G of RAM and was otherwise quiet. I
  > > copied a dataset of about 1T of random stuff onto the array and then
  > > copied the same set of data onto the array a second time. The end
  > > result is a dedup ratio of almost 2.0 and only around 1T of disk
  > > used.
  > >
  > > As I recall (and it's been 6-ish months since I did this), the 2nd
  > > write became largely CPU bound with little disk activity. As far as I
  > > could tell, the dedup table never thrashed on the disk ... and that
  > > most of the disk activity seemed to be creating the directory tree or
  > > reading the disk to do the verify step of dedup.

  Well, yes, it was CPU bound because it wasn't disk bound. All filesystem
  activity is going to be either disk bound, CPU bound, or waiting for more
  filesystem requests (e.g., network bound or similar).

  Also note that the original text above said that dedup only made sense
  with heavily duplicated data. That's exactly the case you tested. So your
  test says nothing about the case where there isn't much duplicated data.
  The phrase I advised against was referring to the case you didn't test.

  > Now try deleting some data and the fun begins :)

  You've had a bad experience? I'd love to hear about it.
  --
  Kevin P. Neal                         http://www.pobox.com/~kpn/

  "Oh, I've heard that paradox a couple of times, but there's something
  about a cat dying and I hate to think of such things."
    - Dr. Donald Knuth speaking of Schrodinger's cat, December 8, 1999, MIT

Deleting data takes significantly longer than usual because ZFS has to
un-dedupe the data, which takes longer than most people expect and ties
up the removal process until it is done. During that time, the CPU is
pegged pretty hard and the disks are active but not doing much. I haven't
had the opportunity to try this on a system with a lot of memory or one
with a snappy L2ARC to see if it is better.

This can spiral in at least two ways. For one, the average system admin
will not expect it to take so long to delete files and will think
something is wrong. If this happens in small amounts, they may decide to
disable dedupe once they realize that it is the cause. But since the data
is already deduped, they are stuck with that behavior until the data is
copied fresh or deleted. Doing THAT can take an enormous amount of time,
progressing at a slow pace, and has a chance of leading to a deadlock
(not making this up). If a deadlock occurs while they are trying to solve
this issue, tempers flare even further, especially since the next reboot
will continue thrashing the disks where it left off, perhaps before the
admin has a chance to log in and figure out what is happening, which
isn't obvious. Worse yet, if a lot of data has been deleted, another
deadlock may occur. Rinse, repeat, swear at ZFS, perhaps vow that dedupe
is "not ready" and a quiet threat.

There have been several people on the FreeBSD mailing lists who have had
these symptoms. Some of them added RAM to get past it.
Some found a way to measure progress and kept letting it
churn/deadlock/reboot until things came back to normal. I think in
-CURRENT there is a new ZFS feature allowing for background deletion that
may ease this issue, and someone reported success.
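
To make the cost of deleting deduped data a little more concrete, here is
a minimal sketch of a dedup table as a checksum-to-refcount map. It is a
toy model and not ZFS code: the ToyDedupStore class, the 4-byte block
size, and the in-memory dict are all assumptions made up for
illustration. It mirrors the experiment quoted above (the same data
written twice) and counts how many table updates one delete needs.

```python
import hashlib


class ToyDedupStore:
    """Toy block-level dedup: checksum -> reference count.

    Hypothetical illustration only; not how ZFS is implemented.
    """

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.refcounts = {}  # checksum -> number of logical references
        self.files = {}      # file name -> list of block checksums

    def write(self, name, data):
        """Split data into blocks and bump the refcount of each checksum."""
        checksums = []
        for i in range(0, len(data), self.block_size):
            digest = hashlib.sha256(data[i:i + self.block_size]).hexdigest()
            self.refcounts[digest] = self.refcounts.get(digest, 0) + 1
            checksums.append(digest)
        self.files[name] = checksums

    def delete(self, name):
        """Remove a file and return how many table updates that took."""
        updates = 0
        for digest in self.files.pop(name):
            # Every block of the file costs one table lookup and update.
            self.refcounts[digest] -= 1
            updates += 1
            if self.refcounts[digest] == 0:
                del self.refcounts[digest]  # last reference: block is free
        return updates

    def dedup_ratio(self):
        """Logical blocks referenced divided by unique blocks stored."""
        if not self.refcounts:
            return 1.0
        return sum(self.refcounts.values()) / len(self.refcounts)


store = ToyDedupStore(block_size=4)
data = bytes(range(64))        # 16 distinct 4-byte blocks
store.write("copy1", data)
store.write("copy2", data)     # same data a second time
ratio_after_two_copies = store.dedup_ratio()
updates_for_delete = store.delete("copy2")
ratio_after_delete = store.dedup_ratio()
print(ratio_after_two_copies, updates_for_delete, ratio_after_delete)
```

In this toy the table is a small dict that always fits in memory, so the
per-block update is cheap. The point is only that a delete touches the
table once per block; if a real table does not fit in RAM, each of those
updates may turn into disk I/O, which would be consistent with the
reports that adding RAM helped.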