From: Zaphod Beeblebrox
To: kpneal@pobox.com
Cc: FreeBSD FS, Bryan Drewery, Eitan Adler, Stephen McKay
Date: Fri, 16 Nov 2012 00:41:33 -0500
Subject: Re: SSD recommendations for ZFS cache/log
In-Reply-To: <20121116044055.GA47859@neutralgood.org>

On Thu, Nov 15, 2012 at 11:40 PM, wrote:
>> +
>> + The answer very much depends on the expected workload.
>> + Deduplication takes up a signifigent amount of RAM and CPU
>> + time and may slow down read and write disk access times.
>> + Unless one is storing data that is very heavily
>> + duplicated (such as virtual machine images, or user
>> + backups) it is likely that deduplication will do more harm
>> + than good. Another consideration is the inability to
>
> I advise against advice that is this firm. The statement that it will "do
> more harm than good" really should be omitted. And I'm not sure it is
> fair to say it takes a bunch of CPU. Lots of memory, yes, but lots of
> CPU isn't so clear.

I experimented by enabling dedup on a RAID-Z1 pool containing 4x 2 TB
green drives. The system had 8 GB of RAM and was otherwise quiet. I
copied a dataset of about 1 TB of random stuff onto the array and then
copied the same set of data onto the array a second time. The end
result was a dedup ratio of almost 2.0 and only around 1 TB of disk
used.

As I recall (and it's been 6-ish months since I did this), the second
write became largely CPU bound with little disk activity. As far as I
could tell, the dedup table never thrashed on the disk, and most of the
disk activity seemed to be creating the directory tree or reading the
disk to do the verify step of dedup.

The CPU is modest, a 2.6 GHz Core 2 Duo, and I don't recall if it
busied both cores or just one.
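
To illustrate why writing the same data twice lands at a ratio near 2.0, and why the second pass can end up spending its time on the CPU rather than the disk, here is a toy sketch in Python. It is not how ZFS implements dedup. The class name ToyDedupStore, the 128 KiB block size, the in-memory table, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import os

BLOCK_SIZE = 128 * 1024  # assumed block size for this toy model


class ToyDedupStore:
    """Hypothetical in-memory block store that keeps one copy per unique block."""

    def __init__(self):
        self.table = {}         # checksum -> [stored block, reference count]
        self.logical_bytes = 0  # bytes the writer asked to store

    def write(self, data):
        """Split data into fixed-size blocks and store only the unseen ones."""
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            # Every block is checksummed, duplicate or not: this is CPU work.
            key = hashlib.sha256(block).digest()
            entry = self.table.get(key)
            if entry is None:
                # New block: this is the only case that adds stored data.
                self.table[key] = [block, 1]
            else:
                # Toy "verify" step: re-read the stored block and compare bytes.
                if entry[0] != block:
                    raise RuntimeError("checksum collision")
                entry[1] += 1
            self.logical_bytes += len(block)

    def physical_bytes(self):
        """Bytes actually held, counting each unique block once."""
        return sum(len(block) for block, _ in self.table.values())

    def dedup_ratio(self):
        """Logical bytes written divided by bytes actually stored."""
        return self.logical_bytes / self.physical_bytes()


store = ToyDedupStore()
dataset = os.urandom(8 * BLOCK_SIZE)  # stand-in for the "random stuff"
store.write(dataset)                  # first copy: every block is new
store.write(dataset)                  # second copy: every block is a duplicate
print(f"dedup ratio: {store.dedup_ratio():.2f}")
```

In this model the second write stores nothing new. It only checksums each block, looks it up, and compares it against the stored copy, which matches the observation of a CPU-bound second pass with disk reads for verification. The toy ratio comes out at exactly 2.0 because it ignores metadata and the dedup table itself, whereas the real pool reported "almost 2.0".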