Date: Fri, 16 Nov 2012 00:41:33 -0500
From: Zaphod Beeblebrox <zbeeble@gmail.com>
To: kpneal@pobox.com
Cc: FreeBSD FS <freebsd-fs@freebsd.org>, Bryan Drewery <bryan@shatow.net>, Eitan Adler <eadler@freebsd.org>, Stephen McKay <smckay@internode.on.net>
Subject: Re: SSD recommendations for ZFS cache/log
Message-ID: <CACpH0MfQWokFZkh58qm%2B2_tLeSby9BWEuGjkH15Nu3%2BS1%2Bp3SQ@mail.gmail.com>
In-Reply-To: <20121116044055.GA47859@neutralgood.org>
References: <CAFHbX1K-NPuAy5tW0N8=sJD=CU0Q1Pm3ZDkVkE%2BdjpCsD1U8_Q@mail.gmail.com> <57ac1f$gf3rkl@ipmail05.adl6.internode.on.net> <50A31D48.3000700@shatow.net> <CAF6rxgkh6C0LKXOZa264yZcA3AvQdw7zVAzWKpytfh0%2BKnLOJg@mail.gmail.com> <20121116044055.GA47859@neutralgood.org>
On Thu, Nov 15, 2012 at 11:40 PM, <kpneal@pobox.com> wrote:
>> +      <answer>
>> +        <para>The answer very much depends on the expected workload.
>> +          Deduplication takes up a signifigent amount of RAM and CPU
>> +          time and may slow down read and write disk access times.
>> +          Unless one is storing data that is very heavily
>> +          duplicated (such as virtual machine images, or user
>> +          backups) it is likely that deduplication will do more harm
>> +          than good.  Another consideration is the inability to
>
> I advise against advice that is this firm. The statement that it will "do
> more harm than good" really should be omitted. And I'm not sure it is
> fair to say it takes a bunch of CPU. Lots of memory, yes, but lots of
> CPU isn't so clear.

I experimented by enabling dedup on a RAID-Z1 pool containing 4x 2T green
drives. The system had 8G of RAM and was otherwise quiet. I copied a
dataset of about 1T of random stuff onto the array and then copied the same
set of data onto the array a second time. The end result was a dedup ratio
of almost 2.0 and only around 1T of disk used.

As I recall (and it's been 6-ish months since I did this), the second write
became largely CPU bound with little disk activity. As far as I could tell,
the dedup table never thrashed on the disk, and most of the disk activity
seemed to be creating the directory tree or reading the disk to do the
verify step of dedup.

The CPU is modest: a 2.6 GHz Core 2 Duo. I don't recall whether it busied
both cores or just one.
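
For a rough sense of the numbers in this experiment, the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate, not a measurement from the pool described above. The 128 KiB record size and the figure of about 320 bytes of RAM per dedup table entry are assumptions (commonly quoted rules of thumb, not values reported in this message), and the function names are hypothetical.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3


def dedup_ratio(logical_bytes, physical_bytes):
    """Ratio of data referenced by files to data actually allocated on disk."""
    return logical_bytes / physical_bytes


def ddt_ram_estimate(unique_bytes, record_size=128 * 1024, bytes_per_entry=320):
    """Rough in-core dedup table size: one entry per unique block.

    record_size and bytes_per_entry are ASSUMED rule-of-thumb values,
    not figures taken from the experiment in this message.
    """
    unique_blocks = unique_bytes // record_size
    return unique_blocks * bytes_per_entry


# About 1T of data written twice: 2T logical, 1T physical.
ratio = dedup_ratio(2 * TIB, 1 * TIB)

# One dedup table entry per unique 128 KiB block in 1T of unique data.
ddt_bytes = ddt_ram_estimate(1 * TIB)

print(f"dedup ratio: {ratio:.1f}")
print(f"estimated dedup table size: {ddt_bytes / GIB:.1f} GiB")
```

Under those assumptions the table would come to a few GiB, which would be consistent with it not thrashing on an 8G machine. Smaller average block sizes would push the estimate up proportionally.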