From: Alan Amesbury
To: hackers@freebsd.org
Subject: Re: The minimum amount of memory needed to use ZFS.
Date: Wed, 23 Dec 2015 15:56:53 -0600
Message-Id: <411CC5EB-012B-43FC-B7E0-5D09D3CA3E55@oitsec.umn.edu>
In-Reply-To: <26557C02-C591-4232-BBD0-988B0EB89575@gid.co.uk>
References: <20151223121445.GA85016@ozzmosis.com> <26557C02-C591-4232-BBD0-988B0EB89575@gid.co.uk>
List-Id: Technical Discussions relating to FreeBSD

On Dec 23, 2015, at 11:53, Bob Bishop wrote:

[snip]

> Deduplication seems like a very bad idea unless you have both a lot of
> duplicated data and a serious shortage of disk. It needs a lot of RAM,
> increasing over time. Depending on the hardware and the use case,
> compression (which effectively only costs CPU) might be a better option.

Agreed: Deduplication isn't something you want to enable until you're
sure you have a workload that's suitable for it. With deduplication
enabled, memory usage on FreeBSD increases by an estimated 2-5GB per
terabyte of zpool[1]. Oracle has published[2] some information on
deduplication in ZFS, too, which parallels the information in the
FreeBSD wiki, namely the use of 'zdb' to analyze your data to determine
whether deduplication is even worthwhile. Note that this can take a
while to run and, in my experience, it had issues running on at least
one of my hosts. The output is pretty straightforward.
For example:

# zdb -S pool
Simulated DDT histogram:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    4.94M    578G    577G    579G    4.94M    578G    577G    579G
     2     416K   50.5G   50.5G   50.5G     922K    112G    112G    112G
     4    39.6K   4.89G   4.89G   4.89G     175K   21.6G   21.6G   21.6G
     8    3.06K    382M    381M    382M    31.6K   3.85G   3.84G   3.85G
    16      306   34.4M   33.3M   33.4M    5.81K    665M    639M    641M
    32       62   6.13M   4.99M   5.04M    2.77K    281M    230M    232M
    64       41   4.88M   4.88M   4.88M    3.56K    432M    432M    433M
   128       25   3.12M   3.12M   3.12M    4.37K    560M    560M    560M
   256       71   8.88M   8.88M   8.88M    20.4K   2.56G   2.56G   2.56G
   512        2    256K    256K    256K    1.27K    163M    163M    163M
    2K        2    256K    256K    256K    4.19K    536M    536M    536M
  128K        1    128K    128K    128K     148K   18.4G   18.4G   18.4G
 Total    5.39M    634G    633G    634G    6.23M    739G    739G    740G

dedup = 1.17, compress = 1.00, copies = 1.00, dedup * compress / copies = 1.17

For this host there's some evidence that deduplication might buy me a
small amount of additional space, but I'd rather allocate RAM to the ARC
for performance instead of using it for what looks like a small
reduction in space usage. For my workloads, I tend to get a much bigger
boost from using compression, as modern CPUs can typically compress
pretty close to the speed of rotational media. (SSDs would be a
different story.)
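As a rough illustration of how the "Total" row of the first histogram above relates to the summary line, and of what the dedup table might cost in RAM, here is a minimal Python sketch. The helper names (parse_size, dedup_ratio, ddt_ram_bytes) are hypothetical, the ratio is taken as referenced DSIZE divided by allocated DSIZE, and the 320 bytes per in-core table entry is a commonly quoted rule of thumb treated here as an assumption, not something zdb reports. The suffixes are assumed to be powers of 1024.

```python
# Sketch only: helper names and the 320-byte-per-entry figure are
# assumptions for illustration; they are not part of zdb's output.

UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_size(text):
    """Convert a zdb-style figure such as '5.39M' or '306' to a number."""
    text = text.strip()
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def dedup_ratio(referenced_dsize, allocated_dsize):
    """Space referenced by the data divided by space actually allocated."""
    return parse_size(referenced_dsize) / parse_size(allocated_dsize)


def ddt_ram_bytes(allocated_blocks, bytes_per_entry=320):
    """Rough in-core dedup table size: one entry per allocated block."""
    return parse_size(allocated_blocks) * bytes_per_entry


# Figures copied from the "Total" row of the first histogram.
ratio = dedup_ratio("740G", "634G")
ram_gib = ddt_ram_bytes("5.39M") / 2**30

print(f"dedup ratio ~ {ratio:.2f}")
print(f"estimated DDT size ~ {ram_gib:.2f} GiB")
```

With those totals the sketch gives a ratio of about 1.17, matching the summary line, and a table of roughly 1.7 GiB under the assumed entry size. That is the kind of trade-off being weighed here: a couple of gigabytes that could otherwise go to the ARC, spent to save about 15% of the space.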
Example 'zdb -S' output from a host using compression:

# zdb -S pool
Simulated DDT histogram:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    8.25M   1008G   80.6G   80.6G    8.25M   1008G   80.6G   80.6G
     2      697   76.4M   21.3M   21.3M    1.46K    160M   44.8M   44.8M
     4    1.05K   10.2M   3.34M   3.34M    5.15K   48.6M   15.9M   15.9M
     8       65   1.09M    318K    318K      649   10.8M   3.06M   3.06M
    16       23    904K    300K    300K      558   20.1M   6.55M   6.55M
    32       18   1.78M    681K    681K      770   74.2M   27.7M   27.7M
    64       29   3.27M   1.23M   1.23M    2.61K    305M    115M    115M
   128       15   1.41M    536K    536K    2.38K    209M   77.3M   77.3M
 Total    8.25M   1008G   80.6G   80.6G    8.26M   1009G   80.9G   80.9G

dedup = 1.00, compress = 12.47, copies = 1.00, dedup * compress / copies = 12.51

The data, primarily textual log files of some kind, compresses pretty
well.

-- 
Alan Amesbury
University Information Security
http://umn.edu/lookup/amesbury

[1] - https://wiki.freebsd.org/ZFSTuningGuide#Deduplication
[2] - http://www.oracle.com/technetwork/articles/servers-storage-admin/o11-113-size-zfs-dedup-1354231.html