From: Peter Maloney <peter.maloney@brockmann-consult.de>
Date: Thu, 01 Dec 2011 14:03:05 +0100
To: freebsd-fs@freebsd.org
Subject: Re: ZFS dedup and replication

On 12/01/2011 11:20 AM, krad wrote:
> On 28 November 2011 23:01, Techie wrote:
>
>> Hi all,
>>
>> Are there any plans to implement sharing of the ZFS DDT (dedup table), or
>> to make ZFS aware of the duplicate blocks already present on a remote
>> destination system?
>>
>> From how I understand it, the zfs send/recv stream does not know about
>> the duplicated blocks on the receiving side when using zfs send -D -i
>> to send only incremental changes.
>>
>> So take for example: I have an application that I back up each night to
>> a ZFS file system. I want to replicate this every night to my remote
>> site. Each night that I back up, I create a tar file on the ZFS data
>> file system. When I go to send an incremental stream, it sends the
>> entire tar file to the destination even though over 90% of those
>> blocks already exist at the destination. Are there any plans to make
>> ZFS aware of what already exists at the destination site, to eliminate
>> the need to send duplicate blocks over the wire? zfs send -D, I believe,
>> only eliminates the duplicate blocks within the stream.
>>
>> Perhaps I am wrong.
>>
>> Thanks
>> Jimmy
>>
> Why tar up the stuff? Just do a zfs snap and then you bypass the whole
> issue?

I was thinking the same thing when I read his message. I don't understand it either.
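Just for illustration, skipping the tar and snapshotting the dataset directly would look something like this (the dataset, snapshot, and host names here are made up, not from Jimmy's setup):

    # snapshot the dataset the application writes into, instead of making a tar file
    zfs snapshot tank/appdata@2011-12-01
    # then only the blocks changed since the previous night's snapshot go over the wire
    zfs send -i tank/appdata@2011-11-30 tank/appdata@2011-12-01 | ssh backuphost zfs recv -F backuppool/appdata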
On my system, with 12 TiB used, what I do in a script is basically:

- generate a snapshot name
- make a recursive snapshot
- ssh to the remote server and compare snapshots (find the latest common
  snapshot, to use as the incremental reference point)
- if a usable reference point exists, start the incremental send like this
  (which wipes all changes on the remote system without confirmation):

    zfs send -R -I ${destLastSnap} ${srcLastSnap} | ssh ${destHost} zfs recv -d -F -v ${destPool}

- and if no usable reference point exists, do a full, non-incremental send:

    zfs send -R ${srcLastSnap} | ssh ${destHost} zfs recv -F -v ${destDataSet}

Finding the reference snapshot is the most complicated part of my script, and it is missing from everything else I found online when I was looking for a good solution (a rough sketch of that logic is at the end of this mail). For example, this script:

http://blogs.sun.com/clive/resource/zfs_repl.ksh

found on this page:

http://blogs.oracle.com/clive/entry/replication_using_zfs

turned out to be quite poor, and would fail completely when there was a new dataset or a snapshot was missing for some reason. So I suggest you look at that one, but write your own.

The only time my script failed was because of a zfs bug, the same one seen here:

http://serverfault.com/questions/66414/cannot-destroy-zfs-snapshot-dataset-already-exists

so I just deleted the clone manually and it worked again.

I thought gzip could save a small amount of time, e.g. I compared the speed of "zfs send ... | ssh zfs recv ..." to "zfs send ... | gzip -c | ssh 'gunzip -c | zfs recv ...'" and found little or no difference. But I have no idea why you would use tar.

And just to confirm, I have the same problems with dedup causing severe bottlenecks in many things, especially zfs recv and scrub, even though I have 48 GB of memory installed and 44 GB available to ZFS. But I find incremental sends very efficient, taking much less than a minute (depending on how much data changed) when they run every hour. And unless your bandwidth is slow and precious, I recommend sending more often than daily, because it is very fast if done often enough. I only send hourly because I haven't had time to write scripts to clean up the old snapshots; otherwise I would do it every 15 minutes, or maybe every 15 seconds ;)

-- 
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------
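PS: For anyone interested, here is a rough, untested sketch of the compare-and-send logic described above, as a plain /bin/sh script. The pool and host names are made up, it is not my actual script, and it is simplified: it only checks snapshots of the top-level dataset and has almost no error handling.

    #!/bin/sh
    # Example values only; adjust for your own pools and hosts.
    srcPool="tank"
    destPool="backup"
    destHost="backuphost"

    # 1. Generate a snapshot name and take a recursive snapshot.
    snapName="repl-$(date +%Y%m%d-%H%M%S)"
    zfs snapshot -r "${srcPool}@${snapName}" || exit 1

    # 2. List snapshot names on both sides (oldest first) and find the newest
    #    name that exists on both, to use as the incremental reference point.
    localSnaps=$(zfs list -H -t snapshot -o name -s creation | grep "^${srcPool}@" | sed 's/.*@//')
    remoteSnaps=$(ssh "${destHost}" zfs list -H -t snapshot -o name -s creation | sed 's/.*@//')
    refSnap=""
    for s in ${localSnaps}; do
        if printf '%s\n' "${remoteSnaps}" | grep -qx "${s}"; then
            refSnap="${s}"   # keep overwriting; the last match is the newest common snapshot
        fi
    done

    # 3. Incremental send if a common snapshot was found, full send otherwise.
    #    Note: recv -F rolls back/overwrites any changes on the remote side.
    if [ -n "${refSnap}" ]; then
        zfs send -R -I "${srcPool}@${refSnap}" "${srcPool}@${snapName}" | \
            ssh "${destHost}" zfs recv -d -F -v "${destPool}"
    else
        zfs send -R "${srcPool}@${snapName}" | \
            ssh "${destHost}" zfs recv -d -F -v "${destPool}"
    fi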