From: krad
To: Peter Maloney
Cc: freebsd-fs@freebsd.org
Date: Thu, 1 Dec 2011 16:42:54 +0000
Subject: Re: ZFS dedup and replication

On 1 December 2011 13:03, Peter Maloney wrote:

> On 12/01/2011 11:20 AM, krad wrote:
> > On 28 November 2011 23:01, Techie wrote:
> >
> >> Hi all,
> >>
> >> Is there any plans to implement sharing of the ZFS DDT Dedup table or
> >> to make ZFS aware of the destination duplicate blocks on a remote
> >> system?
> >>
> >> From how I understand it, the zfs send/recv stream does not know about
> >> the duplicated blocks on the receiving side when using zfs send -D -i
> >> to send only incremental changes.
> >>
> >> So take for example I have an application that I backup each night to
> >> a ZFS file system. I want to replicate this every night to my remote
> >> site. Each night that I back up I create a tar file on the ZFS data
> >> file system. When I go to send an incremental stream it sends the
> >> entire tar file to the destination even though over 90% of those
> >> blocks already exist at the destination.. Is there any plans to make
> >> ZFS aware of what exists already at the destination site to eliminate
> >> the need to send duplicate blocks over the wire? zfs send -D I believe
> >> only eliminates the duplicate blocks within the stream.
> >>
> >> Perhaps I am wrong..
> >>
> >>
> >> Thanks
> >> Jimmy
> >> _______________________________________________
> >> freebsd-fs@freebsd.org mailing list
> >> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> >>
> >
> > Why tar up the stuff? Just do a zfs snap and then you bypass the whole
> > issue?
> I was thinking the same thing when I read his message. I don't
> understand it either.
>
> On my system with 12 TiB used up, what I do in a script is basically:
>
> -generate a snap name
> -make a recursive snapshot
> -ssh to the remote server and compare snapshots (find the latest common
> snapshot, to find an incremental reference point)
> -if a usable reference point exists, start the incremental send like
> this (which wipes all changes on the remote system without confirmation):
>     zfs send -R -I ${destLastSnap} ${srcLastSnap} | ssh ${destHost} zfs recv -d -F -v ${destPool}
> -and if no usable reference point existed, then do a full send,
> non-incremental:
>     zfs send -R ${srcLastSnap} | ssh ${destHost} zfs recv -F -v ${destDataSet}
>
>
> The part about finding the reference snapshot is the most complicated
> part of my script, and missing from anything else I found online when I
> was looking for a good solution. For example this script:
> http://blogs.sun.com/clive/resource/zfs_repl.ksh
> found on this page:
> http://blogs.oracle.com/clive/entry/replication_using_zfs
> was found to be quite terrible, and would fail completely when there was
> a new dataset, or a snapshot missing for some reason. So I suggest you
> look at that one, but write your own.
>
> The only time my script failed is when there was a zfs bug; the same one
> seen here:
> http://serverfault.com/questions/66414/cannot-destroy-zfs-snapshot-dataset-already-exists
> so I just deleted the clone manually and it worked again.
>
> I thought gzip could save a small amount of time, eg. I compared speed
> of "zfs send .... | ssh zfs recv ..."
> to "zfs send ... | gzip -c | ssh 'gunzip -c | zfs recv...'"
> and found not much or no difference.
> But I have no idea why you would use tar.
>
> And just to confirm, I have the same problems with dedup causing severe
> bottlenecks on many things, especially zfs recv and scrub, even though I
> have 48 GB of memory installed and 44 available to ZFS.
>
> But I find incremental sends to be very efficient, taking much less than
> a minute (depending on how much data was changed) when it runs every
> hour. And unless your bandwidth is slow and precious, I recommend
> sending more than daily, because it is very fast if done often enough. I
> send hourly because I didn't have time to work on some scripts to clean
> up the old snapshots. Otherwise I would do it every 15 min or maybe 15
> seconds ;)
>
> --
>
> --------------------------------------------
> Peter Maloney
> Brockmann Consult
> Max-Planck-Str. 2
> 21502 Geesthacht
> Germany
> Tel: +49 4152 889 300
> Fax: +49 4152 889 333
> E-mail: peter.maloney@brockmann-consult.de
> Internet: http://www.brockmann-consult.de
> --------------------------------------------
>

sounds like we have been through very similar experiences
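
For anyone writing their own version of the script Peter describes: the step he calls the most complicated (finding the latest snapshot that exists on both sides) could look roughly like the sketch below. This is not his script. The function name latest_common_snapshot and its two file arguments are made up for illustration, and it assumes the source list is sorted oldest first and that the recursive snapshots carry the same name after the "@" on both systems.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the original script.
# Prints the newest snapshot name (the part after '@') present in both lists.
#   $1: file of source snapshot names, one per line, OLDEST FIRST (assumed)
#   $2: file of destination snapshot names, one per line, any order
# Returns non-zero and prints nothing when there is no common snapshot.
latest_common_snapshot() {
    awk '
        # Compare only the part after "@", since pool names may differ.
        { snap = $0; sub(/^[^@]*@/, "", snap) }
        # First file read is the destination list: remember every name.
        NR == FNR { on_dest[snap] = 1; next }
        # Second file is the source list: the last match is the newest.
        snap in on_dest { latest = snap }
        END { if (latest == "") exit 1; print latest }
    ' "$2" "$1"
}
```

The two input files could be produced with something like "zfs list -H -t snapshot -o name -s creation -r <pool>" locally and the same command over ssh on the remote side. If the function succeeds, its output is the reference point for the incremental "zfs send -R -I"; if it fails, fall back to the full send.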