Date: Fri, 25 Nov 2011 10:58:43 +1100
From: Andrew Reilly
To: Johannes Totz
Cc: freebsd-fs@freebsd.org
Subject: Re: backing up zfs dataset
Message-ID: <20111124235843.GB96603@johnny.reilly.home>

On Fri, Nov 11, 2011 at 04:36:29PM +0000, Johannes Totz wrote:
> To back up a zfs dataset there are a few possibilities:
> 1) rsync file data to another machine
> 2) zfs-send to another machine, into a zfs dataset
> 3) zfs-send to another machine, dumping the stream to a file

Backing up to another machine clearly has advantages, but it's not
the only way.
I use ZFS send/receive to back up to a removable drive on the same
machine, and it seems to be working quite nicely.

> The first one works alright but you lose admin info, properties set on
> the dataset, etc
> The second is preferred but requires another machine which runs zfs.

Method 2 does require the receiver to be running zfs, but it seems to
be the only good way. Well, if you want to remote-backup to a machine
not running ZFS then rsync from a zfs snapshot should get most of the
way there. Modern rsync is pretty good about metadata and sparse
files, I think. (Having said that, I haven't used that method for a
long time.)

> The third is bad.

Tell me about it! I was saving zfs send streams to a UFS disk as a
backup strategy for ages. When I needed it, I discovered that there
is no recovery or redundancy in zfs receive: it just stopped, unable
to read the filesystem. Toast.

> So far I have been doing (3), for daily short-term backups, works,
> tested, everything is peachy. However, I don't like it anymore for
> obvious reasons. Ideally, I would like to go with (2). But I don't have
> another zfs-capable machine, or the machine that I would like to back up
> onto will not ever run zfs.

FWIW, my backup script is now based on:

    zfs snapshot "tank/${1}@0" || error "problem making new snapshot on $1"
    # Note: || only sees the exit status of zfs recv, the last command in the pipe.
    zfs send -i "tank/${1}@1" "tank/${1}@0" | zfs recv -vF "bkp2pool/$1" ||
        error "problem sending incremental fs packet on $1"

with appropriate snapshot rotation and checking scripts before and
after. Seems to be working OK, and I really like the confidence that
being able to wander around inside the readable backups provides.

I do think that backup is something of a weakness for ZFS at the
moment. Sure, live filesystems and snapshots are clearly cool, and
the modern way and all, but there is an awful lot of flexibility and
ease of understanding in the model of a "backup file on a tape."
Doesn't have to be on a tape, but the moral equivalent to
dump/restore would (in my book) be a wonderful addition to ZFS, if
anyone felt inclined. Just padding the send/receive serialisation
format with enough checksum and restart information to allow
detection and graceful recovery from read errors in the backup medium
would do the job.

Cheers,

-- 
Andrew
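
A minimal sketch of the detection half of that idea, for anyone who
still has to keep a send stream as a file on non-ZFS storage. It is an
illustration under stated assumptions, not a ZFS feature: chunk_stream
and verify_chunks are hypothetical helper names, and the MANIFEST
layout is invented here. The stream is cut into fixed-size chunks with
split, and a cksum manifest records what each chunk should look like,
so damage can be found, and pinned to one chunk, before anything is
piped into zfs recv. It repairs nothing; actual recovery would still
need a second copy or parity data.

```shell
#!/bin/sh
# Sketch only: chunk_stream and verify_chunks are hypothetical helpers,
# not part of ZFS.

# Read a stream on stdin and store it in directory $1 as fixed-size
# chunks plus a MANIFEST of POSIX cksum lines ("<crc> <bytes> <name>").
# $2 is the chunk size in bytes (default 1 MiB).
chunk_stream() {
    dir=$1
    size=${2:-1048576}
    mkdir -p "$dir" || return 1
    # -a 6 gives six-letter suffixes so a large stream does not run out of names.
    split -a 6 -b "$size" - "$dir/chunk." || return 1
    (
        cd "$dir" || exit 1
        for f in chunk.*; do
            cksum "$f" || exit 1
        done > MANIFEST
    )
}

# Re-checksum every chunk listed in $1/MANIFEST. Reports each damaged or
# missing chunk on stderr and returns non-zero if any was found.
verify_chunks() {
    dir=$1
    bad=0
    while read -r crc bytes name; do
        actual=$( (cd "$dir" && cksum "$name") 2>/dev/null )
        if [ "$actual" != "$crc $bytes $name" ]; then
            echo "damaged or missing chunk: $name" >&2
            bad=1
        fi
    done < "$dir/MANIFEST"
    return "$bad"
}
```

Usage might look like this, with made-up dataset and mount-point names:

    zfs send tank/home@0 | chunk_stream /mnt/usb/home@0
    verify_chunks /mnt/usb/home@0 &&
        cat /mnt/usb/home@0/chunk.* | zfs recv -vF bkp2pool/home

The shell expands chunk.* in sorted order, which matches the order in
which split wrote the chunks.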