From owner-freebsd-stable@FreeBSD.ORG Tue Jul 15 14:54:26 2008
Date: Tue, 15 Jul 2008 07:54:26 -0700
From: Jeremy Chadwick <jdc@parodius.com>
To: Sven Willenberger
Cc: freebsd-stable@freebsd.org
Subject: Re: Multi-machine mirroring choices
Message-ID: <20080715145426.GA31340@eos.sc1.parodius.com>
In-Reply-To: <1216130834.27608.27.camel@lanshark.dmv.com>
User-Agent: Mutt/1.5.18 (2008-05-17)

On Tue, Jul 15, 2008 at 10:07:14AM -0400, Sven Willenberger wrote:
> 3) The send/recv feature of zfs was something I had not even considered
> until very recently. My understanding is that this would work by a)
> taking a snapshot of master_data1, b) zfs sending that snapshot to
> slave_data1, c) via ssh on a pipe, receiving that snapshot on
> slave_data1, and then d) doing incremental snapshots, sending, and
> receiving as in (a)(b)(c). How time/cpu intensive is the snapshot
> generation, and just how granular could this be done? I would imagine
> this could be practical for systems with little traffic/changes, but
> what about systems that may see a lot of files added, modified, or
> deleted on the filesystem(s)?

I can speak a bit about ZFS snapshots, because I've used them in the
past with good results. Compared to UFS2 snapshots (e.g. dump -L or
mksnap_ffs), ZFS snapshots are fantastic. The two main positives for me
were:

1) ZFS snapshots take significantly less time to create; I'm talking
seconds or minutes vs. 30-45 minutes. I also remember receiving mail
from someone (on -hackers? I can't remember -- let me know and I can
dig through my mail archives for the specific mail/details) stating
something along the lines of "over time, yes, UFS2 snapshots take
longer and longer; it's a known design problem".

2) Creating a ZFS snapshot does not cause the system to more or less
deadlock while the snapshot is being generated; you can continue to use
the system the whole time. UFS2's dump -L and mksnap_ffs, on the other
hand, will surely disappoint you.

We moved all of our production systems off of dump/restore solely
because of these aspects. We didn't move to ZFS, though; we went with
rsync, which is great, except for the fact that it modifies file atimes
(hope you use Maildir and not classic mbox/mail spools...).

ZFS's send/recv capability (over a network) is something I didn't have
time to experiment with, but it looked *very* promising. The method is
documented in the manpage as "Example 12", and is very simple -- as it
should be. You don't have to use SSH either, by the way[1].
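Roughly, the workflow looks like the following sketch (loosely based on
what the manpage describes; the dataset "tank/data", the snapshot names,
and the host "slave" are made up here purely for illustration):

  # initial full copy of a snapshot to the slave
  zfs snapshot tank/data@snap1
  zfs send tank/data@snap1 | ssh slave zfs recv tank/data

  # later: take another snapshot and send only the changes since snap1
  # (you may need 'zfs recv -F' if the slave copy has been touched)
  zfs snapshot tank/data@snap2
  zfs send -i tank/data@snap1 tank/data@snap2 | ssh slave zfs recv tank/data

How often you could run the incremental step on a busy filesystem is
something you'd have to measure yourself; the incremental stream only
carries the blocks that changed between the two snapshots, so the cost
scales with the amount of change rather than the size of the filesystem.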
One of the "annoyances" of ZFS snapshots, however, was that I had to
write my own script to do snapshot rotations (think incremental dump(8),
but using ZFS snapshots).

> I would be interested to hear anyone's experience with any (or all) of
> these methods and caveats of each. I am leaning towards ggate(dc) +
> zpool at the moment, assuming that zfs can "smartly" rebuild the mirror
> after the slave's ggated processes bug out.

I don't have any experience with GEOM gate, so I can't comment on it.
But I would highly recommend you discuss its shortcomings with pjd@,
because he definitely listens.

However, I must ask you this: why are you doing things the way you are?
Why are you using the equivalent of RAID 1, but spanning entire
computers? Is there some reason you aren't using a filer (e.g. NetApp)
for your data, thus keeping it centralised? There has been recent
discussion of using FreeBSD with ZFS as such a filer over on freebsd-fs.
If you want a link to the thread, I can point you to it.

I'd like to know why you're doing things the way you are. By knowing
why, possibly myself or others could recommend solving the problem in a
different way -- one that doesn't involve real-time duplication of
filesystems over the network.

[1]: If you're transferring huge sums of data over a secure link (read:
a dedicated gigE LAN or a separate VLAN), you'll be disappointed to find
that there is no Cipher=none with stock SSH; the closest you'll get is
blowfish-cbc. You might be further saddened by the fact that the only
way you'll get Cipher=none is via the HPN patches, which means you'll be
forced to install ports/security/openssh-portable. (I am not a fan of
the "overwrite the base system" concept; it's a hack, and I'd rather get
rid of the whole "base system" concept in general -- but that's for
another discussion.) My point is: your overall network I/O will be
limited by SSH, so if you're pushing lots of data across a LAN, consider
something without encryption.

-- 
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |
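P.S. Regarding [1]: if you decide encryption really isn't needed on a
dedicated/trusted segment, one approach is to pipe the send stream
through nc(1) instead of ssh. A rough sketch (the host name and port
number are made up, the stream is completely unauthenticated and
unencrypted, and nc's listen syntax varies a bit between versions):

  # on the slave: listen and receive the stream
  nc -l 8023 | zfs recv tank/data

  # on the master: send the snapshot stream to the slave
  zfs send tank/data@snap1 | nc slave 8023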