Date: Wed, 3 Jul 2013 10:59:07 +0200
From: Markus Gebert <markus.gebert@hostpoint.ch>
To: Kevin Day <toasty@dragondata.com>
Cc: freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze?
Message-ID: <14A2336A-969C-4A13-9EFA-C0C42A12039F@hostpoint.ch>
In-Reply-To: <A5A66641-5EF9-454E-A767-009480EE404E@dragondata.com>
References: <87li5o5tz2.wl%berend@pobox.com>
 <CA+tpaK1jQuKneQsxkVfxJGzXdPdLZfqBM1QWQ0e19nK5t71t1Q@mail.gmail.com>
 <87ehbg5raq.wl%berend@pobox.com>
 <20130703055047.GA54853@icarus.home.lan>
 <6488DECC-2455-4E92-B432-C39490D18484@dragondata.com>
 <CADBaqmihCB5JP01hLwXTWHoZiJJ5-jkT-Ro=oDwOcKZT_zvEKA@mail.gmail.com>
 <A5A66641-5EF9-454E-A767-009480EE404E@dragondata.com>

On 03.07.2013, at 09:02, Kevin Day <toasty@dragondata.com> wrote:

> On Jul 3, 2013, at 1:53 AM, Will Andrews <will@firepipe.net> wrote:
>
>> On Wednesday, July 3, 2013, Kevin Day wrote:
>> The closest thing we can do in FreeBSD is to unmount the filesystem,
>> take the snapshot, and remount. This has the side effect of closing
>> all open files, so it's not really an alternative.
>>
>> The other option is to not freeze the filesystem before taking the
>> snapshot, but again you risk leaving things in an inconsistent state,
>> and/or the last few writes you think you made didn't actually get
>> committed to disk yet. For automated systems that create then clone
>> filesystems for new VMs, this can be a big problem. At best, you're
>> going to get a warning that the filesystem wasn't cleanly unmounted.
>>
>> Actually, sync(2)/sync(8) will do the job on ZFS. It won't stop/pause
>> I/O running in other contexts, but it does guarantee that any commands
>> you ran and completed prior to calling sync will make it to disk in
>> ZFS.
>>
>> This is because sync in ZFS is implemented as a ZIL commit, so
>> transactions that haven't yet made it to disk via the normal syncing
>> context will at least be committed via their ZIL blocks. Which can
>> then be replayed when the pool is imported later, in this case from
>> the EBS snapshots.
>>
>> And since the entire tree from the überblock down in ZFS is COW, you
>> can't get an inconsistent pool simply by doing a virtual disk
>> snapshot, regardless of how that is implemented.
>>
>> --Will.
>
> Sorry, yes, this is true. We're not using ZFS to clone and provision
> new VMs, so I was just thinking about UFS here. And ZFS does have a
> good advantage here that it seems to actually respect sync requests. I
> think it was here I reported a few months ago that we were seeing
> UFS+SUJ not actually doing anything when sync(8) was called.
>
> But for some workloads this still isn't sufficient if you have
> processes running that could be writing at any time. As an example, we
> have a database server using ZFS backed storage. Short of shutting
> down the server, there's no way to guarantee it won't try to write
> even if we lock all tables, disconnect all clients, etc. mysql has all
> sorts of things done on timers that occur lazily in the future,
> including periodic checkpoint writes even if there is no activity.
>
> I know this is a sort of obscure use case, but Linux and Windows both
> have this functionality that VMWare will use if present (and the guest
> tools know about it). Linux goes a step further and ensures that it's
> not in the middle of writing anything to swap during the quiesce
> period, too. I don't think this would be terribly difficult to
> implement, a hook somewhere along the write chain that blocks (or
> queues up) anything trying to write until the unfreeze comes along,
> but I'm guessing there are all sorts of deadlock opportunities here.

Indeed, sync(8) has the disadvantage that you cannot prevent writes
between the syscall and the EBS snapshot, so depending on the
application, this can make the resulting EBS snapshot useless.

But taking a zfs snapshot is an atomic operation. Why not use that? For
example:

1. snapshot the zfs at the same point in time you'd issue that ioctl on
   Linux
2. take the EBS snapshot at any time
3. clone the EBS snapshot to the new/other VM
4. zpool import the pool there
5. zfs rollback the filesystem to the snapshot taken in step 1 (or
   clone it and use that)

Any writes issued between the zfs snapshot and the EBS snapshot are
discarded, so you get exactly the same filesystem data as you would
have gotten with the ioctl. Also, taking the zfs snapshot should take
much less time, because you don't have to wait for the EBS snapshot to
complete before you can resume IO on the filesystem.
So you don't even depend on EBS snapshots being quick when using the
zfs approach, a big advantage in my opinion.


Markus
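
To make the five steps above concrete, here is a minimal sketch in Python that only builds the zfs/zpool command lines for the two hosts and runs nothing. The pool name "tank", the filesystem "db", the snapshot name "pre-ebs" and the helper snapshot_backup_plan are all made up for illustration. Steps 2 and 3 (taking and cloning the EBS snapshot) happen in whatever AWS tooling you use and are left out on purpose.

```python
import shlex


def snapshot_backup_plan(pool, filesystem, snap, clone_name=None):
    """Build (but do not run) the zfs/zpool commands for the steps above.

    All names passed in are hypothetical examples. Returns two lists of
    argv lists: commands for the source host and for the target host.
    """
    snapshot = f"{pool}/{filesystem}@{snap}"

    # Step 1, on the source host: the atomic zfs snapshot.
    source_host = [["zfs", "snapshot", snapshot]]

    # Steps 2 and 3 (EBS snapshot, attach its clone to the other VM)
    # are done outside of ZFS and are not modelled here.

    # Step 4, on the target host: import the pool from the cloned volume.
    target_host = [["zpool", "import", pool]]

    # Step 5: either roll back to the snapshot or clone it.
    if clone_name is None:
        # Assumes no newer zfs snapshot exists on the filesystem;
        # otherwise "zfs rollback" needs -r, which destroys the newer ones.
        target_host.append(["zfs", "rollback", snapshot])
    else:
        target_host.append(["zfs", "clone", snapshot, f"{pool}/{clone_name}"])

    return source_host, target_host


source_cmds, target_cmds = snapshot_backup_plan("tank", "db", "pre-ebs")
for cmd in source_cmds + target_cmds:
    print(shlex.join(cmd))
```

Passing clone_name (for example "db-restore", again a made-up name) gives the "clone it and use that" variant of step 5, which leaves the imported filesystem untouched.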