From: George Hartzell <hartzell@alerce.com>
Date: Mon, 7 Jul 2008 10:18:52 -0700
To: freebsd-fs@freebsd.org
Subject: using zfs and unionfs together, does zfs need to be extended?
Message-ID: <18546.20476.590665.29995@almost.alerce.com>
Reply-To: hartzell@alerce.com
List-Id: Filesystems <freebsd-fs@freebsd.org>

I'd like to be able to set up a largish number of very similar jails
with a minimum of fuss, while taking advantage of ZFS's cool features.
I'd like to use unionfs to do this, but ZFS's lack of whiteout support
seems to make that impossible.
[jump to the bottom if you want to skip the setup and get to the
questions]

It seems like the most popular way to set up jails these days uses
read-only nullfs mounts of a base system, with symbolic links into a
read-write nullfs mount for each jail's specific stuff (/etc,
/usr/local, etc...). These approaches are well described in:

  http://erdgeist.org/arts/software/ezjail
  http://www.freebsd.org/doc/en/books/handbook/jails-application.html

and they work fine with zfs based storage.

It's also possible to use unionfs to layer jail-specific storage over
a base system. While this approach gives more per-jail flexibility and
avoids having to relocate various directories in the base system,
various unionfs problems seem to have pushed it out of favor. The
ongoing work of daichi@freebsd.org et al. that fixes various problems
with unionfs,

  http://people.freebsd.org/~daichi/unionfs/

makes it look as if this approach might now be safe, using something
like:

  mount -t unionfs -o below,noatime /usr/jails/base /usr/jails/www

The obvious zfs analog to this:

  mount -t unionfs -o below,noatime /tank/jails/base /tank/jails/www

fails with:

  mount_unionfs: /tank/jails/www: Operation not supported

A bit of digging suggests that the mount fails when the unionfs code
checks whether /tank/jails/www supports whiteouts. The fact that this
check is skipped when the uniondir is read-only provides a way to
superficially confirm that whiteouts are the only problem; this:

  mount -t unionfs -o ro,below,noatime /tank/jails/base /tank/jails/www

does indeed seem to lead to a working [albeit read-only] union mount.
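To make the above easy to poke at, here's a minimal sketch of a helper
that builds the unionfs mount command for a given jail. The function
name and the echo-only run() wrapper are mine (a dry run, since the
real mount needs root and the ZFS layout above); the paths and mount
options come straight from the commands in this message.

```shell
# Dry-run wrapper: print the command instead of executing it.
# Swap in  run() { "$@"; }  to actually perform the mounts as root.
run() { echo "$@"; }

# Hypothetical helper: union-mount a shared read-only base below a
# per-jail writable layer, as described above.
union_jail() {
    base=$1   # read-only lower layer, e.g. /tank/jails/base
    jail=$2   # per-jail upper layer, e.g. /tank/jails/www
    # NB: with a ZFS uniondir this fails with "Operation not supported"
    # unless mounted read-only, since ZFS lacks whiteout support.
    run mount -t unionfs -o below,noatime "$base" "$jail"
}

union_jail /tank/jails/base /tank/jails/www
```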
One can work around the problem by creating a ZFS volume, building a
UFS filesystem on it, and then using that as the uniondir, e.g.:

  zfs create -V 5G tank/jail/vol1
  newfs /dev/zvol/tank/jail/vol1
  mkdir /usr/jail/zvol-www
  mount /dev/zvol/tank/jail/vol1 /usr/jail/zvol-www/
  mount -t unionfs -o below,noatime /tank/jail/base/ /usr/jail/zvol-www

The upper layer is still [presumably, I haven't tested this yet]
snapshot-able, send-able, etc...., but this approach leaves me with a
bunch of UFS filesystems that need care and feeding (fsck, etc...).

So finally, the questions:

How hard would it be to add whiteout support to our ZFS? Is it "just"
a matter of understanding the places in the UFS code that do whiteout
things, locating the analogous places in the ZFS tree, and doing
similar things there (it seems to be a "simple" matter of
creating/destroying a whiteout vnode when necessary and checking for
it when appropriate), or is there something fundamentally harder about
it?

Has anyone already done it?

If it were doable/done cleanly, might it get committed?

Thanks,
g.
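For what it's worth, the zvol+UFS workaround above scripts up easily
per jail. This is a dry-run sketch; the function name, the vol-$name
naming convention, and the echo-only run() wrapper are my inventions,
while the pool/path layout (tank/jail, /usr/jail) and the command
sequence follow the example above.

```shell
# Dry-run wrapper: print each command rather than executing it.
# Swap in  run() { "$@"; }  to really create the zvol as root.
run() { echo "$@"; }

# Hypothetical helper: create a UFS-on-zvol upper layer for one jail
# and union it over the shared base, per the workaround above.
zvol_jail() {
    name=$1   # jail name, e.g. www
    size=$2   # zvol size, e.g. 5G
    run zfs create -V "$size" "tank/jail/vol-$name"
    run newfs "/dev/zvol/tank/jail/vol-$name"
    run mkdir -p "/usr/jail/zvol-$name"
    run mount "/dev/zvol/tank/jail/vol-$name" "/usr/jail/zvol-$name"
    # UFS supports whiteouts, so unlike a ZFS uniondir this succeeds:
    run mount -t unionfs -o below,noatime /tank/jail/base "/usr/jail/zvol-$name"
}

zvol_jail www 5G
```

The downside, of course, is exactly the one noted above: each jail now
carries a fixed-size UFS filesystem that needs its own fsck and sizing.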