Date: Tue, 19 Jan 2016 15:52:23 -0500
From: Kurt Lidl <lidl@pix.net>
To: Alan Somers <asomers@freebsd.org>
Cc: "src-committers@freebsd.org" <src-committers@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>
Subject: Re: svn commit: r294329 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys
Message-ID: <569EA207.5010304@pix.net>
In-Reply-To: <CAOtMX2jU3Rm0as2-oTgBt=Xe5_kgneAY7aB_NrjCy%2BaXNHD3KA@mail.gmail.com>
References: <201601191700.u0JH0P6k061610@repo.freebsd.org> <CAOtMX2jU3Rm0as2-oTgBt=Xe5_kgneAY7aB_NrjCy%2BaXNHD3KA@mail.gmail.com>
On 1/19/16 1:55 PM, Alan Somers wrote:
> On Tue, Jan 19, 2016 at 10:00 AM, Alan Somers <asomers@freebsd.org> wrote:
>> Author: asomers
>> Date: Tue Jan 19 17:00:25 2016
>> New Revision: 294329
>> URL: https://svnweb.freebsd.org/changeset/base/294329
>>
>> Log:
>>   Disallow zvol-backed ZFS pools
>>
>>   Using zvols as backing devices for ZFS pools is fraught with panics and
>>   deadlocks. For example, attempting to online a missing device in the
>>   presence of a zvol can cause a panic when vdev_geom tastes the zvol. Better
>>   to completely disable vdev_geom from ever opening a zvol. The solution
>>   relies on setting a thread-local variable during vdev_geom_open, and
>>   returning EOPNOTSUPP during zvol_open if that thread-local variable is set.
>>
>>   Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open. Its intent
>>   was to prevent a recursive mutex acquisition panic. However, the new check
>>   for the thread-local variable also fixes that problem.
>>
>>   Also, fix a panic in vdev_geom_taste_orphan. For an unknown reason, this
>>   function was set to panic. But it can occur that a device disappears during
>>   tasting, and it causes no problems to ignore this departure.
>>
>>   Reviewed by: delphij
>>   MFC after: 1 week
>>   Relnotes: yes
>>   Sponsored by: Spectra Logic Corp
>>   Differential Revision: https://reviews.freebsd.org/D4986
>>
>> Modified:
>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
>>
>> Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>
> Due to popular demand, I will conditionalize this behavior on a
> sysctl, and I won't MFC it. The sysctl must default to off (ZFS on
> zvols not allowed) because having the ability to put pools on zvols
> can cause panics even for users who aren't using it.

Thank you!

> And let me clear up some confusion:
>
> 1) Having the ability to put a zpool on a zvol can cause panics and
> deadlocks, even if that ability is unused.
> 2) Putting a zpool atop a zvol causes unnecessary performance problems
> because there are two layers of COW involved, with all their software
> complexities. This also applies to putting a zpool atop files on a
> ZFS filesystem.
> 3) A VM guest putting a zpool on its virtual disk, where the VM host
> backs that virtual disk with a zvol, will work fine. That's the ideal
> use case for zvols.
> 3b) Using ZFS on both host and guest isn't ideal for performance, as
> described in item 2. That's why I prefer to use UFS for VM guests.

The patch as is does very much break the way some people do operations
on zvols. My script that does virtual machine cloning via snapshots
of zvols containing zpools is currently broken due to this. (I upgraded
one of my dev hosts right after your commit, to verify the broken
behavior.)

In my script, I boot an auto-install .iso into bhyve:

    bhyve -c 2 -m ${vmmem} -H -A -I -g 0 \
        -s 0:0,hostbridge \
        -s 1,lpc -l com1,stdio \
        -s 2:0,virtio-net,${template_tap} \
        -s 3:0,ahci-hd,"${zvol}" \
        -s 4:0,ahci-cd,"${isofile}" \
        ${vmname} || \
        echo "trapped error exit from bhyve: $?"

So, yes, the zpool gets created by the client VM. Then on the
hypervisor host, the script imports that zpool and renames it,
so that I can have different pool names for all the client VMs.
This step now fails:

    + zpool import -R /virt/base -d /dev/zvol/zdata sys base
    cannot import 'sys' as 'base': no such pool or dataset
            Destroy and re-create the pool from a backup source.

I import the clients' zpools after the zpools on them have been
renamed, so the hypervisor host can manipulate the files directly.
It only disturbs a small amount of the disk blocks on each of the
snapshots of the zvol to rename the zpools.

In this way, I can instantiate ~30 virtual machines from a custom
install.iso image in less than 3 minutes. And the bulk of that time
is doing the installation from the custom install.iso into the first
virtual machine. The cloning of the zvols, and manipulation of the
resulting filesystems is very fast.

-Kurt
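
For readers following along: the failing import-and-rename step in this message has the general shape "zpool import -R <altroot> -d <device dir> <old pool name> <new pool name>". The snippet below is only a minimal dry-run sketch of that one step. It prints the command rather than running it, and import_rename_cmd is a hypothetical helper name that does not come from the original script.

```shell
#!/bin/sh
# Dry-run sketch: prints the zpool command instead of executing it.
# import_rename_cmd is a hypothetical helper, not part of the original script.
#   $1 = alternate root for the imported pool  (zpool import -R)
#   $2 = directory to search for pool devices  (zpool import -d)
#   $3 = pool name as created inside the guest
#   $4 = new pool name to use on the hypervisor host
import_rename_cmd() {
    echo "zpool import -R $1 -d $2 $3 $4"
}

# Rebuilds the command shown in the trace in this message.
import_rename_cmd /virt/base /dev/zvol/zdata sys base
```

Giving a second pool name to zpool import is what renames the pool on import, which is why the step has to open the zvol as a pool device on the host.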
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?569EA207.5010304>