Date: Tue, 19 Jan 2016 16:02:05 -0500
From: Nikolai Lifanov <lifanov@mail.lifanov.com>
To: Kurt Lidl <lidl@pix.net>
Cc: Alan Somers <asomers@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>
Subject: Re: svn commit: r294329 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys
Message-ID: <569EA44D.3070500@mail.lifanov.com>
In-Reply-To: <569EA207.5010304@pix.net>
References: <201601191700.u0JH0P6k061610@repo.freebsd.org> <CAOtMX2jU3Rm0as2-oTgBt=Xe5_kgneAY7aB_NrjCy+aXNHD3KA@mail.gmail.com> <569EA207.5010304@pix.net>
On 01/19/16 15:52, Kurt Lidl wrote:
> On 1/19/16 1:55 PM, Alan Somers wrote:
>> On Tue, Jan 19, 2016 at 10:00 AM, Alan Somers <asomers@freebsd.org>
>> wrote:
>>> Author: asomers
>>> Date: Tue Jan 19 17:00:25 2016
>>> New Revision: 294329
>>> URL: https://svnweb.freebsd.org/changeset/base/294329
>>>
>>> Log:
>>>   Disallow zvol-backed ZFS pools
>>>
>>>   Using zvols as backing devices for ZFS pools is fraught with
>>>   panics and deadlocks. For example, attempting to online a missing
>>>   device in the presence of a zvol can cause a panic when vdev_geom
>>>   tastes the zvol. Better to completely disable vdev_geom from ever
>>>   opening a zvol. The solution relies on setting a thread-local
>>>   variable during vdev_geom_open, and returning EOPNOTSUPP during
>>>   zvol_open if that thread-local variable is set.
>>>
>>>   Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open.
>>>   Its intent was to prevent a recursive mutex acquisition panic.
>>>   However, the new check for the thread-local variable also fixes
>>>   that problem.
>>>
>>>   Also, fix a panic in vdev_geom_taste_orphan. For an unknown
>>>   reason, this function was set to panic. But it can occur that a
>>>   device disappears during tasting, and it causes no problems to
>>>   ignore this departure.
>>>
>>>   Reviewed by:    delphij
>>>   MFC after:      1 week
>>>   Relnotes:       yes
>>>   Sponsored by:   Spectra Logic Corp
>>>   Differential Revision:  https://reviews.freebsd.org/D4986
>>>
>>> Modified:
>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
>>>
>>> Modified:
>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>
>> Due to popular demand, I will conditionalize this behavior on a
>> sysctl, and I won't MFC it. The sysctl must default to off (ZFS on
>> zvols not allowed) because having the ability to put pools on zvols
>> can cause panics even for users who aren't using it.
>
> Thank you!
>
>> And let me clear up some confusion:
>>
>> 1) Having the ability to put a zpool on a zvol can cause panics and
>> deadlocks, even if that ability is unused.
>> 2) Putting a zpool atop a zvol causes unnecessary performance
>> problems because there are two layers of COW involved, with all
>> their software complexities. This also applies to putting a zpool
>> atop files on a ZFS filesystem.
>> 3) A VM guest putting a zpool on its virtual disk, where the VM host
>> backs that virtual disk with a zvol, will work fine. That's the
>> ideal use case for zvols.
>> 3b) Using ZFS on both host and guest isn't ideal for performance, as
>> described in item 2. That's why I prefer to use UFS for VM guests.
>
> The patch as is does very much break the way some people do
> operations on zvols. My script that does virtual machine cloning via
> snapshots of zvols containing zpools is currently broken due to this.
> (I upgraded one of my dev hosts right after your commit, to verify
> the broken behavior.)
>
> In my script, I boot an auto-install .iso into bhyve:
>
> bhyve -c 2 -m ${vmmem} -H -A -I -g 0 \
>     -s 0:0,hostbridge \
>     -s 1,lpc -l com1,stdio \
>     -s 2:0,virtio-net,${template_tap} \
>     -s 3:0,ahci-hd,"${zvol}" \
>     -s 4:0,ahci-cd,"${isofile}" \
>     ${vmname} || \
>     echo "trapped error exit from bhyve: $?"
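(For context, a minimal sketch of how a backing zvol for this kind of
script might be created beforehand -- the dataset name and size here
are hypothetical:)

    # create a sparse 20G zvol under the host pool "zdata"; bhyve is
    # then pointed at /dev/zvol/zdata/vm0 as the guest's disk
    zfs create -s -V 20G zdata/vm0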
> So, yes, the zpool gets created by the client VM. Then on the
> hypervisor host, the script imports that zpool and renames it, so
> that I can have different pool names for all the client VMs. This
> step now fails:
>
> + zpool import -R /virt/base -d /dev/zvol/zdata sys base
> cannot import 'sys' as 'base': no such pool or dataset
>         Destroy and re-create the pool from
>         a backup source.
>
> I import the clients' zpools after the zpools on them have been
> renamed, so the hypervisor host can manipulate the files directly.
> It only disturbs a small amount of the disk blocks on each of the
> snapshots of the zvol to rename the zpools.
>
> In this way, I can instantiate ~30 virtual machines from a custom
> install.iso image in less than 3 minutes. And the bulk of that time
> is doing the installation from the custom install.iso into the first
> virtual machine. The cloning of the zvols, and manipulation of the
> resulting filesystems is very fast.
>
> -Kurt

Can't you just set volmode=dev and use zfs clone?
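A minimal sketch of what I mean -- dataset and snapshot names are
hypothetical, and I'm assuming the template zvol sketched above:

    # expose the zvol as a raw character device only, so GEOM (and
    # therefore vdev_geom) never tastes the guest's pool on the host
    zfs set volmode=dev zdata/vm0
    # snapshot the installed template...
    zfs snapshot zdata/vm0@template
    # ...and clone it once per guest, setting volmode=dev on each
    # clone too; the new guest disk appears as /dev/zvol/zdata/vm1
    zfs clone -o volmode=dev zdata/vm0@template zdata/vm1

Since the host never imports the guests' pools at all, the clones can
all keep the same internal pool name and the rename step goes away.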