Date:        Tue, 19 Jan 2016 16:25:23 -0500
From:        Nikolai Lifanov <lifanov@mail.lifanov.com>
To:          Kurt Lidl <lidl@pix.net>
Cc:          "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>,
             "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>,
             "src-committers@freebsd.org" <src-committers@freebsd.org>,
             Alan Somers <asomers@freebsd.org>
Subject:     Re: svn commit: r294329 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys
Message-ID:  <569EA9C3.5040003@mail.lifanov.com>
In-Reply-To: <569EA44D.3070500@mail.lifanov.com>
References:  <201601191700.u0JH0P6k061610@repo.freebsd.org>
             <CAOtMX2jU3Rm0as2-oTgBt=Xe5_kgneAY7aB_NrjCy+aXNHD3KA@mail.gmail.com>
             <569EA207.5010304@pix.net>
             <569EA44D.3070500@mail.lifanov.com>
On 01/19/16 16:02, Nikolai Lifanov wrote:
> On 01/19/16 15:52, Kurt Lidl wrote:
>> On 1/19/16 1:55 PM, Alan Somers wrote:
>>> On Tue, Jan 19, 2016 at 10:00 AM, Alan Somers <asomers@freebsd.org> wrote:
>>>> Author: asomers
>>>> Date: Tue Jan 19 17:00:25 2016
>>>> New Revision: 294329
>>>> URL: https://svnweb.freebsd.org/changeset/base/294329
>>>>
>>>> Log:
>>>>   Disallow zvol-backed ZFS pools
>>>>
>>>>   Using zvols as backing devices for ZFS pools is fraught with panics and
>>>>   deadlocks. For example, attempting to online a missing device in the
>>>>   presence of a zvol can cause a panic when vdev_geom tastes the zvol.
>>>>   Better to completely disable vdev_geom from ever opening a zvol. The
>>>>   solution relies on setting a thread-local variable during vdev_geom_open,
>>>>   and returning EOPNOTSUPP during zvol_open if that thread-local variable
>>>>   is set.
>>>>
>>>>   Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open. Its
>>>>   intent was to prevent a recursive mutex acquisition panic. However, the
>>>>   new check for the thread-local variable also fixes that problem.
>>>>
>>>>   Also, fix a panic in vdev_geom_taste_orphan. For an unknown reason, this
>>>>   function was set to panic. But it can occur that a device disappears
>>>>   during tasting, and it causes no problems to ignore this departure.
>>>>
>>>>   Reviewed by:  delphij
>>>>   MFC after:    1 week
>>>>   Relnotes:     yes
>>>>   Sponsored by: Spectra Logic Corp
>>>>   Differential Revision: https://reviews.freebsd.org/D4986
>>>>
>>>> Modified:
>>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
>>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
>>>>
>>>> Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>>
>>> Due to popular demand, I will conditionalize this behavior on a
>>> sysctl, and I won't MFC it.  The sysctl must default to off (ZFS on
>>> zvols not allowed) because having the ability to put pools on zvols
>>> can cause panics even for users who aren't using it.
>>
>> Thank you!
>>
>>> And let me clear up some confusion:
>>>
>>> 1) Having the ability to put a zpool on a zvol can cause panics and
>>> deadlocks, even if that ability is unused.
>>> 2) Putting a zpool atop a zvol causes unnecessary performance problems
>>> because there are two layers of COW involved, with all their software
>>> complexities.  This also applies to putting a zpool atop files on a
>>> ZFS filesystem.
>>> 3) A VM guest putting a zpool on its virtual disk, where the VM host
>>> backs that virtual disk with a zvol, will work fine.  That's the ideal
>>> use case for zvols.
>>> 3b) Using ZFS on both host and guest isn't ideal for performance, as
>>> described in item 2.  That's why I prefer to use UFS for VM guests.
>>
>> The patch as is does very much break the way some people do operations
>> on zvols.  My script that does virtual machine cloning via snapshots
>> of zvols containing zpools is currently broken due to this.  (I upgraded
>> one of my dev hosts right after your commit, to verify the broken
>> behavior.)
>>
>> In my script, I boot an auto-install .iso into bhyve:
>>
>> bhyve -c 2 -m ${vmmem} -H -A -I -g 0 \
>>     -s 0:0,hostbridge \
>>     -s 1,lpc -l com1,stdio \
>>     -s 2:0,virtio-net,${template_tap} \
>>     -s 3:0,ahci-hd,"${zvol}" \
>>     -s 4:0,ahci-cd,"${isofile}" \
>>     ${vmname} || \
>>     echo "trapped error exit from bhyve: $?"
>>
>> So, yes, the zpool gets created by the client VM.  Then on
>> the hypervisor host, the script imports that zpool and renames it,
>> so that I can have different pool names for all the client VMs.
>> This step now fails:
>>
>> + zpool import -R /virt/base -d /dev/zvol/zdata sys base
>> cannot import 'sys' as 'base': no such pool or dataset
>>         Destroy and re-create the pool from
>>         a backup source.
>>
>> I import the clients' zpools after the zpools on them have
>> been renamed, so the hypervisor host can manipulate the
>> files directly.  It only disturbs a small amount of the
>> disk blocks on each of the snapshots of the zvol to rename
>> the zpools.
>>
>> In this way, I can instantiate ~30 virtual machines from
>> a custom install.iso image in less than 3 minutes.  And
>> the bulk of that time is doing the installation from the
>> custom install.iso into the first virtual machine.  The
>> cloning of the zvols, and manipulation of the resulting
>> filesystems is very fast.
>>
>
> Can't you just set volmode=dev and use zfs clone?
>

Never mind, you want different pool names to manipulate files directly.

>> -Kurt
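
For reference, a minimal sketch of the snapshot/clone/rename workflow
being discussed.  The dataset names (zdata/vm-template, @gold, zdata/base)
are illustrative placeholders; only the zpool import line and the pool
names sys/base come from Kurt's output above:

  # Snapshot the template zvol on which the guest installer created its pool ("sys")
  zfs snapshot zdata/vm-template@gold

  # Clone the snapshot to back a new VM instance; the clone shares blocks
  # with the template, so this step is nearly instantaneous
  zfs clone zdata/vm-template@gold zdata/base

  # On the hypervisor host, import the guest's pool under a new, unique name,
  # rooted at an alternate mountpoint so host filesystems are left alone
  zpool import -R /virt/base -d /dev/zvol/zdata sys base

  # ...edit files under /virt/base as needed, then release the pool
  # before handing the zvol to bhyve...
  zpool export base

It is the zpool import step that r294329 breaks, since vdev_geom now
refuses to open zvols; the sysctl Alan proposes above would let setups
like this opt back in.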