Date:      Tue, 19 Jan 2016 16:02:05 -0500
From:      Nikolai Lifanov <lifanov@mail.lifanov.com>
To:        Kurt Lidl <lidl@pix.net>
Cc:        Alan Somers <asomers@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>
Subject:   Re: svn commit: r294329 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys
Message-ID:  <569EA44D.3070500@mail.lifanov.com>
In-Reply-To: <569EA207.5010304@pix.net>
References:  <201601191700.u0JH0P6k061610@repo.freebsd.org> <CAOtMX2jU3Rm0as2-oTgBt=Xe5_kgneAY7aB_NrjCy+aXNHD3KA@mail.gmail.com> <569EA207.5010304@pix.net>

On 01/19/16 15:52, Kurt Lidl wrote:
> On 1/19/16 1:55 PM, Alan Somers wrote:
>> On Tue, Jan 19, 2016 at 10:00 AM, Alan Somers <asomers@freebsd.org>
>> wrote:
>>> Author: asomers
>>> Date: Tue Jan 19 17:00:25 2016
>>> New Revision: 294329
>>> URL: https://svnweb.freebsd.org/changeset/base/294329
>>>
>>> Log:
>>>    Disallow zvol-backed ZFS pools
>>>
>>>    Using zvols as backing devices for ZFS pools is fraught with
>>> panics and
>>>    deadlocks. For example, attempting to online a missing device in the
>>>    presence of a zvol can cause a panic when vdev_geom tastes the
>>> zvol.  Better
>>>    to completely disable vdev_geom from ever opening a zvol. The
>>> solution
>>>    relies on setting a thread-local variable during vdev_geom_open, and
>>>    returning EOPNOTSUPP during zvol_open if that thread-local
>>> variable is set.
>>>
>>>    Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open.
>>> Its intent
>>>    was to prevent a recursive mutex acquisition panic. However, the
>>> new check
>>>    for the thread-local variable also fixes that problem.
>>>
>>>    Also, fix a panic in vdev_geom_taste_orphan. For an unknown
>>> reason, this
>>>    function was set to panic. But it can occur that a device
>>> disappears during
>>>    tasting, and it causes no problems to ignore this departure.
>>>
>>>    Reviewed by:  delphij
>>>    MFC after:    1 week
>>>    Relnotes:     yes
>>>    Sponsored by: Spectra Logic Corp
>>>    Differential Revision:        https://reviews.freebsd.org/D4986
>>>
>>> Modified:
>>>    head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>>    head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
>>>    head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>>    head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
>>>
>>> Modified:
>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>
>> Due to popular demand, I will conditionalize this behavior on a
>> sysctl, and I won't MFC it.  The sysctl must default to off (ZFS on
>> zvols not allowed) because having the ability to put pools on zvols
>> can cause panics even for users who aren't using it.
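>>
>> From userland it would look something like this (the sysctl name
>> below is a placeholder of mine until the follow-up change actually
>> lands):
>>
>>         # opt in to zvol-backed pools on this host (default: off)
>>         sysctl vfs.zfs.vol.recursive=1
>>         # persist the setting across reboots
>>         echo 'vfs.zfs.vol.recursive=1' >> /etc/sysctl.conf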
> 
> Thank you!
> 
>> And let me clear up some confusion:
>>
>> 1) Having the ability to put a zpool on a zvol can cause panics and
>> deadlocks, even if that ability is unused.
>> 2) Putting a zpool atop a zvol causes unnecessary performance problems
>> because there are two layers of COW involved, with all their software
>> complexities.  This also applies to putting a zpool atop files on a
>> ZFS filesystem.
>> 3) A VM guest putting a zpool on its virtual disk, where the VM host
>> backs that virtual disk with a zvol, will work fine.  That's the ideal
>> use case for zvols (see the sketch after this list).
>> 3b) Using ZFS on both host and guest isn't ideal for performance, as
>> described in item 2.  That's why I prefer to use UFS for VM guests.
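>>
>> A minimal version of that ideal setup looks like this (pool and
>> guest names invented for the example):
>>
>>         # on the host: create a 20G zvol in the pool "tank"
>>         zfs create -V 20G tank/guest0
>>         # hand it to the guest as its virtual disk
>>         bhyve ... -s 3:0,ahci-hd,/dev/zvol/tank/guest0 ... guest0
>>
>> The guest sees an ordinary disk and can run UFS (my preference, per
>> item 3b) or its own zpool on it.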
> 
> The patch as-is very much breaks the way some people do operations
> on zvols.  My script, which clones virtual machines via snapshots
> of zvols containing zpools, is now broken because of it.  (I upgraded
> one of my dev hosts right after your commit to verify the broken
> behavior.)
> 
> In my script, I boot an auto-install .iso into bhyve:
> 
>         bhyve -c 2 -m ${vmmem} -H -A -I -g 0 \
>                 -s 0:0,hostbridge \
>                 -s 1,lpc -l com1,stdio \
>                 -s 2:0,virtio-net,${template_tap} \
>                 -s 3:0,ahci-hd,"${zvol}" \
>                 -s 4:0,ahci-cd,"${isofile}" \
>                 ${vmname} || \
>                 echo "trapped error exit from bhyve: $?"
> 
> So, yes, the zpool gets created by the client VM.  Then on
> the hypervisor host, the script imports that zpool and renames it,
> so that I can have different pool names for all the client VMs.
> This step now fails:
> 
> + zpool import -R /virt/base -d /dev/zvol/zdata sys base
> cannot import 'sys' as 'base': no such pool or dataset
>     Destroy and re-create the pool from
>     a backup source.
> 
> I import the clients' zpools only after they have been renamed,
> so the hypervisor host can manipulate the files directly.
> Renaming the zpools disturbs only a small number of disk
> blocks in each snapshot of the zvol.
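> 
> Condensed, the rename amounts to an import under a new name
> followed by an export (paths as in my script above):
> 
>         zpool import -R /virt/base -d /dev/zvol/zdata sys base
>         # ...manipulate files under /virt/base as needed...
>         zpool export base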
> 
> In this way, I can instantiate ~30 virtual machines from
> a custom install.iso image in less than 3 minutes, and
> the bulk of that time is spent installing from the
> custom install.iso into the first virtual machine.
> Cloning the zvols and manipulating the resulting
> filesystems is very fast.
> 

Can't you just set volmode=dev and use zfs clone?
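
Something along these lines, perhaps (dataset names invented; I
haven't tried this against your exact setup):

        # expose the template zvol as a plain device node only,
        # so GEOM never tastes it
        zfs set volmode=dev zdata/sys
        # snapshot it and hand each VM a cheap clone
        zfs snapshot zdata/sys@gold
        zfs clone -o volmode=dev zdata/sys@gold zdata/vm01

The clone still appears as /dev/zvol/zdata/vm01 for bhyve to attach,
but GEOM never tastes it, which is exactly the case this commit
disallows.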

> -Kurt