From: Nikolai Lifanov
Date: Tue, 19 Jan 2016 16:02:05 -0500
To: Kurt Lidl
Cc: Alan Somers, "svn-src-head@freebsd.org", "svn-src-all@freebsd.org", "src-committers@freebsd.org"
Subject: Re: svn commit: r294329 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys
Message-ID: <569EA44D.3070500@mail.lifanov.com>
In-Reply-To: <569EA207.5010304@pix.net>
References: <201601191700.u0JH0P6k061610@repo.freebsd.org> <569EA207.5010304@pix.net>

On 01/19/16 15:52, Kurt Lidl wrote:
> On 1/19/16 1:55 PM, Alan Somers wrote:
>> On Tue, Jan 19, 2016 at 10:00 AM, Alan Somers wrote:
>>> Author: asomers
>>> Date: Tue Jan 19 17:00:25 2016
>>> New Revision: 294329
>>> URL: https://svnweb.freebsd.org/changeset/base/294329
>>>
>>> Log:
>>>   Disallow zvol-backed ZFS pools
>>>
>>>   Using zvols as backing devices for ZFS pools is fraught with panics and
>>>   deadlocks. For example, attempting to online a missing device in the
>>>   presence of a zvol can cause a panic when vdev_geom tastes the zvol. Better
>>>   to completely disable vdev_geom from ever opening a zvol. The solution
>>>   relies on setting a thread-local variable during vdev_geom_open, and
>>>   returning EOPNOTSUPP during zvol_open if that thread-local variable is set.
>>>
>>>   Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open. Its intent
>>>   was to prevent a recursive mutex acquisition panic. However, the new check
>>>   for the thread-local variable also fixes that problem.
>>>
>>>   Also, fix a panic in vdev_geom_taste_orphan. For an unknown reason, this
>>>   function was set to panic.
>>>   But it can occur that a device disappears during
>>>   tasting, and it causes no problems to ignore this departure.
>>>
>>>   Reviewed by: delphij
>>>   MFC after: 1 week
>>>   Relnotes: yes
>>>   Sponsored by: Spectra Logic Corp
>>>   Differential Revision: https://reviews.freebsd.org/D4986
>>>
>>> Modified:
>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
>>>
>>> Modified:
>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>
>> Due to popular demand, I will conditionalize this behavior on a
>> sysctl, and I won't MFC it. The sysctl must default to off (ZFS on
>> zvols not allowed) because having the ability to put pools on zvols
>> can cause panics even for users who aren't using it.
>
> Thank you!
>
>> And let me clear up some confusion:
>>
>> 1) Having the ability to put a zpool on a zvol can cause panics and
>> deadlocks, even if that ability is unused.
>> 2) Putting a zpool atop a zvol causes unnecessary performance problems
>> because there are two layers of COW involved, with all their software
>> complexities. This also applies to putting a zpool atop files on a
>> ZFS filesystem.
>> 3) A VM guest putting a zpool on its virtual disk, where the VM host
>> backs that virtual disk with a zvol, will work fine. That's the ideal
>> use case for zvols.
>> 3b) Using ZFS on both host and guest isn't ideal for performance, as
>> described in item 2. That's why I prefer to use UFS for VM guests.
>
> The patch as is does very much break the way some people do operations
> on zvols. My script that does virtual machine cloning via snapshots
> of zvols containing zpools is currently broken due to this. (I upgraded
> one of my dev hosts right after your commit, to verify the broken
> behavior.)
>
> In my script, I boot an auto-install .iso into bhyve:
>
> bhyve -c 2 -m ${vmmem} -H -A -I -g 0 \
>     -s 0:0,hostbridge \
>     -s 1,lpc -l com1,stdio \
>     -s 2:0,virtio-net,${template_tap} \
>     -s 3:0,ahci-hd,"${zvol}" \
>     -s 4:0,ahci-cd,"${isofile}" \
>     ${vmname} || \
>     echo "trapped error exit from bhyve: $?"
>
> So, yes, the zpool gets created by the client VM. Then on
> the hypervisor host, the script imports that zpool and renames it,
> so that I can have different pool names for all the client VMs.
> This step now fails:
>
> + zpool import -R /virt/base -d /dev/zvol/zdata sys base
> cannot import 'sys' as 'base': no such pool or dataset
>         Destroy and re-create the pool from
>         a backup source.
>
> I import the clients' zpools after the zpools on them has
> been renamed, so the hypervisor host can manipulate the
> files directly. It only disturbs a small amount of the
> disk blocks on each of the snapshots of the zvol to rename
> the zpools.
>
> In this way, I can instantiate ~30 virtual machines from
> a custom install.iso image in less than 3 minutes. And
> the bulk of that time is doing the installation from the
> custom install.iso into the first virtual machine. The
> cloning of the zvols, and manipulation of the resulting
> filesystems is very fast.
>

Can't you just set volmode=dev and use zfs clone?

> -Kurt
>
> _______________________________________________
> svn-src-head@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/svn-src-head
> To unsubscribe, send any mail to "svn-src-head-unsubscribe@freebsd.org"
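
The volmode=dev suggestion in the reply above could be sketched roughly as the sh script below. This is only an illustration under stated assumptions: the dataset names (zdata/template, zdata/vmN) and the snapshot name are hypothetical and do not come from the thread, and the helpers run and clone_vms are made up for this sketch. It snapshots a template zvol once, then creates clones with volmode=dev so that each clone is exposed only as a character device rather than a GEOM provider. It does not cover the host-side pool import and rename that Kurt describes. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: dataset and snapshot names are hypothetical.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# clone_vms TEMPLATE SNAPNAME COUNT
# Snapshot the template zvol once, then create COUNT clones next to it,
# each with volmode=dev set at creation time.
clone_vms() {
    template="$1"
    snap="$2"
    count="$3"
    parent="${template%/*}"   # parent dataset of the template zvol

    run zfs snapshot "${template}@${snap}"

    i=1
    while [ "$i" -le "$count" ]; do
        run zfs clone -o volmode=dev "${template}@${snap}" "${parent}/vm${i}"
        i=$((i + 1))
    done
}

# Example: plan three clones of a hypothetical template volume.
clone_vms zdata/template golden 3
```

Setting the property with -o at clone time avoids a window in which the new volume briefly appears with the default volmode.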