From owner-svn-src-all@freebsd.org Tue Jan 26 17:00:32 2016 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BE41CA6EFA4; Tue, 26 Jan 2016 17:00:32 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7FC3FA4D; Tue, 26 Jan 2016 17:00:32 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id u0QH0Vgn089477; Tue, 26 Jan 2016 17:00:31 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id u0QH0V0Z089475; Tue, 26 Jan 2016 17:00:31 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201601261700.u0QH0V0Z089475@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Tue, 26 Jan 2016 17:00:31 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org Subject: svn commit: r294843 - stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-10 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Jan 2016 17:00:32 -0000 Author: asomers Date: Tue Jan 26 17:00:31 2016 New Revision: 294843 URL: https://svnweb.freebsd.org/changeset/base/294843 Log: MFC r292066, r292069, r293708, r294027, and r294358, mostly to vdev_geom.c r292066 | asomers | 2015-12-10 14:46:21 -0700 (Thu, 10 Dec 2015) | 25 lines During vdev_geom_open, require that the vdev guids match the device's label except during split, add, or create operations. This fixes a bug where the wrong disk could be returned, and higher layers of ZFS would immediately eject it again. sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c: o When opening by GUID, require both the pool and vdev GUIDs to match. While it is highly unlikely for two vdevs to have the same vdev GUIDs, the ZFS storage pool allocator only guarantees they are unique within a pool. o Modify the open behavior to: - If we are opening a vdev that hasn't previously been opened, open by path without checking GUIDs. - Otherwise, open by path and verify GUIDs. - If that fails, search all geom providers for a device with matching GUIDs. - If that fails, return ENOENT. r292069 | asomers | 2015-12-10 17:04:13 -0700 (Thu, 10 Dec 2015) | 6 lines Change an important error message from ZFS_LOG to printf r293708 | asomers | 2016-01-11 15:15:46 -0700 (Mon, 11 Jan 2016) | 16 lines Fix importing l2arc device by guid After r292066, vdev_geom verifies both the vdev and pool guids of device labels during open. However, spare and l2arc devices don't have pool guids, so opening them by guid will fail (opening by path, when the pathname is known, still succeeds). This change allows a vdev to be opened by guid if the label contains no pool_guid, which is the case for inactive spares and l2arc devices. r294027 | asomers | 2016-01-14 11:19:05 -0700 (Thu, 14 Jan 2016) | 14 lines Fix race condition involving ZFS remove events When a ZFS drive disappears, ZFS sends a resource.fs.zfs.removed event to userland. A userland program like zfsd(8) can use that event, for example to activate a hotspare. The current code contains a race condition: vdev_geom will sent the sysevent _before_ spa.c would update the vdev's status, causing userland processes to see pool state that does not reflect the device removal. This change moves the sysevent to spa.c, closing the race. r294358 | asomers | 2016-01-19 16:16:24 -0700 (Tue, 19 Jan 2016) | 10 lines Quell harmless CID about unchecked return value in nvlist_get_guids. The return value doesn't need to be checked, because nvlist_get_guid's callers check the returned values of the guids. Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c Directory Properties: stable/10/ (props changed) Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Tue Jan 26 16:50:59 2016 (r294842) +++ stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Tue Jan 26 17:00:31 2016 (r294843) @@ -5951,6 +5951,8 @@ spa_async_remove(spa_t *spa, vdev_t *vd) vd->vdev_stat.vs_checksum_errors = 0; vdev_state_dirty(vd->vdev_top); + /* Tell userspace that the vdev is gone. */ + zfs_post_remove(spa, vd); } for (int c = 0; c < vd->vdev_children; c++) Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c ============================================================================== --- stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c Tue Jan 26 16:50:59 2016 (r294842) +++ stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c Tue Jan 26 17:00:31 2016 (r294843) @@ -116,7 +116,6 @@ vdev_geom_orphan(struct g_consumer *cp) * async removal support to invoke a close on this * vdev once it is safe to do so. */ - zfs_post_remove(vd->vdev_spa, vd); vd->vdev_remove_wanted = B_TRUE; spa_async_request(vd->vdev_spa, SPA_ASYNC_REMOVE); } @@ -208,14 +207,12 @@ vdev_geom_detach(void *arg, int flag __u } } -static uint64_t -nvlist_get_guid(nvlist_t *list) +static void +nvlist_get_guids(nvlist_t *list, uint64_t *pguid, uint64_t *vguid) { - uint64_t value; - value = 0; - nvlist_lookup_uint64(list, ZPOOL_CONFIG_GUID, &value); - return (value); + (void) nvlist_lookup_uint64(list, ZPOOL_CONFIG_GUID, vguid); + (void) nvlist_lookup_uint64(list, ZPOOL_CONFIG_POOL_GUID, pguid); } static int @@ -270,7 +267,7 @@ vdev_geom_read_config(struct g_consumer size_t buflen; uint64_t psize; off_t offset, size; - uint64_t guid, state, txg; + uint64_t state, txg; int error, l, len; g_topology_assert_not(); @@ -284,7 +281,6 @@ vdev_geom_read_config(struct g_consumer size = sizeof(*label) + pp->sectorsize - ((sizeof(*label) - 1) % pp->sectorsize) - 1; - guid = 0; label = kmem_alloc(size, KM_SLEEP); buflen = sizeof(label->vl_vdev_phys.vp_nvlist); @@ -479,30 +475,29 @@ vdev_geom_read_pool_label(const char *na return (*count > 0 ? 0 : ENOENT); } -static uint64_t -vdev_geom_read_guid(struct g_consumer *cp) +static void +vdev_geom_read_guids(struct g_consumer *cp, uint64_t *pguid, uint64_t *vguid) { nvlist_t *config; - uint64_t guid; g_topology_assert_not(); - guid = 0; + *pguid = 0; + *vguid = 0; if (vdev_geom_read_config(cp, &config) == 0) { - guid = nvlist_get_guid(config); + nvlist_get_guids(config, pguid, vguid); nvlist_free(config); } - return (guid); } static struct g_consumer * -vdev_geom_attach_by_guid(uint64_t guid) +vdev_geom_attach_by_guids(uint64_t pool_guid, uint64_t vdev_guid) { struct g_class *mp; struct g_geom *gp, *zgp; struct g_provider *pp; struct g_consumer *cp, *zcp; - uint64_t pguid; + uint64_t pguid, vguid; g_topology_assert(); @@ -522,15 +517,24 @@ vdev_geom_attach_by_guid(uint64_t guid) if (vdev_geom_attach_taster(zcp, pp) != 0) continue; g_topology_unlock(); - pguid = vdev_geom_read_guid(zcp); + vdev_geom_read_guids(zcp, &pguid, &vguid); g_topology_lock(); vdev_geom_detach_taster(zcp); - if (pguid != guid) + /* + * Check that the label's vdev guid matches the + * desired guid. If the label has a pool guid, + * check that it matches too. (Inactive spares + * and L2ARCs do not have any pool guid in the + * label.) + */ + if ((pguid != 0 && + pguid != pool_guid) || + vguid != vdev_guid) continue; cp = vdev_geom_attach(pp); if (cp == NULL) { - printf("ZFS WARNING: Unable to attach to %s.\n", - pp->name); + printf("ZFS WARNING: Unable to " + "attach to %s.\n", pp->name); continue; } break; @@ -548,7 +552,7 @@ end: } static struct g_consumer * -vdev_geom_open_by_guid(vdev_t *vd) +vdev_geom_open_by_guids(vdev_t *vd) { struct g_consumer *cp; char *buf; @@ -557,7 +561,7 @@ vdev_geom_open_by_guid(vdev_t *vd) g_topology_assert(); ZFS_LOG(1, "Searching by guid [%ju].", (uintmax_t)vd->vdev_guid); - cp = vdev_geom_attach_by_guid(vd->vdev_guid); + cp = vdev_geom_attach_by_guids(spa_guid(vd->vdev_spa), vd->vdev_guid); if (cp != NULL) { len = strlen(cp->provider->name) + strlen("/dev/") + 1; buf = kmem_alloc(len, KM_SLEEP); @@ -566,10 +570,12 @@ vdev_geom_open_by_guid(vdev_t *vd) spa_strfree(vd->vdev_path); vd->vdev_path = buf; - ZFS_LOG(1, "Attach by guid [%ju] succeeded, provider %s.", + ZFS_LOG(1, "Attach by guid [%ju:%ju] succeeded, provider %s.", + (uintmax_t)spa_guid(vd->vdev_spa), (uintmax_t)vd->vdev_guid, vd->vdev_path); } else { - ZFS_LOG(1, "Search by guid [%ju] failed.", + ZFS_LOG(1, "Search by guid [%ju:%ju] failed.", + (uintmax_t)spa_guid(vd->vdev_spa), (uintmax_t)vd->vdev_guid); } @@ -581,7 +587,7 @@ vdev_geom_open_by_path(vdev_t *vd, int c { struct g_provider *pp; struct g_consumer *cp; - uint64_t guid; + uint64_t pguid, vguid; g_topology_assert(); @@ -593,14 +599,17 @@ vdev_geom_open_by_path(vdev_t *vd, int c if (cp != NULL && check_guid && ISP2(pp->sectorsize) && pp->sectorsize <= VDEV_PAD_SIZE) { g_topology_unlock(); - guid = vdev_geom_read_guid(cp); + vdev_geom_read_guids(cp, &pguid, &vguid); g_topology_lock(); - if (guid != vd->vdev_guid) { + if (pguid != spa_guid(vd->vdev_spa) || + vguid != vd->vdev_guid) { vdev_geom_detach(cp, 0); cp = NULL; ZFS_LOG(1, "guid mismatch for provider %s: " - "%ju != %ju.", vd->vdev_path, - (uintmax_t)vd->vdev_guid, (uintmax_t)guid); + "%ju:%ju != %ju:%ju.", vd->vdev_path, + (uintmax_t)spa_guid(vd->vdev_spa), + (uintmax_t)vd->vdev_guid, + (uintmax_t)pguid, (uintmax_t)vguid); } else { ZFS_LOG(1, "guid match for provider %s.", vd->vdev_path); @@ -634,23 +643,38 @@ vdev_geom_open(vdev_t *vd, uint64_t *psi g_topology_lock(); error = 0; - /* - * If we're creating or splitting a pool, just find the GEOM provider - * by its name and ignore GUID mismatches. - */ - if (vd->vdev_spa->spa_load_state == SPA_LOAD_NONE || - vd->vdev_spa->spa_splitting_newspa == B_TRUE) + if (vd->vdev_spa->spa_splitting_newspa || + (vd->vdev_prevstate == VDEV_STATE_UNKNOWN && + vd->vdev_spa->spa_load_state == SPA_LOAD_NONE)) { + /* + * We are dealing with a vdev that hasn't been previously + * opened (since boot), and we are not loading an + * existing pool configuration. This looks like a + * vdev add operation to a new or existing pool. + * Assume the user knows what he/she is doing and find + * GEOM provider by its name, ignoring GUID mismatches. + * + * XXPOLICY: It would be safer to only allow a device + * that is unlabeled or labeled but missing + * GUID information to be opened in this fashion, + * unless we are doing a split, in which case we + * should allow any guid. + */ cp = vdev_geom_open_by_path(vd, 0); - else { + } else { + /* + * Try using the recorded path for this device, but only + * accept it if its label data contains the expected GUIDs. + */ cp = vdev_geom_open_by_path(vd, 1); if (cp == NULL) { /* * The device at vd->vdev_path doesn't have the - * expected guid. The disks might have merely + * expected GUIDs. The disks might have merely * moved around so try all other GEOM providers - * to find one with the right guid. + * to find one with the right GUIDs. */ - cp = vdev_geom_open_by_guid(vd); + cp = vdev_geom_open_by_guids(vd); } }