Message-Id: <201708241948.v7OJmgNR053466@repo.freebsd.org>
From: Alan Somers
Date: Thu, 24 Aug 2017 19:48:42 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r322854 - in head/cddl: contrib/opensolaris/lib/libzfs/common usr.sbin/zfsd
X-SVN-Group: head
X-SVN-Commit-Author: asomers
X-SVN-Commit-Paths: in head/cddl: contrib/opensolaris/lib/libzfs/common usr.sbin/zfsd
X-SVN-Commit-Revision: 322854
X-SVN-Commit-Repository: base
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: asomers
Date: Thu Aug 24 19:48:41 2017
New Revision: 322854
URL: https://svnweb.freebsd.org/changeset/base/322854

Log:
  zfsd(8): Close a race condition when onlining a disk partition

  When inserting a partitioned disk, devfs and geom will announce the whole
  disk before they announce the partition.  If the partition containing ZFS
  extends to one of the disk's extents, then zfsd will see a ZFS label on the
  whole disk and attempt to online it.  ZFS is smart enough to activate the
  partition instead of the whole disk, but only if GEOM has already created
  the partition's provider.

  cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h
  cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c
  	Add a zpool_read_all_labels method.  It's similar to
  	zpool_read_label, but it will return the number of labels found.

  cddl/usr.sbin/zfsd/zfsd_event.cc
  	When processing a DevFS CREATE event, only online a VDEV if we can
  	read all four ZFS labels.

  Reviewed by:	mav
  MFC after:	3 weeks
  Sponsored by:	Spectra Logic Corp
  Differential Revision:	https://reviews.freebsd.org/D11920

Modified:
  head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h
  head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c
  head/cddl/usr.sbin/zfsd/zfsd_event.cc

Modified: head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h
==============================================================================
--- head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h	Thu Aug 24 19:16:25 2017	(r322853)
+++ head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h	Thu Aug 24 19:48:41 2017	(r322854)
@@ -772,6 +772,7 @@ extern int zpool_in_use(libzfs_handle_t *, int, pool_s
  * Label manipulation.
  */
 extern int zpool_read_label(int, nvlist_t **);
+extern int zpool_read_all_labels(int, nvlist_t **);
 extern int zpool_clear_label(int);
 
 /* is this zvol valid for use as a dump device? */

Modified: head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c
==============================================================================
--- head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c	Thu Aug 24 19:16:25 2017	(r322853)
+++ head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c	Thu Aug 24 19:48:41 2017	(r322854)
@@ -914,6 +914,65 @@ zpool_read_label(int fd, nvlist_t **config)
 	return (0);
 }
 
+/*
+ * Given a file descriptor, read the label information and return an nvlist
+ * describing the configuration, if there is one.
+ * returns the number of valid labels found
+ */
+int
+zpool_read_all_labels(int fd, nvlist_t **config)
+{
+	struct stat64 statbuf;
+	int l;
+	vdev_label_t *label;
+	uint64_t state, txg, size;
+	int nlabels = 0;
+
+	*config = NULL;
+
+	if (fstat64(fd, &statbuf) == -1)
+		return (0);
+	size = P2ALIGN_TYPED(statbuf.st_size, sizeof (vdev_label_t), uint64_t);
+
+	if ((label = malloc(sizeof (vdev_label_t))) == NULL)
+		return (0);
+
+	for (l = 0; l < VDEV_LABELS; l++) {
+		nvlist_t *temp = NULL;
+
+		/* TODO: use aio_read so we can read al 4 labels in parallel */
+		if (pread64(fd, label, sizeof (vdev_label_t),
+		    label_offset(size, l)) != sizeof (vdev_label_t))
+			continue;
+
+		if (nvlist_unpack(label->vl_vdev_phys.vp_nvlist,
+		    sizeof (label->vl_vdev_phys.vp_nvlist), &temp, 0) != 0)
+			continue;
+
+		if (nvlist_lookup_uint64(temp, ZPOOL_CONFIG_POOL_STATE,
+		    &state) != 0 || state > POOL_STATE_L2CACHE) {
+			nvlist_free(temp);
+			temp = NULL;
+			continue;
+		}
+
+		if (state != POOL_STATE_SPARE && state != POOL_STATE_L2CACHE &&
+		    (nvlist_lookup_uint64(temp, ZPOOL_CONFIG_POOL_TXG,
+		    &txg) != 0 || txg == 0)) {
+			nvlist_free(temp);
+			temp = NULL;
+			continue;
+		}
+		if (temp)
+			*config = temp;
+
+		nlabels++;
+	}
+
+	free(label);
+	return (nlabels);
+}
+
 typedef struct rdsk_node {
 	char *rn_name;
 	int rn_dfd;

Modified: head/cddl/usr.sbin/zfsd/zfsd_event.cc
==============================================================================
--- head/cddl/usr.sbin/zfsd/zfsd_event.cc	Thu Aug 24 19:16:25 2017	(r322853)
+++ head/cddl/usr.sbin/zfsd/zfsd_event.cc	Thu Aug 24 19:48:41 2017	(r322854)
@@ -36,6 +36,7 @@
 #include
 #include
 #include
+#include
 #include
 
@@ -93,6 +94,7 @@ DevfsEvent::ReadLabel(int devFd, bool &inUse, bool &de
 	pool_state_t poolState;
 	char *poolName;
 	boolean_t b_inuse;
+	int nlabels;
 
 	inUse = false;
 	degraded = false;
@@ -105,8 +107,16 @@ DevfsEvent::ReadLabel(int devFd, bool &inUse, bool &de
 		if (poolName != NULL)
 			free(poolName);
 
-		if (zpool_read_label(devFd, &devLabel) != 0
-		    || devLabel == NULL)
+		nlabels = zpool_read_all_labels(devFd, &devLabel);
+		/*
+		 * If we find a disk with fewer than the maximum number of
+		 * labels, it might be the whole disk of a partitioned disk
+		 * where ZFS resides on a partition.  In that case, we should do
+		 * nothing and wait for the partition to appear.  Or, the disk
+		 * might be damaged.  In that case, zfsd should do nothing and
+		 * wait for the sysadmin to decide.
+		 */
+		if (nlabels != VDEV_LABELS || devLabel == NULL)
 			return (NULL);
 
 		try {
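
A rough sketch of why counting labels separates a whole disk from a partition that reaches the disk's end. It is not part of the commit. It assumes the usual ZFS on-disk layout of four 256 KiB labels, two at the front of the device and two at the back, and re-creates the offset arithmetic locally. LABEL_SIZE, NUM_LABELS, align_down and visible_labels are hypothetical names for this illustration, and this label_offset is a stand-in for the libzfs helper, whose definition the diff does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: sizeof (vdev_label_t) and VDEV_LABELS in ZFS. */
#define LABEL_SIZE	(256ULL * 1024)
#define NUM_LABELS	4

/* Round a device size down to a multiple of the label size. */
static uint64_t
align_down(uint64_t size)
{
	return (size - size % LABEL_SIZE);
}

/* Offset of label l: labels 0-1 at the front, labels 2-3 at the back. */
static uint64_t
label_offset(uint64_t size, int l)
{
	return ((uint64_t)l * LABEL_SIZE +
	    (l < NUM_LABELS / 2 ? 0 : size - NUM_LABELS * LABEL_SIZE));
}

/*
 * Count how many of a partition's labels sit exactly where a reader of
 * the whole disk would look for labels.  All arguments are in bytes.
 */
static int
visible_labels(uint64_t disk_size, uint64_t part_start, uint64_t part_size)
{
	uint64_t dsz = align_down(disk_size);
	uint64_t psz = align_down(part_size);
	int d, p, n = 0;

	/* A device smaller than four labels cannot hold the layout. */
	assert(psz >= NUM_LABELS * LABEL_SIZE);
	assert(dsz >= NUM_LABELS * LABEL_SIZE);

	for (d = 0; d < NUM_LABELS; d++)
		for (p = 0; p < NUM_LABELS; p++)
			if (label_offset(dsz, d) ==
			    part_start + label_offset(psz, p))
				n++;
	return (n);
}
```

Under these assumptions, a 1 GiB disk whose partition starts at 1 MiB and runs to the end of the disk gives visible_labels(1 GiB, 1 MiB, 1 GiB - 1 MiB) == 2: only the two trailing labels line up, so the whole-disk device fails the nlabels != VDEV_LABELS check and zfsd waits for the partition's provider. A device read at its own boundaries (part_start of 0 and equal sizes) yields 4.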