Date:      Thu, 22 Feb 2018 02:21:03 +0000 (UTC)
From:      Alexander Motin <mav@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-vendor@freebsd.org
Subject:   svn commit: r329793 - in vendor/illumos/dist: cmd/zpool lib/libzfs/common lib/libzpool/common lib/libzpool/common/sys
Message-ID:  <201802220221.w1M2L3dj065445@repo.freebsd.org>

Author: mav
Date: Thu Feb 22 02:21:03 2018
New Revision: 329793
URL: https://svnweb.freebsd.org/changeset/base/329793

Log:
  9075 Improve ZFS pool import/load process and corrupted pool recovery
  
  illumos/illumos-gate@6f7938128a2c5e23f4b970ea101137eadd1470a1
  
  Some work has been done lately to improve the debuggability of the ZFS pool
  load (and import) process. This includes:
  
  https://www.illumos.org/issues/7638: Refactor spa_load_impl into several functions
  https://www.illumos.org/issues/8961: SPA load/import should tell us why it failed
  https://www.illumos.org/issues/7277: zdb should be able to print zfs_dbgmsg's
  
  Building on that work, a few changes were made to make the import process
  more resilient and crash-free. One of the first tasks during the
  pool load process is to parse a config provided from userland that describes
  what devices the pool is composed of. A vdev tree is generated from that config,
  and then all the vdevs are opened.
  
  The Meta Object Set (MOS) of the pool is accessed, and several metadata objects
  that are necessary to load the pool are read. The exact configuration of the
  pool is also stored inside the MOS. Since the configuration provided from
  userland is external and might not accurately describe the vdev tree
  of the pool at the txg that is being loaded, it cannot be relied upon to safely
  operate the pool. For that reason, the configuration in the MOS is read early
  on. In the past, the two configurations were compared and, if they did not
  match, the load process was aborted and an error was returned.
  
  That check was a good way to ensure a pool does not get corrupted; however, it
  made the pool load process needlessly fragile in cases where the vdev
  configuration changed or the userland configuration was outdated. Since the MOS
  is stored in 3 copies, the configuration provided by userland doesn't have to be
  perfect in order to read its contents. Hence, a new approach has been adopted:
  The pool is first opened with the untrusted userland configuration just so that
  the real configuration can be read from the MOS. The trusted MOS configuration
  is then used to generate a new vdev tree and the pool is re-opened.
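  
  The two-step load described above can be sketched roughly as follows. This
  is an illustration only, not the code from this change: pool_config_t,
  pool_t, pool_open() and pool_load_two_step() are hypothetical names that do
  not exist in ZFS, and the MOS read is reduced to copying a struct field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vdev tree configuration; not a ZFS type. */
typedef struct pool_config {
	int top_level_vdevs;	/* number of top-level vdevs described */
	bool trusted;		/* true only for the config read from the MOS */
} pool_config_t;

/* Hypothetical stand-in for an open pool. */
typedef struct pool {
	pool_config_t active;	/* config the current vdev tree was built from */
	pool_config_t mos;	/* config stored inside the pool's own MOS */
	bool writable;		/* writes are allowed only with a trusted config */
	int opens;		/* how many times the vdev tree was opened */
} pool_t;

/* Build the vdev tree from cfg; writes stay disabled if cfg is untrusted. */
void
pool_open(pool_t *p, const pool_config_t *cfg)
{
	assert(p != NULL && cfg != NULL);
	p->active = *cfg;
	p->writable = cfg->trusted;
	p->opens++;
}

/* Assumed flow: open untrusted, read the MOS config, re-open trusted. */
void
pool_load_two_step(pool_t *p, const pool_config_t *userland)
{
	/* Step 1: the userland config is never trusted, whatever it claims. */
	pool_config_t untrusted = *userland;
	untrusted.trusted = false;
	pool_open(p, &untrusted);

	/* Step 2: the config stored in the MOS replaces the userland one. */
	pool_config_t trusted = p->mos;
	trusted.trusted = true;
	pool_open(p, &trusted);
}
```

  In this sketch a stale userland config that describes two top-level vdevs
  still leads to a pool opened with the three vdevs recorded in the MOS, and
  writes only become possible after the second open.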
  
  When the pool is opened with an untrusted configuration, writes are disabled
  to avoid accidentally damaging it. During reads, some sanity checks are
  performed on block pointers to see if each DVA points to a known vdev;
  when the configuration is untrusted, instead of panicking the system if those
  checks fail we simply avoid issuing reads to the invalid DVAs.
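  
  A minimal sketch of that DVA check, under stated assumptions:
  sketch_dva_t, sketch_blkptr_t and sketch_select_dvas() are hypothetical and
  much simpler than the real block pointer code, a DVA is treated as valid
  when its vdev id is below the number of known top-level vdevs, and the
  panic is modelled as a flag rather than an actual abort.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	SKETCH_DVAS_PER_BP	3

/* Hypothetical, simplified DVA: only the fields the check needs. */
typedef struct sketch_dva {
	uint64_t vdev_id;	/* top-level vdev the block lives on */
	uint64_t offset;	/* offset within that vdev (unused here) */
} sketch_dva_t;

/* Hypothetical, simplified block pointer holding up to three copies. */
typedef struct sketch_blkptr {
	sketch_dva_t dva[SKETCH_DVAS_PER_BP];
	int ndvas;		/* number of DVAs actually in use */
} sketch_blkptr_t;

/*
 * Decide which DVAs may be read. Returns the number of reads to issue and
 * marks them in issue[]. With a trusted config an unknown vdev is reported
 * through *fatal (standing in for the panic); with an untrusted config the
 * invalid DVA is silently skipped.
 */
int
sketch_select_dvas(const sketch_blkptr_t *bp, uint64_t vdev_count,
    bool config_trusted, bool issue[SKETCH_DVAS_PER_BP], bool *fatal)
{
	int n = 0;

	assert(bp->ndvas <= SKETCH_DVAS_PER_BP);
	*fatal = false;
	for (int i = 0; i < SKETCH_DVAS_PER_BP; i++) {
		issue[i] = false;
		if (i >= bp->ndvas)
			continue;
		if (bp->dva[i].vdev_id < vdev_count) {
			issue[i] = true;	/* DVA points to a known vdev */
			n++;
		} else if (config_trusted) {
			*fatal = true;		/* would panic in this mode */
		}
	}
	return (n);
}
```

  Because metadata such as the MOS is stored in several copies, skipping one
  invalid DVA in this way can still leave other copies available to read.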
  
  This new two-step pool load process now allows rewinding pools across
  vdev tree changes such as device replacement, addition, etc. Loading a pool
  from an external config file in a clustering environment also becomes much
  safer now since the pool will import even if the config is outdated and does
  not, for instance, reflect a recent device addition.
  
  With this code in place, it became relatively easy to implement a
  long-sought-after feature: the ability to import a pool with missing top-level
  (i.e. non-redundant) devices. Note that since this almost guarantees some loss
  of data, this feature is for now restricted to a read-only import.
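  
  The read-only restriction could be expressed as a small policy check along
  these lines. This is a sketch under assumptions, not the logic from this
  change: sketch_check_import() and its max_missing_allowed parameter are
  hypothetical, introduced only to show how a limit on missing devices and
  the read-only requirement might combine.

```c
#include <stdbool.h>

typedef enum sketch_import_result {
	SKETCH_IMPORT_OK,
	SKETCH_IMPORT_REFUSED
} sketch_import_result_t;

/*
 * Hypothetical policy: a complete pool always imports; a pool with missing
 * top-level devices imports only read-only, and only while the number of
 * missing devices stays within the (assumed) caller-supplied limit.
 */
sketch_import_result_t
sketch_check_import(int missing_top_level, int max_missing_allowed,
    bool readonly)
{
	if (missing_top_level == 0)
		return (SKETCH_IMPORT_OK);
	if (readonly && missing_top_level <= max_missing_allowed)
		return (SKETCH_IMPORT_OK);
	return (SKETCH_IMPORT_REFUSED);
}
```

  With this sketch, a read-write import of a pool that is missing one
  top-level device is refused regardless of the limit, while the same import
  requested read-only succeeds when the limit is at least one.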
  
  Reviewed by: George Wilson <george.wilson@delphix.com>
  Reviewed by: Matthew Ahrens <mahrens@delphix.com>
  Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
  Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
  Author: Pavel Zakharov <pavel.zakharov@delphix.com>

Modified:
  vendor/illumos/dist/cmd/zpool/zpool_main.c
  vendor/illumos/dist/lib/libzfs/common/libzfs.h
  vendor/illumos/dist/lib/libzfs/common/libzfs_import.c
  vendor/illumos/dist/lib/libzfs/common/libzfs_pool.c
  vendor/illumos/dist/lib/libzpool/common/kernel.c
  vendor/illumos/dist/lib/libzpool/common/sys/zfs_context.h

Modified: vendor/illumos/dist/cmd/zpool/zpool_main.c
==============================================================================
--- vendor/illumos/dist/cmd/zpool/zpool_main.c	Thu Feb 22 02:16:44 2018	(r329792)
+++ vendor/illumos/dist/cmd/zpool/zpool_main.c	Thu Feb 22 02:21:03 2018	(r329793)
@@ -1562,6 +1562,10 @@ print_status_config(zpool_handle_t *zhp, const char *n
 			(void) printf(gettext("split into new pool"));
 			break;
 
+		case VDEV_AUX_CHILDREN_OFFLINE:
+			(void) printf(gettext("all children offline"));
+			break;
+
 		default:
 			(void) printf(gettext("corrupted data"));
 			break;
@@ -1649,6 +1653,10 @@ print_import_config(const char *name, nvlist_t *nv, in
 			(void) printf(gettext("too many errors"));
 			break;
 
+		case VDEV_AUX_CHILDREN_OFFLINE:
+			(void) printf(gettext("all children offline"));
+			break;
+
 		default:
 			(void) printf(gettext("corrupted data"));
 			break;
@@ -2296,6 +2304,7 @@ zpool_do_import(int argc, char **argv)
 	idata.poolname = searchname;
 	idata.guid = searchguid;
 	idata.cachefile = cachefile;
+	idata.policy = policy;
 
 	pools = zpool_search_import(g_zfs, &idata);
 

Modified: vendor/illumos/dist/lib/libzfs/common/libzfs.h
==============================================================================
--- vendor/illumos/dist/lib/libzfs/common/libzfs.h	Thu Feb 22 02:16:44 2018	(r329792)
+++ vendor/illumos/dist/lib/libzfs/common/libzfs.h	Thu Feb 22 02:21:03 2018	(r329793)
@@ -388,6 +388,7 @@ typedef struct importargs {
 	int can_be_active : 1;	/* can the pool be active?		*/
 	int unique : 1;		/* does 'poolname' already exist?	*/
 	int exists : 1;		/* set on return if pool already exists	*/
+	nvlist_t *policy;	/* rewind policy (rewind txg, etc.)	*/
 } importargs_t;
 
 extern nvlist_t *zpool_search_import(libzfs_handle_t *, importargs_t *);

Modified: vendor/illumos/dist/lib/libzfs/common/libzfs_import.c
==============================================================================
--- vendor/illumos/dist/lib/libzfs/common/libzfs_import.c	Thu Feb 22 02:16:44 2018	(r329792)
+++ vendor/illumos/dist/lib/libzfs/common/libzfs_import.c	Thu Feb 22 02:21:03 2018	(r329793)
@@ -412,7 +412,8 @@ vdev_is_hole(uint64_t *hole_array, uint_t holes, uint_
  * return to the user.
  */
 static nvlist_t *
-get_configs(libzfs_handle_t *hdl, pool_list_t *pl, boolean_t active_ok)
+get_configs(libzfs_handle_t *hdl, pool_list_t *pl, boolean_t active_ok,
+    nvlist_t *policy)
 {
 	pool_entry_t *pe;
 	vdev_entry_t *ve;
@@ -746,6 +747,12 @@ get_configs(libzfs_handle_t *hdl, pool_list_t *pl, boo
 			continue;
 		}
 
+		if (policy != NULL) {
+			if (nvlist_add_nvlist(config, ZPOOL_REWIND_POLICY,
+			    policy) != 0)
+				goto nomem;
+		}
+
 		if ((nvl = refresh_config(hdl, config)) == NULL) {
 			nvlist_free(config);
 			config = NULL;
@@ -1251,7 +1258,7 @@ zpool_find_import_impl(libzfs_handle_t *hdl, importarg
 			goto error;
 	}
 
-	ret = get_configs(hdl, &pools, iarg->can_be_active);
+	ret = get_configs(hdl, &pools, iarg->can_be_active, iarg->policy);
 
 error:
 	for (pe = pools.pools; pe != NULL; pe = penext) {
@@ -1380,6 +1387,14 @@ zpool_find_import_cached(libzfs_handle_t *hdl, const c
 
 		if (active)
 			continue;
+
+		if (nvlist_add_string(src, ZPOOL_CONFIG_CACHEFILE,
+		    cachefile) != 0) {
+			(void) no_memory(hdl);
+			nvlist_free(raw);
+			nvlist_free(pools);
+			return (NULL);
+		}
 
 		if ((dst = refresh_config(hdl, src)) == NULL) {
 			nvlist_free(raw);

Modified: vendor/illumos/dist/lib/libzfs/common/libzfs_pool.c
==============================================================================
--- vendor/illumos/dist/lib/libzfs/common/libzfs_pool.c	Thu Feb 22 02:16:44 2018	(r329792)
+++ vendor/illumos/dist/lib/libzfs/common/libzfs_pool.c	Thu Feb 22 02:21:03 2018	(r329793)
@@ -1808,8 +1808,9 @@ zpool_import_props(libzfs_handle_t *hdl, nvlist_t *con
 			    nvlist_lookup_nvlist(nvinfo,
 			    ZPOOL_CONFIG_MISSING_DEVICES, &missing) == 0) {
 				(void) printf(dgettext(TEXT_DOMAIN,
-				    "The devices below are missing, use "
-				    "'-m' to import the pool anyway:\n"));
+				    "The devices below are missing or "
+				    "corrupted, use '-m' to import the pool "
+				    "anyway:\n"));
 				print_vdev_tree(hdl, NULL, missing, 2);
 				(void) printf("\n");
 			}

Modified: vendor/illumos/dist/lib/libzpool/common/kernel.c
==============================================================================
--- vendor/illumos/dist/lib/libzpool/common/kernel.c	Thu Feb 22 02:16:44 2018	(r329792)
+++ vendor/illumos/dist/lib/libzpool/common/kernel.c	Thu Feb 22 02:21:03 2018	(r329793)
@@ -461,6 +461,16 @@ kernel_fini(void)
 	system_taskq_fini();
 }
 
+/* ARGSUSED */
+uint32_t
+zone_get_hostid(void *zonep)
+{
+	/*
+	 * We're emulating the system's hostid in userland.
+	 */
+	return (strtoul(hw_serial, NULL, 10));
+}
+
 int
 z_uncompress(void *dst, size_t *dstlen, const void *src, size_t srclen)
 {

Modified: vendor/illumos/dist/lib/libzpool/common/sys/zfs_context.h
==============================================================================
--- vendor/illumos/dist/lib/libzpool/common/sys/zfs_context.h	Thu Feb 22 02:16:44 2018	(r329792)
+++ vendor/illumos/dist/lib/libzpool/common/sys/zfs_context.h	Thu Feb 22 02:21:03 2018	(r329793)
@@ -317,6 +317,7 @@ typedef struct callb_cpr {
 
 #define	zone_dataset_visible(x, y)	(1)
 #define	INGLOBALZONE(z)			(1)
+extern uint32_t zone_get_hostid(void *zonep);
 
 extern int zfs_secpolicy_snapshot_perms(const char *name, cred_t *cr);
 extern int zfs_secpolicy_rename_perms(const char *from, const char *to,


