From owner-svn-src-all@FreeBSD.ORG Thu Jan 1 20:21:02 2015 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 10E0B57F for ; Thu, 1 Jan 2015 20:21:02 +0000 (UTC) Received: from mail-we0-f181.google.com (mail-we0-f181.google.com [74.125.82.181]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8C59B66560 for ; Thu, 1 Jan 2015 20:21:01 +0000 (UTC) Received: by mail-we0-f181.google.com with SMTP id q58so3757029wes.40 for ; Thu, 01 Jan 2015 12:20:59 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :subject:references:in-reply-to:content-type; bh=VG+wSgJGHTTt/iB29GOL1pk0Hsp8+1ugLxXst/aY3Xs=; b=l+e+OP0zKp4d4/JxL8pPgpO+2IB9ZeOs7ssAq8zZ/Z4Q6WNu+CRdpCzbcmEVLVsXh4 D5zbeYCXfjBwQAfK5SgjePQ3DAwzeTXRW3aFysHSt5ahCkH4UbgsBlZs8Qs7QHRUPwzP uXXrOMtL8qMGzcLd3mM/AOaZZooaB3YXTb+ThDDS9asc5CqTfVONyNqYpeVSiONvIlBN XW86ydpGutt4wCJaaKqTVCAhYXdcga6Jyp9eU4G4Vc4QyR7TvvACgseP+Y5Yp8kxmkfN ydI1Dv8NmyI5su+dkeJkIzwM2uQs6NS2RfjA9NIjnx4LaIcwcTNTTRJpD1F6jfpGxD0s IPtA== X-Gm-Message-State: ALoCoQmJVRdLbp5Se02Ruxjk5z/YrwydwjFDHGY/Vb7UFNlXttuu4n5gdNrpp/YgcXdBbx6C4sHE X-Received: by 10.194.189.138 with SMTP id gi10mr147062767wjc.86.1420143659458; Thu, 01 Jan 2015 12:20:59 -0800 (PST) Received: from [10.10.1.68] (82-69-141-170.dsl.in-addr.zen.co.uk. [82.69.141.170]) by mx.google.com with ESMTPSA id nj9sm51787558wic.10.2015.01.01.12.20.58 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 01 Jan 2015 12:20:58 -0800 (PST) Message-ID: <54A5AC21.5070802@multiplay.co.uk> Date: Thu, 01 Jan 2015 20:20:49 +0000 From: Steven Hartland User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: Xin Li , d@delphij.net, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r276123 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs References: <201412230931.sBN9VPMK017968@svn.freebsd.org> <54A35B88.9090102@delphij.net> <54A39153.8040905@freebsd.org> <54A3ACEF.70905@delphij.net> In-Reply-To: <54A3ACEF.70905@delphij.net> Content-Type: multipart/mixed; boundary="------------070607090801080704000405" X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 01 Jan 2015 20:21:02 -0000 This is a multi-part message in MIME format. --------------070607090801080704000405 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit On 31/12/2014 07:59, Xin Li wrote: > Hi, Steven, > > On 12/30/14 22:01, Steven Hartland wrote: >> On 31/12/2014 02:12, Xin Li wrote: >>> On 12/23/14 01:31, Steven Hartland wrote: >>>> Author: smh >>>> Date: Tue Dec 23 09:31:24 2014 >>>> New Revision: 276123 >>>> URL: https://svnweb.freebsd.org/changeset/base/276123 >>>> >>>> Log: >>>> Always sync the global ZFS config cache to reflect the new mosconfig >>>> This fixes out of date zpool.cache for root pools, which can >>>> cause issues >>>> such as confusion of zdb etc. >>>> MFC after: 1 month >>>> >>>> Modified: >>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>> >>>> Modified: >>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>> ============================================================================== >>>> >>>> --- >>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>> Tue Dec 23 08:51:30 2014 (r276122) >>>> +++ >>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>> Tue Dec 23 09:31:24 2014 (r276123) >>>> @@ -536,8 +536,7 @@ spa_config_update(spa_t *spa, int what) >>>> /* >>>> * Update the global config cache to reflect the new mosconfig. >>>> */ >>>> - if (!spa->spa_is_root) >>>> - spa_config_sync(spa, B_FALSE, what != SPA_CONFIG_UPDATE_POOL); >>>> + spa_config_sync(spa, B_FALSE, what != SPA_CONFIG_UPDATE_POOL); >>>> if (what == SPA_CONFIG_UPDATE_POOL) >>>> spa_config_update(spa, SPA_CONFIG_UPDATE_VDEVS); >>> It seems like that this change breaks systems where not all pools are >>> available (e.g. some of pools are encrypted) at boot time, by removing >>> all these pools from the cache file. >>> >>> As a result, on the next boot, these pools would not be imported even >>> when the devices are available (geli attached), and reverting this >>> change would restore the system to its previous behavior. >>> >>> Perhaps it have exposed an existing bug? >> I've managed to reproduce this here with mdX backed test pools, and >> looking into it this is due to spa_config_sync: >> >> /* >> * Skip over our own pool if we're about to remove >> * ourselves from the spa namespace or any pool >> that >> * is readonly. Since we cannot guarantee that a >> * readonly pool would successfully import upon >> reboot, >> * we don't allow them to be written to the >> cache file. >> */ >> if ((spa == target && removing) || >> !spa_writeable(spa)) { >> continue; >> } >> >> This was added by upstream: >> https://github.com/illumos/illumos-gate/commit/fb02ae025247e3b662600e5a9c1b4c33ecab7d72 >> >> https://www.illumos.org/issues/3639 >> >> It seems the desired behavior was to exclude read only pools, to prevent >> them being mounted writable on reboot, but the check is too wide and >> also excludes unavailable pools too. >> >> The attached patch corrects the test to only check writable on active >> pools so I believe should fix the issue your seeing. >> >> If you can test it and lmk that would be great. > This is an improvement (the attaching of pool is now working) but the > cached configuration is somewhat corrupted, it shows many lines of > vdev_stats[] contents. > > I'll take a look at the code tomorrow and see if I can do some > instruments and possibly figure out the underlying issue. > > Cheers, Looks like the spa_config_sync makes the assumption that the last call to spa_config_generate was done with getstats = 0. This is now an invalid assumption so we need to ensure that stats are removed from the config nvl before persisting to disk. The attached patch, which supersedes the last one I sent, fixes that in my tests here. Let me know if it fixes your test case too? Regards Steve --------------070607090801080704000405 Content-Type: text/plain; charset=windows-1252; name="zpool-cache.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="zpool-cache.patch" Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c =================================================================== --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c (revision 276123) +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c (working copy) @@ -140,6 +140,26 @@ out: kobj_close_file(file); } +static void +spa_config_clean(nvlist_t *nvl) +{ + nvlist_t **child; + nvlist_t *nvroot = NULL; + uint_t c, children; + + if (nvlist_lookup_nvlist_array(nvl, ZPOOL_CONFIG_CHILDREN, &child, + &children) == 0) { + for (c = 0; c < children; c++) + spa_config_clean(child[c]); + } + + if (nvlist_lookup_nvlist(nvl, ZPOOL_CONFIG_VDEV_TREE, &nvroot) == 0) + spa_config_clean(nvroot); + + nvlist_remove(nvl, ZPOOL_CONFIG_VDEV_STATS, DATA_TYPE_UINT64_ARRAY); + nvlist_remove(nvl, ZPOOL_CONFIG_SCAN_STATS, DATA_TYPE_UINT64_ARRAY); +} + static int spa_config_write(spa_config_dirent_t *dp, nvlist_t *nvl) { @@ -233,6 +253,7 @@ spa_config_sync(spa_t *target, boolean_t removing, */ nvl = NULL; while ((spa = spa_next(spa)) != NULL) { + nvlist_t *nvroot = NULL; /* * Skip over our own pool if we're about to remove * ourselves from the spa namespace or any pool that @@ -241,7 +262,8 @@ spa_config_sync(spa_t *target, boolean_t removing, * we don't allow them to be written to the cache file. */ if ((spa == target && removing) || - !spa_writeable(spa)) + (spa_state(spa) == POOL_STATE_ACTIVE && + !spa_writeable(spa))) continue; mutex_enter(&spa->spa_props_lock); @@ -260,6 +282,9 @@ spa_config_sync(spa_t *target, boolean_t removing, VERIFY(nvlist_add_nvlist(nvl, spa->spa_name, spa->spa_config) == 0); mutex_exit(&spa->spa_props_lock); + + if (nvlist_lookup_nvlist(nvl, spa->spa_name, &nvroot) == 0) + spa_config_clean(nvroot); } error = spa_config_write(dp, nvl); --------------070607090801080704000405--