From owner-svn-src-all@FreeBSD.ORG Fri Jan 2 17:54:48 2015 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1B271E78; Fri, 2 Jan 2015 17:54:48 +0000 (UTC) Received: from anubis.delphij.net (anubis.delphij.net [64.62.153.212]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "anubis.delphij.net", Issuer "StartCom Class 1 Primary Intermediate Server CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id F22E966653; Fri, 2 Jan 2015 17:54:47 +0000 (UTC) Received: from zeta.ixsystems.com (unknown [12.229.62.2]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by anubis.delphij.net (Postfix) with ESMTPSA id B82121757E; Fri, 2 Jan 2015 09:54:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=delphij.net; s=anubis; t=1420221281; x=1420235681; bh=3P/pjBpXTn9onY+CAyW0r/wDUfzUuFIbJ/BIg0jBg60=; h=Date:From:Reply-To:To:Subject:References:In-Reply-To; b=jO817TEaEhTWP7kblN39vWhmi8rGqSLe7gKXh+zGvOA6WlbCo1/PpZatDicaIf562 S6eiGgmcYw8UcvXORso4/tUCkFfmttkR/34NSAsxnNCYkg+Dwe15rWfIaBBVy5QgiO B2lLKhvF8N3L40t83swaXxf9+iaq8c3/u5Yf5ZQw= Message-ID: <54A6DB61.9060607@delphij.net> Date: Fri, 02 Jan 2015 09:54:41 -0800 From: Xin Li Reply-To: d@delphij.net Organization: The FreeBSD Project MIME-Version: 1.0 To: Steven Hartland , d@delphij.net, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r276123 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs References: <201412230931.sBN9VPMK017968@svn.freebsd.org> <54A35B88.9090102@delphij.net> <54A39153.8040905@freebsd.org> <54A3ACEF.70905@delphij.net> <54A5AC21.5070802@multiplay.co.uk> In-Reply-To: <54A5AC21.5070802@multiplay.co.uk> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Jan 2015 17:54:48 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 01/01/15 12:20, Steven Hartland wrote: > > On 31/12/2014 07:59, Xin Li wrote: >> Hi, Steven, >> >> On 12/30/14 22:01, Steven Hartland wrote: >>> On 31/12/2014 02:12, Xin Li wrote: >>>> On 12/23/14 01:31, Steven Hartland wrote: >>>>> Author: smh Date: Tue Dec 23 09:31:24 2014 New Revision: >>>>> 276123 URL: >>>>> https://svnweb.freebsd.org/changeset/base/276123 >>>>> >>>>> Log: Always sync the global ZFS config cache to reflect the >>>>> new mosconfig This fixes out of date zpool.cache for root >>>>> pools, which can cause issues such as confusion of zdb >>>>> etc. MFC after: 1 month >>>>> >>>>> Modified: >>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>>> >>>>> >>>>> Modified: >>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>>> >>>>> ============================================================================== >>>>> >>>>> >>>>> --- >>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>>> >>>>> Tue Dec 23 08:51:30 2014 (r276122) >>>>> +++ >>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>>> >>>>> Tue Dec 23 09:31:24 2014 (r276123) >>>>> @@ -536,8 +536,7 @@ spa_config_update(spa_t *spa, int >>>>> what) /* * Update the global config cache to reflect the >>>>> new mosconfig. */ - if (!spa->spa_is_root) - >>>>> spa_config_sync(spa, B_FALSE, what != >>>>> SPA_CONFIG_UPDATE_POOL); + spa_config_sync(spa, B_FALSE, >>>>> what != SPA_CONFIG_UPDATE_POOL); if (what == >>>>> SPA_CONFIG_UPDATE_POOL) spa_config_update(spa, >>>>> SPA_CONFIG_UPDATE_VDEVS); >>>> It seems like that this change breaks systems where not all >>>> pools are available (e.g. some of pools are encrypted) at >>>> boot time, by removing all these pools from the cache file. >>>> >>>> As a result, on the next boot, these pools would not be >>>> imported even when the devices are available (geli attached), >>>> and reverting this change would restore the system to its >>>> previous behavior. >>>> >>>> Perhaps it have exposed an existing bug? >>> I've managed to reproduce this here with mdX backed test pools, >>> and looking into it this is due to spa_config_sync: >>> >>> /* * Skip over our own pool if we're about to remove * >>> ourselves from the spa namespace or any pool that * is >>> readonly. Since we cannot guarantee that a * readonly pool >>> would successfully import upon reboot, * we don't allow them to >>> be written to the cache file. */ if ((spa == target && >>> removing) || !spa_writeable(spa)) { continue; } >>> >>> This was added by upstream: >>> https://github.com/illumos/illumos-gate/commit/fb02ae025247e3b662600e5a9c1b4c33ecab7d72 >>> >>> >>> >>> https://www.illumos.org/issues/3639 >>> >>> It seems the desired behavior was to exclude read only pools, >>> to prevent them being mounted writable on reboot, but the check >>> is too wide and also excludes unavailable pools too. >>> >>> The attached patch corrects the test to only check writable on >>> active pools so I believe should fix the issue your seeing. >>> >>> If you can test it and lmk that would be great. >> This is an improvement (the attaching of pool is now working) but >> the cached configuration is somewhat corrupted, it shows many >> lines of vdev_stats[] contents. >> >> I'll take a look at the code tomorrow and see if I can do some >> instruments and possibly figure out the underlying issue. >> >> Cheers, > Looks like the spa_config_sync makes the assumption that the last > call to spa_config_generate was done with getstats = 0. > > This is now an invalid assumption so we need to ensure that stats > are removed from the config nvl before persisting to disk. > > The attached patch, which supersedes the last one I sent, fixes > that in my tests here. > > Let me know if it fixes your test case too? Yes, this fixes the problem, thanks for working on it. I'm not fully convinced with the approach though -- shouldn't we update the cachefile (again) when the pool become available? Cheers, - -- Xin LI https://www.delphij.net/ FreeBSD - The Power to Serve! Live free or die -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.1.1 (FreeBSD) iQIcBAEBCgAGBQJUptteAAoJEJW2GBstM+nsrKIQAIGex0+PbAXgMzXgKh4dB+gw ZX2VN6XBFYsG0Eu7m0ZNDv5cdYayLEM5gL2RXd7XTLssDXFi7TecWOmrZydXawES vhc8uvjFboZN8jxlKBw+AL3gPaRx4IY05pX36DBo4Rnj+b2+A7HZFegdPUPrVMwS LZRk+ABrE3dobVu3QYtjBjNc+U6sAgDRJyewGslLQNITrwNyWXRM5O+NjGOitY3a +xlhmyuod2zNddz14oIDcJ/3Y7NUNlvSkldmLJhBnaX9WvLSVvBlK8J5fsZpwDAI TJ6kzUd97C4A7t5twX88zIHWhZMFqxsHXACQkpY5Uq3JIp5VVwPHLbuh7THmKi2e IiGVynGT9Agt0XwgeCQRRvD048cHX0MMnSIIo7VB1eXihMTXvcVxsZnaGS16lO5U H8uhegeB9EbFx/lcg+dGpVmbpjqQ+jbrxu0wBx1Gfj0KHe4t/ajkdTnUQuNpUV+o r/8R2uL1H33dYjDtJDRPTOJEttNDcssZSt7EM1Rp1UdlYpWgXVP/wiKf5o4Ad9yZ /H2PTczcHSqOUEvXFfLUmy+T0u1htHCzi6FqI+wBh9PMBbDtZhetWTfVbjy4GIQr AzmGESjzrj97WQWum7ArmtuWYLnOfbsAtgj8XO1q94OfYoaTAkreOQndym/l+lRG mfScLpmdOJtY+ghicS6l =97mC -----END PGP SIGNATURE-----