Date: Fri, 02 Jan 2015 09:54:41 -0800 From: Xin Li <delphij@delphij.net> To: Steven Hartland <steven@multiplay.co.uk>, d@delphij.net, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r276123 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs Message-ID: <54A6DB61.9060607@delphij.net> In-Reply-To: <54A5AC21.5070802@multiplay.co.uk> References: <201412230931.sBN9VPMK017968@svn.freebsd.org> <54A35B88.9090102@delphij.net> <54A39153.8040905@freebsd.org> <54A3ACEF.70905@delphij.net> <54A5AC21.5070802@multiplay.co.uk>
next in thread | previous in thread | raw e-mail | index | archive | help
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 01/01/15 12:20, Steven Hartland wrote: > > On 31/12/2014 07:59, Xin Li wrote: >> Hi, Steven, >> >> On 12/30/14 22:01, Steven Hartland wrote: >>> On 31/12/2014 02:12, Xin Li wrote: >>>> On 12/23/14 01:31, Steven Hartland wrote: >>>>> Author: smh Date: Tue Dec 23 09:31:24 2014 New Revision: >>>>> 276123 URL: >>>>> https://svnweb.freebsd.org/changeset/base/276123 >>>>> >>>>> Log: Always sync the global ZFS config cache to reflect the >>>>> new mosconfig This fixes out of date zpool.cache for root >>>>> pools, which can cause issues such as confusion of zdb >>>>> etc. MFC after: 1 month >>>>> >>>>> Modified: >>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>>> >>>>> >>>>> Modified: >>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>>> >>>>> ============================================================================== >>>>> >>>>> >>>>> --- >>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>>> >>>>> Tue Dec 23 08:51:30 2014 (r276122) >>>>> +++ >>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c >>>>> >>>>> Tue Dec 23 09:31:24 2014 (r276123) >>>>> @@ -536,8 +536,7 @@ spa_config_update(spa_t *spa, int >>>>> what) /* * Update the global config cache to reflect the >>>>> new mosconfig. */ - if (!spa->spa_is_root) - >>>>> spa_config_sync(spa, B_FALSE, what != >>>>> SPA_CONFIG_UPDATE_POOL); + spa_config_sync(spa, B_FALSE, >>>>> what != SPA_CONFIG_UPDATE_POOL); if (what == >>>>> SPA_CONFIG_UPDATE_POOL) spa_config_update(spa, >>>>> SPA_CONFIG_UPDATE_VDEVS); >>>> It seems like that this change breaks systems where not all >>>> pools are available (e.g. some of pools are encrypted) at >>>> boot time, by removing all these pools from the cache file. >>>> >>>> As a result, on the next boot, these pools would not be >>>> imported even when the devices are available (geli attached), >>>> and reverting this change would restore the system to its >>>> previous behavior. >>>> >>>> Perhaps it have exposed an existing bug? >>> I've managed to reproduce this here with mdX backed test pools, >>> and looking into it this is due to spa_config_sync: >>> >>> /* * Skip over our own pool if we're about to remove * >>> ourselves from the spa namespace or any pool that * is >>> readonly. Since we cannot guarantee that a * readonly pool >>> would successfully import upon reboot, * we don't allow them to >>> be written to the cache file. */ if ((spa == target && >>> removing) || !spa_writeable(spa)) { continue; } >>> >>> This was added by upstream: >>> https://github.com/illumos/illumos-gate/commit/fb02ae025247e3b662600e5a9c1b4c33ecab7d72 >>> >>> >>> >>> https://www.illumos.org/issues/3639 >>> >>> It seems the desired behavior was to exclude read only pools, >>> to prevent them being mounted writable on reboot, but the check >>> is too wide and also excludes unavailable pools too. >>> >>> The attached patch corrects the test to only check writable on >>> active pools so I believe should fix the issue your seeing. >>> >>> If you can test it and lmk that would be great. >> This is an improvement (the attaching of pool is now working) but >> the cached configuration is somewhat corrupted, it shows many >> lines of vdev_stats[] contents. >> >> I'll take a look at the code tomorrow and see if I can do some >> instruments and possibly figure out the underlying issue. >> >> Cheers, > Looks like the spa_config_sync makes the assumption that the last > call to spa_config_generate was done with getstats = 0. > > This is now an invalid assumption so we need to ensure that stats > are removed from the config nvl before persisting to disk. > > The attached patch, which supersedes the last one I sent, fixes > that in my tests here. > > Let me know if it fixes your test case too? Yes, this fixes the problem, thanks for working on it. I'm not fully convinced with the approach though -- shouldn't we update the cachefile (again) when the pool become available? Cheers, - -- Xin LI <delphij@delphij.net> https://www.delphij.net/ FreeBSD - The Power to Serve! Live free or die -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.1.1 (FreeBSD) iQIcBAEBCgAGBQJUptteAAoJEJW2GBstM+nsrKIQAIGex0+PbAXgMzXgKh4dB+gw ZX2VN6XBFYsG0Eu7m0ZNDv5cdYayLEM5gL2RXd7XTLssDXFi7TecWOmrZydXawES vhc8uvjFboZN8jxlKBw+AL3gPaRx4IY05pX36DBo4Rnj+b2+A7HZFegdPUPrVMwS LZRk+ABrE3dobVu3QYtjBjNc+U6sAgDRJyewGslLQNITrwNyWXRM5O+NjGOitY3a +xlhmyuod2zNddz14oIDcJ/3Y7NUNlvSkldmLJhBnaX9WvLSVvBlK8J5fsZpwDAI TJ6kzUd97C4A7t5twX88zIHWhZMFqxsHXACQkpY5Uq3JIp5VVwPHLbuh7THmKi2e IiGVynGT9Agt0XwgeCQRRvD048cHX0MMnSIIo7VB1eXihMTXvcVxsZnaGS16lO5U H8uhegeB9EbFx/lcg+dGpVmbpjqQ+jbrxu0wBx1Gfj0KHe4t/ajkdTnUQuNpUV+o r/8R2uL1H33dYjDtJDRPTOJEttNDcssZSt7EM1Rp1UdlYpWgXVP/wiKf5o4Ad9yZ /H2PTczcHSqOUEvXFfLUmy+T0u1htHCzi6FqI+wBh9PMBbDtZhetWTfVbjy4GIQr AzmGESjzrj97WQWum7ArmtuWYLnOfbsAtgj8XO1q94OfYoaTAkreOQndym/l+lRG mfScLpmdOJtY+ghicS6l =97mC -----END PGP SIGNATURE-----
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?54A6DB61.9060607>