Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 02 Jan 2015 09:54:41 -0800
From:      Xin Li <delphij@delphij.net>
To:        Steven Hartland <steven@multiplay.co.uk>, d@delphij.net,  src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r276123 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Message-ID:  <54A6DB61.9060607@delphij.net>
In-Reply-To: <54A5AC21.5070802@multiplay.co.uk>
References:  <201412230931.sBN9VPMK017968@svn.freebsd.org> <54A35B88.9090102@delphij.net> <54A39153.8040905@freebsd.org> <54A3ACEF.70905@delphij.net> <54A5AC21.5070802@multiplay.co.uk>

next in thread | previous in thread | raw e-mail | index | archive | help
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 01/01/15 12:20, Steven Hartland wrote:
> 
> On 31/12/2014 07:59, Xin Li wrote:
>> Hi, Steven,
>> 
>> On 12/30/14 22:01, Steven Hartland wrote:
>>> On 31/12/2014 02:12, Xin Li wrote:
>>>> On 12/23/14 01:31, Steven Hartland wrote:
>>>>> Author: smh Date: Tue Dec 23 09:31:24 2014 New Revision:
>>>>> 276123 URL:
>>>>> https://svnweb.freebsd.org/changeset/base/276123
>>>>> 
>>>>> Log: Always sync the global ZFS config cache to reflect the
>>>>> new mosconfig This fixes out of date zpool.cache for root
>>>>> pools, which can cause issues such as confusion of zdb
>>>>> etc. MFC after:    1 month
>>>>> 
>>>>> Modified: 
>>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c
>>>>>
>>>>>
>>>>> 
Modified:
>>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c
>>>>>
>>>>> 
==============================================================================
>>>>> 
>>>>> 
>>>>> --- 
>>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c
>>>>>
>>>>> 
Tue Dec 23 08:51:30 2014    (r276122)
>>>>> +++ 
>>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c
>>>>>
>>>>> 
Tue Dec 23 09:31:24 2014    (r276123)
>>>>> @@ -536,8 +536,7 @@ spa_config_update(spa_t *spa, int
>>>>> what) /* * Update the global config cache to reflect the
>>>>> new mosconfig. */ -    if (!spa->spa_is_root) -
>>>>> spa_config_sync(spa, B_FALSE, what != 
>>>>> SPA_CONFIG_UPDATE_POOL); +    spa_config_sync(spa, B_FALSE,
>>>>> what != SPA_CONFIG_UPDATE_POOL); if (what ==
>>>>> SPA_CONFIG_UPDATE_POOL) spa_config_update(spa,
>>>>> SPA_CONFIG_UPDATE_VDEVS);
>>>> It seems like that this change breaks systems where not all
>>>> pools are available (e.g. some of pools are encrypted) at
>>>> boot time, by removing all these pools from the cache file.
>>>> 
>>>> As a result, on the next boot, these pools would not be
>>>> imported even when the devices are available (geli attached),
>>>> and reverting this change would restore the system to its
>>>> previous behavior.
>>>> 
>>>> Perhaps it have exposed an existing bug?
>>> I've managed to reproduce this here with mdX backed test pools,
>>> and looking into it this is due to spa_config_sync:
>>> 
>>> /* * Skip over our own pool if we're about to remove *
>>> ourselves from the spa namespace or any pool that * is
>>> readonly. Since we cannot guarantee that a * readonly pool
>>> would successfully import upon reboot, * we don't allow them to
>>> be written to the cache file. */ if ((spa == target &&
>>> removing) || !spa_writeable(spa)) { continue; }
>>> 
>>> This was added by upstream: 
>>> https://github.com/illumos/illumos-gate/commit/fb02ae025247e3b662600e5a9c1b4c33ecab7d72
>>>
>>>
>>>
>>> 
https://www.illumos.org/issues/3639
>>> 
>>> It seems the desired behavior was to exclude read only pools,
>>> to prevent them being mounted writable on reboot, but the check
>>> is too wide and also excludes unavailable pools too.
>>> 
>>> The attached patch corrects the test to only check writable on
>>> active pools so I believe should fix the issue your seeing.
>>> 
>>> If you can test it and lmk that would be great.
>> This is an improvement (the attaching of pool is now working) but
>> the cached configuration is somewhat corrupted, it shows many
>> lines of vdev_stats[] contents.
>> 
>> I'll take a look at the code tomorrow and see if I can do some 
>> instruments and possibly figure out the underlying issue.
>> 
>> Cheers,
> Looks like the spa_config_sync makes the assumption that the last
> call to spa_config_generate was done with getstats = 0.
> 
> This is now an invalid assumption so we need to ensure that stats
> are removed from the config nvl before persisting to disk.
> 
> The attached patch, which supersedes the last one I sent, fixes
> that in my tests here.
> 
> Let me know if it fixes your test case too?

Yes, this fixes the problem, thanks for working on it.

I'm not fully convinced with the approach though -- shouldn't we
update the cachefile (again) when the pool become available?

Cheers,
- -- 
Xin LI <delphij@delphij.net>    https://www.delphij.net/
FreeBSD - The Power to Serve!           Live free or die
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.1.1 (FreeBSD)

iQIcBAEBCgAGBQJUptteAAoJEJW2GBstM+nsrKIQAIGex0+PbAXgMzXgKh4dB+gw
ZX2VN6XBFYsG0Eu7m0ZNDv5cdYayLEM5gL2RXd7XTLssDXFi7TecWOmrZydXawES
vhc8uvjFboZN8jxlKBw+AL3gPaRx4IY05pX36DBo4Rnj+b2+A7HZFegdPUPrVMwS
LZRk+ABrE3dobVu3QYtjBjNc+U6sAgDRJyewGslLQNITrwNyWXRM5O+NjGOitY3a
+xlhmyuod2zNddz14oIDcJ/3Y7NUNlvSkldmLJhBnaX9WvLSVvBlK8J5fsZpwDAI
TJ6kzUd97C4A7t5twX88zIHWhZMFqxsHXACQkpY5Uq3JIp5VVwPHLbuh7THmKi2e
IiGVynGT9Agt0XwgeCQRRvD048cHX0MMnSIIo7VB1eXihMTXvcVxsZnaGS16lO5U
H8uhegeB9EbFx/lcg+dGpVmbpjqQ+jbrxu0wBx1Gfj0KHe4t/ajkdTnUQuNpUV+o
r/8R2uL1H33dYjDtJDRPTOJEttNDcssZSt7EM1Rp1UdlYpWgXVP/wiKf5o4Ad9yZ
/H2PTczcHSqOUEvXFfLUmy+T0u1htHCzi6FqI+wBh9PMBbDtZhetWTfVbjy4GIQr
AzmGESjzrj97WQWum7ArmtuWYLnOfbsAtgj8XO1q94OfYoaTAkreOQndym/l+lRG
mfScLpmdOJtY+ghicS6l
=97mC
-----END PGP SIGNATURE-----



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?54A6DB61.9060607>