Date:      Tue, 31 Jan 2017 23:22:22 +0000
From:      Steven Hartland <killing@multiplay.co.uk>
To:        Larry Rosenman <ler@FreeBSD.org>
Cc:        Freebsd fs <freebsd-fs@freebsd.org>
Subject:   Re: 16.0E ExpandSize? -- New Server
Message-ID:  <96534515-4fcb-774e-a599-8d48aec930cd@multiplay.co.uk>
In-Reply-To: <ffc24b7bfacd265d637b633566bbaa51@FreeBSD.org>
References:  <00db0ab7243ce6368c246ae20f9c075a@FreeBSD.org> <1a69057c-dc59-9b78-9762-4f98a071105e@multiplay.co.uk> <ce5a1d39612d694077accda33266a3ab@FreeBSD.org> <ad07e84e-f297-362a-1398-c5503bb56a8d@multiplay.co.uk> <35a9034f91542bb1329ac5104bf3b773@FreeBSD.org> <76fc9505-f681-0de0-fe0c-5624b29de321@multiplay.co.uk> <22e1bfc5840d972cf93643733682cda1@FreeBSD.org> <f2600a53-0dc1-9f41-1405-ed22d96d30cf@multiplay.co.uk> <8a710dc75c129f58b0372eeaeca575b5@FreeBSD.org> <aef02eb0-0888-6fea-a4b8-4033ca56f4a3@multiplay.co.uk> <d3181bd00c827fb99fbcebe6fe097ef8@FreeBSD.org> <a3d78923-5046-11c8-daea-713eacf47bd2@multiplay.co.uk> <ffc24b7bfacd265d637b633566bbaa51@FreeBSD.org>

Yep

On 31/01/2017 21:49, Larry Rosenman wrote:
>
> revert the other patch and apply this one?
>
>
>
> On 01/31/2017 3:47 pm, Steven Hartland wrote:
>
>> Hmm, looks like there's also a bug in the way vdev_min_asize is
>> calculated for raidz: it can result, and here has resulted, in a child
>> min_asize which won't provide enough space for the parent, due to the
>> use of unrounded integer division.
>>
>> 1981411579221 * 6 = 11888469475326 < 11888469475328
>>
>> You should have vdev_min_asize: 1981411579222 for your children.
>>
>> Updated patch attached. The calculation still isn't 100% reversible,
>> so it may need more work, but it does now ensure that the children
>> will provide enough capacity for min_asize even if all of them are
>> shrunk to their individual min_asize, which I believe previously may
>> not have been the case.
>>
>> This isn't related to the incorrect EXPANDSZ output, but it would be
>> good if you could confirm it doesn't cause any issues for your pool
>> given its state.
>>
>> On 31/01/2017 21:00, Larry Rosenman wrote:
>>>
>>> borg-new /home/ler $ sudo ./vdev-stats.d
>>> Password:
>>> vdev_path: n/a, vdev_max_asize: 0, vdev_asize: 0, vdev_min_asize: 0
>>> vdev_path: n/a, vdev_max_asize: 11947471798272, vdev_asize: 11947478089728, vdev_min_asize: 11888469475328
>>> vdev_path: /dev/mfid4p4, vdev_max_asize: 1991245299712, vdev_asize: 1991245299712, vdev_min_asize: 1981411579221
>>> vdev_path: /dev/mfid0p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
>>> vdev_path: /dev/mfid1p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
>>> vdev_path: /dev/mfid3p4, vdev_max_asize: 1991247921152, vdev_asize: 1991247921152, vdev_min_asize: 1981411579221
>>> vdev_path: /dev/mfid2p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
>>> vdev_path: /dev/mfid5p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
>>> ^C
>>>
>>> borg-new /home/ler $
>>>
>>>
>>> borg-new /home/ler $ sudo zpool list -v
>>> Password:
>>> NAME           SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
>>> zroot         10.8T  94.3G  10.7T     16.0E     0%     0%  1.00x  ONLINE  -
>>>   raidz1      10.8T  94.3G  10.7T     16.0E     0%     0%
>>>     mfid4p4       -      -      -         -      -      -
>>>     mfid0p4       -      -      -         -      -      -
>>>     mfid1p4       -      -      -         -      -      -
>>>     mfid3p4       -      -      -         -      -      -
>>>     mfid2p4       -      -      -         -      -      -
>>>     mfid5p4       -      -      -         -      -      -
>>> borg-new /home/ler $
>>>
>>>
>>> On 01/31/2017 2:37 pm, Steven Hartland wrote:
>>>
>>>     In that case, based on your zpool history, I suspect that the
>>>     original mfid4p4 was the same size as mfid0p4 (1991246348288)
>>>     but it's been replaced with a drive which is slightly smaller
>>>     (1991245299712).
>>>
>>>     This smaller size results in a max_asize of 1991245299712 * 6
>>>     instead of the original 1991246348288 * 6.
>>>
>>>     Now, given the way min_asize (the value used to check if the
>>>     device size is acceptable) is rounded to the nearest
>>>     metaslab, I believe that replace would be allowed.
>>>     https://github.com/freebsd/freebsd/blob/master/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c#L4947
>>>
>>>     Now the problem is that on open the calculated asize is only
>>>     updated if it's expanding:
>>>     https://github.com/freebsd/freebsd/blob/master/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c#L1424
>>>
>>>     The updated dtrace file outputs vdev_min_asize which should
>>>     confirm my suspicion about why the replace was allowed.
>>>
>>>     On 31/01/2017 19:05, Larry Rosenman wrote:
>>>
>>>         I've replaced some disks due to failure, and some of the
>>>         partition sizes are different.
>>>
>>>
>>>         autoexpand is off:
>>>
>>>         borg-new /home/ler $ zpool get all zroot
>>>         NAME PROPERTY VALUE SOURCE
>>>         zroot size 10.8T -
>>>         zroot capacity 0% -
>>>         zroot altroot - default
>>>         zroot health ONLINE -
>>>         zroot guid 11945658884309024932 default
>>>         zroot version - default
>>>         zroot bootfs zroot/ROOT/default local
>>>         zroot delegation on default
>>>         zroot autoreplace off default
>>>         zroot cachefile - default
>>>         zroot failmode wait default
>>>         zroot listsnapshots off default
>>>         zroot autoexpand off default
>>>         zroot dedupditto 0 default
>>>         zroot dedupratio 1.00x -
>>>         zroot free 10.7T -
>>>         zroot allocated 94.3G -
>>>         zroot readonly off -
>>>         zroot comment - default
>>>         zroot expandsize 16.0E -
>>>         zroot freeing 0 default
>>>         zroot fragmentation 0% -
>>>         zroot leaked 0 default
>>>         zroot feature@async_destroy enabled local
>>>         zroot feature@empty_bpobj active local
>>>         zroot feature@lz4_compress active local
>>>         zroot feature@multi_vdev_crash_dump enabled local
>>>         zroot feature@spacemap_histogram active local
>>>         zroot feature@enabled_txg active local
>>>         zroot feature@hole_birth active local
>>>         zroot feature@extensible_dataset enabled local
>>>         zroot feature@embedded_data active local
>>>         zroot feature@bookmarks enabled local
>>>         zroot feature@filesystem_limits enabled local
>>>         zroot feature@large_blocks enabled local
>>>         zroot feature@sha512 enabled local
>>>         zroot feature@skein enabled local
>>>         borg-new /home/ler $
>>>
>>>
>>>         borg-new /home/ler $ gpart show
>>>         => 40 3905945520 mfid0 GPT (1.8T)
>>>         40 1600 1 efi (800K)
>>>         1640 1024 2 freebsd-boot (512K)
>>>         2664 1432 - free - (716K)
>>>         4096 16777216 3 freebsd-swap (8.0G)
>>>         16781312 3889162240 4 freebsd-zfs (1.8T)
>>>         3905943552 2008 - free - (1.0M)
>>>
>>>         => 40 3905945520 mfid1 GPT (1.8T)
>>>         40 1600 1 efi (800K)
>>>         1640 1024 2 freebsd-boot (512K)
>>>         2664 1432 - free - (716K)
>>>         4096 16777216 3 freebsd-swap (8.0G)
>>>         16781312 3889162240 4 freebsd-zfs (1.8T)
>>>         3905943552 2008 - free - (1.0M)
>>>
>>>         => 40 3905945520 mfid2 GPT (1.8T)
>>>         40 1600 1 efi (800K)
>>>         1640 1024 2 freebsd-boot (512K)
>>>         2664 1432 - free - (716K)
>>>         4096 16777216 3 freebsd-swap (8.0G)
>>>         16781312 3889162240 4 freebsd-zfs (1.8T)
>>>         3905943552 2008 - free - (1.0M)
>>>
>>>         => 40 3905945520 mfid3 GPT (1.8T)
>>>         40 1600 1 efi (800K)
>>>         1640 1024 2 freebsd-boot (512K)
>>>         2664 16777216 3 freebsd-swap (8.0G)
>>>         16779880 3889165680 4 freebsd-zfs (1.8T)
>>>
>>>         => 40 3905945520 mfid5 GPT (1.8T)
>>>         40 1600 1 efi (800K)
>>>         1640 1024 2 freebsd-boot (512K)
>>>         2664 1432 - free - (716K)
>>>         4096 16777216 3 freebsd-swap (8.0G)
>>>         16781312 3889162240 4 freebsd-zfs (1.8T)
>>>         3905943552 2008 - free - (1.0M)
>>>
>>>         => 40 3905945520 mfid4 GPT (1.8T)
>>>         40 1600 1 efi (800K)
>>>         1640 1024 2 freebsd-boot (512K)
>>>         2664 1432 - free - (716K)
>>>         4096 16777216 3 freebsd-swap (8.0G)
>>>         16781312 3889160192 4 freebsd-zfs (1.8T)
>>>         3905941504 4056 - free - (2.0M)
>>>
>>>         borg-new /home/ler $
>>>
>>>
>>>         this system was built last week, and I **CAN** rebuild it if
>>>         necessary, but I didn't do anything strange (so I thought :) )
>>>
>>>
>>>
>>>
>>>         On 01/31/2017 12:30 pm, Steven Hartland wrote:
>>>
>>>             Your issue is that the reported vdev_max_asize is
>>>             smaller than vdev_asize:
>>>             vdev_max_asize: 11947471798272
>>>             vdev_asize:     11947478089728
>>>
>>>             max asize is smaller than asize by 6291456
>>>
>>>             For raidz1 Xsize should be the smallest disk Xsize *
>>>             disks so:
>>>             1991245299712 * 6 = 11947471798272
>>>
>>>             So your max asize looks right but asize looks too big
>>>
>>>             Expand Size is calculated by:
>>>             if (vd->vdev_aux == NULL && tvd != NULL &&
>>>                 vd->vdev_max_asize != 0) {
>>>                 vs->vs_esize = P2ALIGN(vd->vdev_max_asize - vd->vdev_asize,
>>>                     1ULL << tvd->vdev_ms_shift);
>>>             }
>>>
>>>             So the question is why is asize too big?
>>>
>>>             Given you seem to have some random disk sizes do you
>>>             have auto expand turned on?
>>>
>>>             On 31/01/2017 17:39, Larry Rosenman wrote:
>>>
>>>                 vdev_path: n/a, vdev_max_asize: 11947471798272,
>>>                 vdev_asize: 11947478089728
>>>
>>>
>>>         -- 
>>>         Larry Rosenman http://people.freebsd.org/~ler
>>>         <http://people.freebsd.org/%7Eler>;
>>>         Phone: +1 214-642-9640                 E-Mail:
>>>         ler@FreeBSD.org <mailto:ler@FreeBSD.org>
>>>         US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281
>>>
>>>
>
>



