Date: Sat, 4 Feb 2017 21:54:39 -0600
From: Larry Rosenman <ler@lerctr.org>
To: Steven Hartland <killing@multiplay.co.uk>
Cc: Larry Rosenman <ler@FreeBSD.org>, Freebsd fs <freebsd-fs@freebsd.org>
Subject: Re: 16.0E ExpandSize? -- New Server
Message-ID: <20170205035438.6gc2ybg6otidzpaz@borg.lerctr.org>
In-Reply-To: <8387d38f-3185-8c07-396b-602c708002a6@multiplay.co.uk>
References: <22e1bfc5840d972cf93643733682cda1@FreeBSD.org>
 <f2600a53-0dc1-9f41-1405-ed22d96d30cf@multiplay.co.uk>
 <8a710dc75c129f58b0372eeaeca575b5@FreeBSD.org>
 <aef02eb0-0888-6fea-a4b8-4033ca56f4a3@multiplay.co.uk>
 <d3181bd00c827fb99fbcebe6fe097ef8@FreeBSD.org>
 <a3d78923-5046-11c8-daea-713eacf47bd2@multiplay.co.uk>
 <ffc24b7bfacd265d637b633566bbaa51@FreeBSD.org>
 <96534515-4fcb-774e-a599-8d48aec930cd@multiplay.co.uk>
 <a98b3a3da1665c8eac6160633a0bc778@FreeBSD.org>
 <8387d38f-3185-8c07-396b-602c708002a6@multiplay.co.uk>
I saw it was accepted upstream. Can it be committed to FreeBSD?

On Wed, Feb 01, 2017 at 02:43:51AM +0000, Steven Hartland wrote:
> Thanks, I've put a PR in upstream to get some eyes on the fix.
> https://github.com/openzfs/openzfs/pull/296
>
> If no objections are raised to the approach I've used I'll commit the fix to
> HEAD too.
>
> On 01/02/2017 02:31, Larry Rosenman wrote:
> >
> > no grief that I can see:
> >
> > borg-new /home/ler $ sudo zdb
> > Password:
> > zroot:
> >     version: 5000
> >     name: 'zroot'
> >     state: 0
> >     txg: 96143
> >     pool_guid: 11945658884309024932
> >     hostid: 3619181042
> >     hostname: ''
> >     com.delphix:has_per_vdev_zaps
> >     vdev_children: 1
> >     vdev_tree:
> >         type: 'root'
> >         id: 0
> >         guid: 11945658884309024932
> >         create_txg: 4
> >         children[0]:
> >             type: 'raidz'
> >             id: 0
> >             guid: 7596925654112466913
> >             nparity: 1
> >             metaslab_array: 42
> >             metaslab_shift: 36
> >             ashift: 12
> >             asize: 11947471798272
> >             is_log: 0
> >             create_txg: 4
> >             com.delphix:vdev_zap_top: 35
> >             children[0]:
> >                 type: 'disk'
> >                 id: 0
> >                 guid: 1443238581175429852
> >                 path: '/dev/mfid4p4'
> >                 whole_disk: 1
> >                 DTL: 137
> >                 create_txg: 4
> >                 com.delphix:vdev_zap_leaf: 131
> >             children[1]:
> >                 type: 'disk'
> >                 id: 1
> >                 guid: 1865792721003775978
> >                 path: '/dev/mfid0p4'
> >                 whole_disk: 1
> >                 DTL: 133
> >                 create_txg: 4
> >                 com.delphix:vdev_zap_leaf: 37
> >             children[2]:
> >                 type: 'disk'
> >                 id: 2
> >                 guid: 12541720522827927342
> >                 path: '/dev/mfid1p4'
> >                 whole_disk: 1
> >                 DTL: 132
> >                 create_txg: 4
> >                 com.delphix:vdev_zap_leaf: 38
> >             children[3]:
> >                 type: 'disk'
> >                 id: 3
> >                 guid: 13053934791777776444
> >                 path: '/dev/mfid3p4'
> >                 whole_disk: 1
> >                 DTL: 136
> >                 create_txg: 4
> >                 com.delphix:vdev_zap_leaf: 135
> >             children[4]:
> >                 type: 'disk'
> >                 id: 4
> >                 guid: 4432707573898874857
> >                 path: '/dev/mfid2p4'
> >                 whole_disk: 1
> >                 DTL: 130
> >                 create_txg: 4
> >                 com.delphix:vdev_zap_leaf: 40
> >             children[5]:
> >                 type: 'disk'
> >                 id: 5
> >                 guid: 5106293125005422556
> >                 path: '/dev/mfid5p4'
> >                 whole_disk: 1
> >                 DTL: 129
> >                 create_txg: 4
> >                 com.delphix:vdev_zap_leaf: 41
> >     features_for_read:
> >         com.delphix:hole_birth
> >         com.delphix:embedded_data
> > borg-new /home/ler $ sudo zpool list -v
> > NAME         SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> > zroot       10.8T  94.3G  10.7T         -     0%     0%  1.00x  ONLINE  -
> >   raidz1    10.8T  94.3G  10.7T         -     0%     0%
> >     mfid4p4     -      -      -         -      -      -
> >     mfid0p4     -      -      -         -      -      -
> >     mfid1p4     -      -      -         -      -      -
> >     mfid3p4     -      -      -         -      -      -
> >     mfid2p4     -      -      -         -      -      -
> >     mfid5p4     -      -      -         -      -      -
> > borg-new /home/ler $ sudo zpool get all
> > NAME   PROPERTY                       VALUE                 SOURCE
> > zroot  size                           10.8T                 -
> > zroot  capacity                       0%                    -
> > zroot  altroot                        -                     default
> > zroot  health                         ONLINE                -
> > zroot  guid                           11945658884309024932  default
> > zroot  version                        -                     default
> > zroot  bootfs                         zroot/ROOT/default    local
> > zroot  delegation                     on                    default
> > zroot  autoreplace                    off                   default
> > zroot  cachefile                      -                     default
> > zroot  failmode                       wait                  default
> > zroot  listsnapshots                  off                   default
> > zroot  autoexpand                     off                   default
> > zroot  dedupditto                     0                     default
> > zroot  dedupratio                     1.00x                 -
> > zroot  free                           10.7T                 -
> > zroot  allocated                      94.3G                 -
> > zroot  readonly                       off                   -
> > zroot  comment                        -                     default
> > zroot  expandsize                     -                     -
> > zroot  freeing                        0                     default
> > zroot  fragmentation                  0%                    -
> > zroot  leaked                         0                     default
> > zroot  feature@async_destroy          enabled               local
> > zroot  feature@empty_bpobj            active                local
> > zroot  feature@lz4_compress           active                local
> > zroot  feature@multi_vdev_crash_dump  enabled               local
> > zroot  feature@spacemap_histogram     active                local
> > zroot  feature@enabled_txg            active                local
> > zroot  feature@hole_birth             active                local
> > zroot  feature@extensible_dataset     enabled               local
> > zroot  feature@embedded_data          active                local
> > zroot  feature@bookmarks              enabled               local
> > zroot  feature@filesystem_limits      enabled               local
> > zroot  feature@large_blocks           enabled               local
> > zroot  feature@sha512                 enabled               local
> > zroot  feature@skein                  enabled               local
> > borg-new /home/ler $
> >
> > On 01/31/2017 5:22 pm, Steven Hartland wrote:
> >
> > > Yep
> > >
> > > On 31/01/2017 21:49, Larry Rosenman wrote:
> > > >
> > > > revert the other patch and apply this one?
> > > >
> > > > On 01/31/2017 3:47 pm, Steven Hartland wrote:
> > > >
> > > > Hmm, looks like there's also a bug in the way vdev_min_asize is
> > > > calculated for raidz as it can and has resulted in child
> > > > min_asize which won't provide enough space for the parent due
> > > > to the use of unrounded integer division.
> > > >
> > > > 1981411579221 * 6 = 11888469475326 < 11888469475328
> > > >
> > > > You should have vdev_min_asize: 1981411579222 for your children.
> > > >
> > > > Updated patch attached, however calculation still isn't 100%
> > > > reversible so may need work, however it does now ensure that the
> > > > children will provide enough capacity for min_asize even if all
> > > > of them are shrunk to their individual min_asize, which I
> > > > believe previously may not have been the case.
> > > >
> > > > This isn't related to the incorrect EXPANDSZ output, but would
> > > > be good if you could confirm it doesn't cause any issues for
> > > > your pool given its state.
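The rounding problem described in the quoted message can be reproduced in a few lines. This is only a sketch of the arithmetic, not the patch attached to the thread: the function names child_min_asize_floor and child_min_asize_ceil are hypothetical, and the rounded-up variant is just one assumed way to guarantee that the children together cover the parent's min_asize.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Truncating division: what the thread calls "unrounded integer
 * division". The children can end up slightly too small in total.
 */
uint64_t
child_min_asize_floor(uint64_t parent_min_asize, uint64_t children)
{
    assert(children > 0);
    return (parent_min_asize / children);
}

/*
 * Hypothetical rounded-up variant (an assumption, not the real fix):
 * guarantees children * result >= parent_min_asize.
 */
uint64_t
child_min_asize_ceil(uint64_t parent_min_asize, uint64_t children)
{
    uint64_t q;

    assert(children > 0);
    q = parent_min_asize / children;
    if (parent_min_asize % children != 0)
        q++;    /* round up without risking overflow in the sum */
    return (q);
}
```

With the values from the thread (parent min_asize 11888469475328, six children), the truncating version gives 1981411579221 per child, which falls 2 bytes short in total, while the rounded-up version gives the 1981411579222 mentioned above.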
> > > >
> > > > On 31/01/2017 21:00, Larry Rosenman wrote:
> > > >
> > > > borg-new /home/ler $ sudo ./vdev-stats.d
> > > > Password:
> > > > vdev_path: n/a, vdev_max_asize: 0, vdev_asize: 0, vdev_min_asize: 0
> > > > vdev_path: n/a, vdev_max_asize: 11947471798272, vdev_asize: 11947478089728, vdev_min_asize: 11888469475328
> > > > vdev_path: /dev/mfid4p4, vdev_max_asize: 1991245299712, vdev_asize: 1991245299712, vdev_min_asize: 1981411579221
> > > > vdev_path: /dev/mfid0p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> > > > vdev_path: /dev/mfid1p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> > > > vdev_path: /dev/mfid3p4, vdev_max_asize: 1991247921152, vdev_asize: 1991247921152, vdev_min_asize: 1981411579221
> > > > vdev_path: /dev/mfid2p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> > > > vdev_path: /dev/mfid5p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> > > > ^C
> > > >
> > > > borg-new /home/ler $
> > > >
> > > >
> > > > borg-new /home/ler $ sudo zpool list -v
> > > > Password:
> > > > NAME         SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> > > > zroot       10.8T  94.3G  10.7T     16.0E     0%     0%  1.00x  ONLINE  -
> > > >   raidz1    10.8T  94.3G  10.7T     16.0E     0%     0%
> > > >     mfid4p4     -      -      -         -      -      -
> > > >     mfid0p4     -      -      -         -      -      -
> > > >     mfid1p4     -      -      -         -      -      -
> > > >     mfid3p4     -      -      -         -      -      -
> > > >     mfid2p4     -      -      -         -      -      -
> > > >     mfid5p4     -      -      -         -      -      -
> > > > borg-new /home/ler $
> > > >
> > > >
> > > > On 01/31/2017 2:37 pm, Steven Hartland wrote:
> > > >
> > > > In that case based on your zpool history I suspect that
> > > > the original mfid4p4 was the same size as mfid0p4
> > > > (1991246348288) but it's been replaced with a drive which
> > > > is (1991245299712), slightly smaller.
> > > >
> > > > This smaller size results in a max_asize of
> > > > 1991245299712 * 6 instead of the original 1991246348288 * 6.
> > > >
> > > > Now given the way min_asize (the value used to check if
> > > > the device size is acceptable) is rounded to the
> > > > nearest metaslab I believe that replace would be allowed.
> > > > https://github.com/freebsd/freebsd/blob/master/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c#L4947
> > > >
> > > > Now the problem is that on open the calculated asize is
> > > > only updated if it's expanding:
> > > > https://github.com/freebsd/freebsd/blob/master/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c#L1424
> > > >
> > > > The updated dtrace file outputs vdev_min_asize which
> > > > should confirm my suspicion about why the replace was
> > > > allowed.
> > > >
> > > > On 31/01/2017 19:05, Larry Rosenman wrote:
> > > >
> > > > I've replaced some disks due to failure, and some of
> > > > the partition sizes are different.
> > > >
> > > >
> > > > autoexpand is off:
> > > >
> > > > borg-new /home/ler $ zpool get all zroot
> > > > NAME   PROPERTY                       VALUE                 SOURCE
> > > > zroot  size                           10.8T                 -
> > > > zroot  capacity                       0%                    -
> > > > zroot  altroot                        -                     default
> > > > zroot  health                         ONLINE                -
> > > > zroot  guid                           11945658884309024932  default
> > > > zroot  version                        -                     default
> > > > zroot  bootfs                         zroot/ROOT/default    local
> > > > zroot  delegation                     on                    default
> > > > zroot  autoreplace                    off                   default
> > > > zroot  cachefile                      -                     default
> > > > zroot  failmode                       wait                  default
> > > > zroot  listsnapshots                  off                   default
> > > > zroot  autoexpand                     off                   default
> > > > zroot  dedupditto                     0                     default
> > > > zroot  dedupratio                     1.00x                 -
> > > > zroot  free                           10.7T                 -
> > > > zroot  allocated                      94.3G                 -
> > > > zroot  readonly                       off                   -
> > > > zroot  comment                        -                     default
> > > > zroot  expandsize                     16.0E                 -
> > > > zroot  freeing                        0                     default
> > > > zroot  fragmentation                  0%                    -
> > > > zroot  leaked                         0                     default
> > > > zroot  feature@async_destroy          enabled               local
> > > > zroot  feature@empty_bpobj            active                local
> > > > zroot  feature@lz4_compress           active                local
> > > > zroot  feature@multi_vdev_crash_dump  enabled               local
> > > > zroot  feature@spacemap_histogram     active                local
> > > > zroot  feature@enabled_txg            active                local
> > > > zroot  feature@hole_birth             active                local
> > > > zroot  feature@extensible_dataset     enabled               local
> > > > zroot  feature@embedded_data          active                local
> > > > zroot  feature@bookmarks              enabled               local
> > > > zroot  feature@filesystem_limits      enabled               local
> > > > zroot  feature@large_blocks           enabled               local
> > > > zroot  feature@sha512                 enabled               local
> > > > zroot  feature@skein                  enabled               local
> > > > borg-new /home/ler $
> > > >
> > > >
> > > > borg-new /home/ler $ gpart show
> > > > =>        40  3905945520  mfid0  GPT  (1.8T)
> > > >           40        1600      1  efi  (800K)
> > > >         1640        1024      2  freebsd-boot  (512K)
> > > >         2664        1432         - free -  (716K)
> > > >         4096    16777216      3  freebsd-swap  (8.0G)
> > > >     16781312  3889162240      4  freebsd-zfs  (1.8T)
> > > >   3905943552        2008         - free -  (1.0M)
> > > >
> > > > =>        40  3905945520  mfid1  GPT  (1.8T)
> > > >           40        1600      1  efi  (800K)
> > > >         1640        1024      2  freebsd-boot  (512K)
> > > >         2664        1432         - free -  (716K)
> > > >         4096    16777216      3  freebsd-swap  (8.0G)
> > > >     16781312  3889162240      4  freebsd-zfs  (1.8T)
> > > >   3905943552        2008         - free -  (1.0M)
> > > >
> > > > =>        40  3905945520  mfid2  GPT  (1.8T)
> > > >           40        1600      1  efi  (800K)
> > > >         1640        1024      2  freebsd-boot  (512K)
> > > >         2664        1432         - free -  (716K)
> > > >         4096    16777216      3  freebsd-swap  (8.0G)
> > > >     16781312  3889162240      4  freebsd-zfs  (1.8T)
> > > >   3905943552        2008         - free -  (1.0M)
> > > >
> > > > =>        40  3905945520  mfid3  GPT  (1.8T)
> > > >           40        1600      1  efi  (800K)
> > > >         1640        1024      2  freebsd-boot  (512K)
> > > >         2664    16777216      3  freebsd-swap  (8.0G)
> > > >     16779880  3889165680      4  freebsd-zfs  (1.8T)
> > > >
> > > > =>        40  3905945520  mfid5  GPT  (1.8T)
> > > >           40        1600      1  efi  (800K)
> > > >         1640        1024      2  freebsd-boot  (512K)
> > > >         2664        1432         - free -  (716K)
> > > >         4096    16777216      3  freebsd-swap  (8.0G)
> > > >     16781312  3889162240      4  freebsd-zfs  (1.8T)
> > > >   3905943552        2008         - free -  (1.0M)
> > > >
> > > > =>        40  3905945520  mfid4  GPT  (1.8T)
> > > >           40        1600      1  efi  (800K)
> > > >         1640        1024      2  freebsd-boot  (512K)
> > > >         2664        1432         - free -  (716K)
> > > >         4096    16777216      3  freebsd-swap  (8.0G)
> > > >     16781312  3889160192      4  freebsd-zfs  (1.8T)
> > > >   3905941504        4056         - free -  (2.0M)
> > > >
> > > > borg-new /home/ler $
> > > >
> > > >
> > > > this system was built last week, and I **CAN**
> > > > rebuild it if necessary, but I didn't do anything
> > > > strange (so I thought :) )
> > > >
> > > >
> > > > On 01/31/2017 12:30 pm, Steven Hartland wrote:
> > > >
> > > > Your issue is the reported vdev_max_asize < vdev_asize:
> > > > vdev_max_asize: 11947471798272
> > > > vdev_asize: 11947478089728
> > > >
> > > > max asize is smaller than asize by 6291456
> > > >
> > > > For raidz1 Xsize should be the smallest disk
> > > > Xsize * disks so:
> > > > 1991245299712 * 6 = 11947471798272
> > > >
> > > > So your max asize looks right but asize looks
> > > > too big
> > > >
> > > > Expand Size is calculated by:
> > > > if (vd->vdev_aux == NULL && tvd != NULL &&
> > > >     vd->vdev_max_asize != 0) {
> > > >         vs->vs_esize = P2ALIGN(vd->vdev_max_asize -
> > > >             vd->vdev_asize,
> > > >             1ULL << tvd->vdev_ms_shift);
> > > > }
> > > >
> > > > So the question is why is asize too big?
> > > >
> > > > Given you seem to have some random disk sizes do
> > > > you have auto expand turned on?
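The quoted calculation explains the 16.0E figure: the subtraction is done on unsigned 64-bit values, so a max_asize that is smaller than asize wraps around to just under 2^64. The sketch below reproduces that with the numbers from this thread. ALIGN_DOWN is a stand-in for P2ALIGN, the function names are hypothetical, and the guarded variant is only an assumed illustration of avoiding the wrap, not the fix that was submitted upstream.

```c
#include <assert.h>
#include <stdint.h>

/* Round x down to a multiple of align (a power of two), like P2ALIGN. */
#define ALIGN_DOWN(x, align) ((x) & ~((uint64_t)(align) - 1))

/* Mirrors the quoted calculation: wraps when max_asize < asize. */
uint64_t
expand_size_unguarded(uint64_t max_asize, uint64_t asize, unsigned ms_shift)
{
    assert(ms_shift < 64);
    return (ALIGN_DOWN(max_asize - asize, 1ULL << ms_shift));
}

/* Hypothetical guard (an assumption): report 0 when nothing can expand. */
uint64_t
expand_size_guarded(uint64_t max_asize, uint64_t asize, unsigned ms_shift)
{
    assert(ms_shift < 64);
    if (max_asize <= asize)
        return (0);
    return (ALIGN_DOWN(max_asize - asize, 1ULL << ms_shift));
}
```

Calling expand_size_unguarded(11947471798272, 11947478089728, 36), with 36 being the metaslab_shift shown by zdb, yields 2^64 - 2^36 bytes, which is just under 16 EiB and consistent with the 16.0E that zpool printed.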
> > > >
> > > > On 31/01/2017 17:39, Larry Rosenman wrote:
> > > >
> > > > vdev_path: n/a, vdev_max_asize: 11947471798272, vdev_asize: 11947478089728
> > > >
> > > >
> > > > --
> > > > Larry Rosenman http://people.freebsd.org/~ler
> > > > Phone: +1 214-642-9640 E-Mail: ler@FreeBSD.org
> > > > US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281
> > > >
> > > >
> > > > --
> > > > Larry Rosenman http://people.freebsd.org/~ler
> > > > Phone: +1 214-642-9640 E-Mail: ler@FreeBSD.org
> > > > US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281
> > > >
> > > >
> > > > --
> > > > Larry Rosenman http://people.freebsd.org/~ler
> > > > Phone: +1 214-642-9640 E-Mail: ler@FreeBSD.org
> > > > US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281
> >
> >
> > --
> > Larry Rosenman http://people.freebsd.org/~ler
> > Phone: +1 214-642-9640 E-Mail: ler@FreeBSD.org
> > US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281
>

-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: ler@lerctr.org
US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281