From owner-svn-src-all@freebsd.org Fri May 22 16:51:01 2020
From: Andriy Gapon
Date: Fri, 22 May 2020 16:51:00 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
    svn-src-stable@freebsd.org, svn-src-stable-12@freebsd.org
Subject: svn commit: r361391 - in stable/12: cddl/contrib/opensolaris/cmd/zdb
    cddl/contrib/opensolaris/cmd/zfs cddl/contrib/opensolaris/cmd/zpool
    cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/...

Author: avg
Date: Fri May 22 16:51:00 2020
New Revision: 361391

URL: https://svnweb.freebsd.org/changeset/base/361391

Log:
  MFC r354941,r354948: 10601 10757 Pool allocation classes

  MFV r354382,r354385: 10601 10757 Pool allocation classes

  illumos/illumos-gate@663207adb1669640c01c5ec6949ce78fd806efae
  https://github.com/illumos/illumos-gate/commit/663207adb1669640c01c5ec6949ce78fd806efae

  10601 Pool allocation classes
  https://www.illumos.org/issues/10601
    illumos port of ZoL Pool allocation classes.
    Includes at least these two commits:
      441709695 Pool allocation classes misplacing small file blocks
      cc99f275a Pool allocation classes

  10757 Add -gLp to zpool subcommands for alt vdev names
  https://www.illumos.org/issues/10757
    Port from ZoL of
    d2f3e292d Add -gLp to zpool subcommands for alt vdev names
    Note that a subsequent ZoL commit changed -p to -P
    a77f29f93 Change full path subcommand flag from -p to -P

  Portions contributed by: Jerry Jelinek
  Portions contributed by: Håkan Johansson
  Portions contributed by: Richard Yao
  Portions contributed by: Chunwei Chen
  Portions contributed by: loli10K
  Author: Don Brady

  11541 allocation_classes feature must be enabled to add log device
  illumos/illumos-gate@c1064fd7ce62fe763a4475e9988ffea3b22137de
  https://github.com/illumos/illumos-gate/commit/c1064fd7ce62fe763a4475e9988ffea3b22137de
  https://www.illumos.org/issues/11541
    After the allocation_classes feature was integrated, one can no longer add
    a log device to a pool unless that feature is enabled.  There is an
    explicit check for this, but it is unnecessary in the case of log devices,
    so we should handle this better instead of forcing the feature to be
    enabled.
  Author: Jerry Jelinek

  FreeBSD notes.
  I faithfully added the new -g, -L, -P flags, but only -g does something:
  vdev GUIDs are displayed instead of device names.  -L, resolve symlinks,
  and -P, display full disk paths, do nothing at the moment.
  The use of special vdevs is backward compatible for read-only access, so
  root pools should be bootable, but exercise caution.

  MFV r354383: 10592 misc. metaslab and vdev related ZoL bug fixes

  illumos/illumos-gate@555d674d5d4b8191dc83723188349d28278b2431
  https://github.com/illumos/illumos-gate/commit/555d674d5d4b8191dc83723188349d28278b2431
  https://www.illumos.org/issues/10592
    This is a collection of recent fixes from ZoL:
      8eef997679b Error path in metaslab_load_impl() forgets to drop ms_sync_lock
      928e8ad47d3 Introduce auxiliary metaslab histograms
      425d3237ee8 Get rid of space_map_update() for ms_synced_length
      6c926f426a2 Simplify log vdev removal code
      21e7cf5da89 zdb -L should skip leak detection altogether
      df72b8bebe0 Rename range_tree_verify to range_tree_verify_not_present
      75058f33034 Remove unused vdev_t fields

Modified:
  stable/12/cddl/contrib/opensolaris/cmd/zdb/zdb.8
  stable/12/cddl/contrib/opensolaris/cmd/zdb/zdb.c
  stable/12/cddl/contrib/opensolaris/cmd/zfs/zfs.8
  stable/12/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7
  stable/12/cddl/contrib/opensolaris/cmd/zpool/zpool.8
  stable/12/cddl/contrib/opensolaris/cmd/zpool/zpool_main.c
  stable/12/cddl/contrib/opensolaris/cmd/zpool/zpool_vdev.c
  stable/12/cddl/contrib/opensolaris/cmd/ztest/ztest.c
  stable/12/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h
  stable/12/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c
  stable/12/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c
  stable/12/cddl/contrib/opensolaris/lib/libzpool/common/util.c
  stable/12/stand/libsa/zfs/zfsimpl.c
  stable/12/sys/cddl/contrib/opensolaris/common/zfs/zfeature_common.c
  stable/12/sys/cddl/contrib/opensolaris/common/zfs/zfeature_common.h
  stable/12/sys/cddl/contrib/opensolaris/common/zfs/zfs_prop.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_checkpoint.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab.h
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/range_tree.h
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/space_map.h
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev.h
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect_mapping.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_initialize.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c
  stable/12/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h

Directory Properties:
  stable/12/   (props changed)

Modified: stable/12/cddl/contrib/opensolaris/cmd/zdb/zdb.8
==============================================================================
--- stable/12/cddl/contrib/opensolaris/cmd/zdb/zdb.8   Fri May 22 16:29:09 2020        (r361390)
+++ stable/12/cddl/contrib/opensolaris/cmd/zdb/zdb.8   Fri May 22 16:51:00 2020        (r361391)
@@ -10,7 +10,7 @@
 .\"
 .\"
 .\" Copyright 2012, Richard Lowe.
-.\" Copyright (c) 2012, 2017 by Delphix. All rights reserved.
+.\" Copyright (c) 2012, 2018 by Delphix. All rights reserved.
 .\" Copyright 2017 Nexenta Systems, Inc.
 .\"
 .Dd October 06, 2017
@@ -187,7 +187,7 @@ If the
 .Fl u
 option is also specified, also display the uberblocks on this device.
 .It Fl L
-Disable leak tracing and the loading of space maps.
+Disable leak detection and the loading of space maps.
 By default,
 .Nm
 verifies that all non-free blocks are referenced, which can be very expensive.

Modified: stable/12/cddl/contrib/opensolaris/cmd/zdb/zdb.c
==============================================================================
--- stable/12/cddl/contrib/opensolaris/cmd/zdb/zdb.c   Fri May 22 16:29:09 2020        (r361390)
+++ stable/12/cddl/contrib/opensolaris/cmd/zdb/zdb.c   Fri May 22 16:51:00 2020        (r361391)
@@ -21,7 +21,7 @@
 
 /*
  * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
- * Copyright (c) 2011, 2017 by Delphix. All rights reserved.
+ * Copyright (c) 2011, 2018 by Delphix. All rights reserved.
  * Copyright (c) 2014 Integros [integros.com]
  * Copyright 2017 Nexenta Systems, Inc.
  * Copyright (c) 2017, 2018 Lawrence Livermore National Security, LLC.
@@ -785,18 +785,21 @@ dump_spacemap(objset_t *os, space_map_t *sm) return; (void) printf("space map object %llu:\n", - (longlong_t)sm->sm_phys->smp_object); - (void) printf(" smp_objsize = 0x%llx\n", - (longlong_t)sm->sm_phys->smp_objsize); + (longlong_t)sm->sm_object); + (void) printf(" smp_length = 0x%llx\n", + (longlong_t)sm->sm_phys->smp_length); (void) printf(" smp_alloc = 0x%llx\n", (longlong_t)sm->sm_phys->smp_alloc); + if (dump_opt['d'] < 6 && dump_opt['m'] < 4) + return; + /* * Print out the freelist entries in both encoded and decoded form. */ uint8_t mapshift = sm->sm_shift; int64_t alloc = 0; - uint64_t word; + uint64_t word, entry_id = 0; for (uint64_t offset = 0; offset < space_map_length(sm); offset += sizeof (word)) { @@ -804,11 +807,12 @@ dump_spacemap(objset_t *os, space_map_t *sm) sizeof (word), &word, DMU_READ_PREFETCH)); if (sm_entry_is_debug(word)) { - (void) printf("\t [%6llu] %s: txg %llu, pass %llu\n", - (u_longlong_t)(offset / sizeof (word)), + (void) printf("\t [%6llu] %s: txg %llu pass %llu\n", + (u_longlong_t)entry_id, ddata[SM_DEBUG_ACTION_DECODE(word)], (u_longlong_t)SM_DEBUG_TXG_DECODE(word), (u_longlong_t)SM_DEBUG_SYNCPASS_DECODE(word)); + entry_id++; continue; } @@ -846,7 +850,7 @@ dump_spacemap(objset_t *os, space_map_t *sm) (void) printf("\t [%6llu] %c range:" " %010llx-%010llx size: %06llx vdev: %06llu words: %u\n", - (u_longlong_t)(offset / sizeof (word)), + (u_longlong_t)entry_id, entry_type, (u_longlong_t)entry_off, (u_longlong_t)(entry_off + entry_run), (u_longlong_t)entry_run, @@ -856,8 +860,9 @@ dump_spacemap(objset_t *os, space_map_t *sm) alloc += entry_run; else alloc -= entry_run; + entry_id++; } - if ((uint64_t)alloc != space_map_allocated(sm)) { + if (alloc != space_map_allocated(sm)) { (void) printf("space_map_object alloc (%lld) INCONSISTENT " "with space map summary (%lld)\n", (longlong_t)space_map_allocated(sm), (longlong_t)alloc); @@ -921,23 +926,30 @@ dump_metaslab(metaslab_t *msp) SPACE_MAP_HISTOGRAM_SIZE, sm->sm_shift); } - if (dump_opt['d'] > 5 || dump_opt['m'] > 3) { - ASSERT(msp->ms_size == (1ULL << vd->vdev_ms_shift)); - - dump_spacemap(spa->spa_meta_objset, msp->ms_sm); - } + ASSERT(msp->ms_size == (1ULL << vd->vdev_ms_shift)); + dump_spacemap(spa->spa_meta_objset, msp->ms_sm); } static void print_vdev_metaslab_header(vdev_t *vd) { - (void) printf("\tvdev %10llu\n\t%-10s%5llu %-19s %-15s %-10s\n", - (u_longlong_t)vd->vdev_id, + vdev_alloc_bias_t alloc_bias = vd->vdev_alloc_bias; + const char *bias_str; + + bias_str = (alloc_bias == VDEV_BIAS_LOG || vd->vdev_islog) ? + VDEV_ALLOC_BIAS_LOG : + (alloc_bias == VDEV_BIAS_SPECIAL) ? VDEV_ALLOC_BIAS_SPECIAL : + (alloc_bias == VDEV_BIAS_DEDUP) ? VDEV_ALLOC_BIAS_DEDUP : + vd->vdev_islog ? 
"log" : ""; + + (void) printf("\tvdev %10llu %s\n" + "\t%-10s%5llu %-19s %-15s %-12s\n", + (u_longlong_t)vd->vdev_id, bias_str, "metaslabs", (u_longlong_t)vd->vdev_ms_count, "offset", "spacemap", "free"); - (void) printf("\t%15s %19s %15s %10s\n", + (void) printf("\t%15s %19s %15s %12s\n", "---------------", "-------------------", - "---------------", "-------------"); + "---------------", "------------"); } static void @@ -953,7 +965,7 @@ dump_metaslab_groups(spa_t *spa) vdev_t *tvd = rvd->vdev_child[c]; metaslab_group_t *mg = tvd->vdev_mg; - if (mg->mg_class != mc) + if (mg == NULL || mg->mg_class != mc) continue; metaslab_group_histogram_verify(mg); @@ -2807,6 +2819,7 @@ typedef struct zdb_blkstats { uint64_t zb_count; uint64_t zb_gangs; uint64_t zb_ditto_samevdev; + uint64_t zb_ditto_same_ms; uint64_t zb_psize_histogram[PSIZE_HISTO_SIZE]; } zdb_blkstats_t; @@ -2846,6 +2859,16 @@ typedef struct zdb_cb { uint32_t **zcb_vd_obsolete_counts; } zdb_cb_t; +/* test if two DVA offsets from same vdev are within the same metaslab */ +static boolean_t +same_metaslab(spa_t *spa, uint64_t vdev, uint64_t off1, uint64_t off2) +{ + vdev_t *vd = vdev_lookup_top(spa, vdev); + uint64_t ms_shift = vd->vdev_ms_shift; + + return ((off1 >> ms_shift) == (off2 >> ms_shift)); +} + static void zdb_count_block(zdb_cb_t *zcb, zilog_t *zilog, const blkptr_t *bp, dmu_object_type_t type) @@ -2857,6 +2880,8 @@ zdb_count_block(zdb_cb_t *zcb, zilog_t *zilog, const b if (zilog && zil_bp_tree_add(zilog, bp) != 0) return; + spa_config_enter(zcb->zcb_spa, SCL_CONFIG, FTAG, RW_READER); + for (int i = 0; i < 4; i++) { int l = (i < 2) ? BP_GET_LEVEL(bp) : ZB_TOTAL; int t = (i & 1) ? type : ZDB_OT_TOTAL; @@ -2882,8 +2907,15 @@ zdb_count_block(zdb_cb_t *zcb, zilog_t *zilog, const b switch (BP_GET_NDVAS(bp)) { case 2: if (DVA_GET_VDEV(&bp->blk_dva[0]) == - DVA_GET_VDEV(&bp->blk_dva[1])) + DVA_GET_VDEV(&bp->blk_dva[1])) { zb->zb_ditto_samevdev++; + + if (same_metaslab(zcb->zcb_spa, + DVA_GET_VDEV(&bp->blk_dva[0]), + DVA_GET_OFFSET(&bp->blk_dva[0]), + DVA_GET_OFFSET(&bp->blk_dva[1]))) + zb->zb_ditto_same_ms++; + } break; case 3: equal = (DVA_GET_VDEV(&bp->blk_dva[0]) == @@ -2892,13 +2924,37 @@ zdb_count_block(zdb_cb_t *zcb, zilog_t *zilog, const b DVA_GET_VDEV(&bp->blk_dva[2])) + (DVA_GET_VDEV(&bp->blk_dva[1]) == DVA_GET_VDEV(&bp->blk_dva[2])); - if (equal != 0) + if (equal != 0) { zb->zb_ditto_samevdev++; + + if (DVA_GET_VDEV(&bp->blk_dva[0]) == + DVA_GET_VDEV(&bp->blk_dva[1]) && + same_metaslab(zcb->zcb_spa, + DVA_GET_VDEV(&bp->blk_dva[0]), + DVA_GET_OFFSET(&bp->blk_dva[0]), + DVA_GET_OFFSET(&bp->blk_dva[1]))) + zb->zb_ditto_same_ms++; + else if (DVA_GET_VDEV(&bp->blk_dva[0]) == + DVA_GET_VDEV(&bp->blk_dva[2]) && + same_metaslab(zcb->zcb_spa, + DVA_GET_VDEV(&bp->blk_dva[0]), + DVA_GET_OFFSET(&bp->blk_dva[0]), + DVA_GET_OFFSET(&bp->blk_dva[2]))) + zb->zb_ditto_same_ms++; + else if (DVA_GET_VDEV(&bp->blk_dva[1]) == + DVA_GET_VDEV(&bp->blk_dva[2]) && + same_metaslab(zcb->zcb_spa, + DVA_GET_VDEV(&bp->blk_dva[1]), + DVA_GET_OFFSET(&bp->blk_dva[1]), + DVA_GET_OFFSET(&bp->blk_dva[2]))) + zb->zb_ditto_same_ms++; + } break; } - } + spa_config_exit(zcb->zcb_spa, SCL_CONFIG, FTAG); + if (BP_IS_EMBEDDED(bp)) { zcb->zcb_embedded_blocks[BPE_GET_ETYPE(bp)]++; zcb->zcb_embedded_histogram[BPE_GET_ETYPE(bp)] @@ -3086,6 +3142,8 @@ zdb_ddt_leak_init(spa_t *spa, zdb_cb_t *zcb) ddt_entry_t dde; int error; + ASSERT(!dump_opt['L']); + bzero(&ddb, sizeof (ddb)); while ((error = ddt_walk(spa, &ddb, &dde)) == 0) { blkptr_t blk; @@ -3109,12 +3167,10 @@ 
zdb_ddt_leak_init(spa_t *spa, zdb_cb_t *zcb) zcb->zcb_dedup_blocks++; } } - if (!dump_opt['L']) { - ddt_t *ddt = spa->spa_ddt[ddb.ddb_checksum]; - ddt_enter(ddt); - VERIFY(ddt_lookup(ddt, &blk, B_TRUE) != NULL); - ddt_exit(ddt); - } + ddt_t *ddt = spa->spa_ddt[ddb.ddb_checksum]; + ddt_enter(ddt); + VERIFY(ddt_lookup(ddt, &blk, B_TRUE) != NULL); + ddt_exit(ddt); } ASSERT(error == ENOENT); @@ -3156,6 +3212,9 @@ claim_segment_cb(void *arg, uint64_t offset, uint64_t static void zdb_claim_removing(spa_t *spa, zdb_cb_t *zcb) { + if (dump_opt['L']) + return; + if (spa->spa_vdev_removal == NULL) return; @@ -3247,7 +3306,6 @@ zdb_load_obsolete_counts(vdev_t *vd) space_map_t *prev_obsolete_sm = NULL; VERIFY0(space_map_open(&prev_obsolete_sm, spa->spa_meta_objset, scip->scip_prev_obsolete_sm_object, 0, vd->vdev_asize, 0)); - space_map_update(prev_obsolete_sm); vdev_indirect_mapping_load_obsolete_spacemap(vim, counts, prev_obsolete_sm); space_map_close(prev_obsolete_sm); @@ -3341,9 +3399,9 @@ zdb_leak_init_vdev_exclude_checkpoint(vdev_t *vd, zdb_ VERIFY0(space_map_open(&checkpoint_sm, spa_meta_objset(spa), checkpoint_sm_obj, 0, vd->vdev_asize, vd->vdev_ashift)); - space_map_update(checkpoint_sm); VERIFY0(space_map_iterate(checkpoint_sm, + space_map_length(checkpoint_sm), checkpoint_sm_exclude_entry_cb, &cseea)); space_map_close(checkpoint_sm); @@ -3353,6 +3411,8 @@ zdb_leak_init_vdev_exclude_checkpoint(vdev_t *vd, zdb_ static void zdb_leak_init_exclude_checkpoint(spa_t *spa, zdb_cb_t *zcb) { + ASSERT(!dump_opt['L']); + vdev_t *rvd = spa->spa_root_vdev; for (uint64_t c = 0; c < rvd->vdev_children; c++) { ASSERT3U(c, ==, rvd->vdev_child[c]->vdev_id); @@ -3449,6 +3509,8 @@ load_indirect_ms_allocatable_tree(vdev_t *vd, metaslab static void zdb_leak_init_prepare_indirect_vdevs(spa_t *spa, zdb_cb_t *zcb) { + ASSERT(!dump_opt['L']); + vdev_t *rvd = spa->spa_root_vdev; for (uint64_t c = 0; c < rvd->vdev_children; c++) { vdev_t *vd = rvd->vdev_child[c]; @@ -3495,67 +3557,63 @@ zdb_leak_init(spa_t *spa, zdb_cb_t *zcb) { zcb->zcb_spa = spa; - if (!dump_opt['L']) { - dsl_pool_t *dp = spa->spa_dsl_pool; - vdev_t *rvd = spa->spa_root_vdev; + if (dump_opt['L']) + return; - /* - * We are going to be changing the meaning of the metaslab's - * ms_allocatable. Ensure that the allocator doesn't try to - * use the tree. - */ - spa->spa_normal_class->mc_ops = &zdb_metaslab_ops; - spa->spa_log_class->mc_ops = &zdb_metaslab_ops; + dsl_pool_t *dp = spa->spa_dsl_pool; + vdev_t *rvd = spa->spa_root_vdev; - zcb->zcb_vd_obsolete_counts = - umem_zalloc(rvd->vdev_children * sizeof (uint32_t *), - UMEM_NOFAIL); + /* + * We are going to be changing the meaning of the metaslab's + * ms_allocatable. Ensure that the allocator doesn't try to + * use the tree. + */ + spa->spa_normal_class->mc_ops = &zdb_metaslab_ops; + spa->spa_log_class->mc_ops = &zdb_metaslab_ops; - /* - * For leak detection, we overload the ms_allocatable trees - * to contain allocated segments instead of free segments. - * As a result, we can't use the normal metaslab_load/unload - * interfaces. - */ - zdb_leak_init_prepare_indirect_vdevs(spa, zcb); - load_concrete_ms_allocatable_trees(spa, SM_ALLOC); + zcb->zcb_vd_obsolete_counts = + umem_zalloc(rvd->vdev_children * sizeof (uint32_t *), + UMEM_NOFAIL); - /* - * On load_concrete_ms_allocatable_trees() we loaded all the - * allocated entries from the ms_sm to the ms_allocatable for - * each metaslab. 
If the pool has a checkpoint or is in the - * middle of discarding a checkpoint, some of these blocks - * may have been freed but their ms_sm may not have been - * updated because they are referenced by the checkpoint. In - * order to avoid false-positives during leak-detection, we - * go through the vdev's checkpoint space map and exclude all - * its entries from their relevant ms_allocatable. - * - * We also aggregate the space held by the checkpoint and add - * it to zcb_checkpoint_size. - * - * Note that at this point we are also verifying that all the - * entries on the checkpoint_sm are marked as allocated in - * the ms_sm of their relevant metaslab. - * [see comment in checkpoint_sm_exclude_entry_cb()] - */ - zdb_leak_init_exclude_checkpoint(spa, zcb); + /* + * For leak detection, we overload the ms_allocatable trees + * to contain allocated segments instead of free segments. + * As a result, we can't use the normal metaslab_load/unload + * interfaces. + */ + zdb_leak_init_prepare_indirect_vdevs(spa, zcb); + load_concrete_ms_allocatable_trees(spa, SM_ALLOC); - /* for cleaner progress output */ - (void) fprintf(stderr, "\n"); + /* + * On load_concrete_ms_allocatable_trees() we loaded all the + * allocated entries from the ms_sm to the ms_allocatable for + * each metaslab. If the pool has a checkpoint or is in the + * middle of discarding a checkpoint, some of these blocks + * may have been freed but their ms_sm may not have been + * updated because they are referenced by the checkpoint. In + * order to avoid false-positives during leak-detection, we + * go through the vdev's checkpoint space map and exclude all + * its entries from their relevant ms_allocatable. + * + * We also aggregate the space held by the checkpoint and add + * it to zcb_checkpoint_size. + * + * Note that at this point we are also verifying that all the + * entries on the checkpoint_sm are marked as allocated in + * the ms_sm of their relevant metaslab. + * [see comment in checkpoint_sm_exclude_entry_cb()] + */ + zdb_leak_init_exclude_checkpoint(spa, zcb); + ASSERT3U(zcb->zcb_checkpoint_size, ==, spa_get_checkpoint_space(spa)); - if (bpobj_is_open(&dp->dp_obsolete_bpobj)) { - ASSERT(spa_feature_is_enabled(spa, - SPA_FEATURE_DEVICE_REMOVAL)); - (void) bpobj_iterate_nofree(&dp->dp_obsolete_bpobj, - increment_indirect_mapping_cb, zcb, NULL); - } - } else { - /* - * If leak tracing is disabled, we still need to consider - * any checkpointed space in our space verification. 
- */ - zcb->zcb_checkpoint_size += spa_get_checkpoint_space(spa); + /* for cleaner progress output */ + (void) fprintf(stderr, "\n"); + + if (bpobj_is_open(&dp->dp_obsolete_bpobj)) { + ASSERT(spa_feature_is_enabled(spa, + SPA_FEATURE_DEVICE_REMOVAL)); + (void) bpobj_iterate_nofree(&dp->dp_obsolete_bpobj, + increment_indirect_mapping_cb, zcb, NULL); } spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER); @@ -3636,52 +3694,58 @@ zdb_check_for_obsolete_leaks(vdev_t *vd, zdb_cb_t *zcb static boolean_t zdb_leak_fini(spa_t *spa, zdb_cb_t *zcb) { + if (dump_opt['L']) + return (B_FALSE); + boolean_t leaks = B_FALSE; - if (!dump_opt['L']) { - vdev_t *rvd = spa->spa_root_vdev; - for (unsigned c = 0; c < rvd->vdev_children; c++) { - vdev_t *vd = rvd->vdev_child[c]; - metaslab_group_t *mg = vd->vdev_mg; - if (zcb->zcb_vd_obsolete_counts[c] != NULL) { - leaks |= zdb_check_for_obsolete_leaks(vd, zcb); - } + vdev_t *rvd = spa->spa_root_vdev; + for (unsigned c = 0; c < rvd->vdev_children; c++) { + vdev_t *vd = rvd->vdev_child[c]; +#if DEBUG + metaslab_group_t *mg = vd->vdev_mg; +#endif - for (uint64_t m = 0; m < vd->vdev_ms_count; m++) { - metaslab_t *msp = vd->vdev_ms[m]; - ASSERT3P(mg, ==, msp->ms_group); + if (zcb->zcb_vd_obsolete_counts[c] != NULL) { + leaks |= zdb_check_for_obsolete_leaks(vd, zcb); + } - /* - * ms_allocatable has been overloaded - * to contain allocated segments. Now that - * we finished traversing all blocks, any - * block that remains in the ms_allocatable - * represents an allocated block that we - * did not claim during the traversal. - * Claimed blocks would have been removed - * from the ms_allocatable. For indirect - * vdevs, space remaining in the tree - * represents parts of the mapping that are - * not referenced, which is not a bug. - */ - if (vd->vdev_ops == &vdev_indirect_ops) { - range_tree_vacate(msp->ms_allocatable, - NULL, NULL); - } else { - range_tree_vacate(msp->ms_allocatable, - zdb_leak, vd); - } + for (uint64_t m = 0; m < vd->vdev_ms_count; m++) { + metaslab_t *msp = vd->vdev_ms[m]; + ASSERT3P(mg, ==, msp->ms_group); - if (msp->ms_loaded) { - msp->ms_loaded = B_FALSE; - } + /* + * ms_allocatable has been overloaded + * to contain allocated segments. Now that + * we finished traversing all blocks, any + * block that remains in the ms_allocatable + * represents an allocated block that we + * did not claim during the traversal. + * Claimed blocks would have been removed + * from the ms_allocatable. For indirect + * vdevs, space remaining in the tree + * represents parts of the mapping that are + * not referenced, which is not a bug. + */ + if (vd->vdev_ops == &vdev_indirect_ops) { + range_tree_vacate(msp->ms_allocatable, + NULL, NULL); + } else { + range_tree_vacate(msp->ms_allocatable, + zdb_leak, vd); } + + if (msp->ms_loaded) { + msp->ms_loaded = B_FALSE; + } } - umem_free(zcb->zcb_vd_obsolete_counts, - rvd->vdev_children * sizeof (uint32_t *)); - zcb->zcb_vd_obsolete_counts = NULL; } + + umem_free(zcb->zcb_vd_obsolete_counts, + rvd->vdev_children * sizeof (uint32_t *)); + zcb->zcb_vd_obsolete_counts = NULL; + return (leaks); } @@ -3709,6 +3773,7 @@ dump_block_stats(spa_t *spa) uint64_t norm_alloc, norm_space, total_alloc, total_found; int flags = TRAVERSE_PRE | TRAVERSE_PREFETCH_METADATA | TRAVERSE_HARD; boolean_t leaks = B_FALSE; + int err; bzero(&zcb, sizeof (zcb)); (void) printf("\nTraversing all blocks %s%s%s%s%s...\n\n", @@ -3719,13 +3784,18 @@ dump_block_stats(spa_t *spa) !dump_opt['L'] ? 
"nothing leaked " : ""); /* - * Load all space maps as SM_ALLOC maps, then traverse the pool - * claiming each block we discover. If the pool is perfectly - * consistent, the space maps will be empty when we're done. - * Anything left over is a leak; any block we can't claim (because - * it's not part of any space map) is a double allocation, - * reference to a freed block, or an unclaimed log block. + * When leak detection is enabled we load all space maps as SM_ALLOC + * maps, then traverse the pool claiming each block we discover. If + * the pool is perfectly consistent, the segment trees will be empty + * when we're done. Anything left over is a leak; any block we can't + * claim (because it's not part of any space map) is a double + * allocation, reference to a freed block, or an unclaimed log block. + * + * When leak detection is disabled (-L option) we still traverse the + * pool claiming each block we discover, but we skip opening any space + * maps. */ + bzero(&zcb, sizeof (zdb_cb_t)); zdb_leak_init(spa, &zcb); /* @@ -3751,8 +3821,10 @@ dump_block_stats(spa_t *spa) flags |= TRAVERSE_PREFETCH_DATA; zcb.zcb_totalasize = metaslab_class_get_alloc(spa_normal_class(spa)); + zcb.zcb_totalasize += metaslab_class_get_alloc(spa_special_class(spa)); + zcb.zcb_totalasize += metaslab_class_get_alloc(spa_dedup_class(spa)); zcb.zcb_start = zcb.zcb_lastprint = gethrtime(); - zcb.zcb_haderrors |= traverse_pool(spa, 0, flags, zdb_blkptr_cb, &zcb); + err = traverse_pool(spa, 0, flags, zdb_blkptr_cb, &zcb); /* * If we've traversed the data blocks then we need to wait for those @@ -3768,6 +3840,12 @@ dump_block_stats(spa_t *spa) } } + /* + * Done after zio_wait() since zcb_haderrors is modified in + * zdb_blkptr_done() + */ + zcb.zcb_haderrors |= err; + if (zcb.zcb_haderrors) { (void) printf("\nError counts:\n\n"); (void) printf("\t%5s %s\n", "errno", "count"); @@ -3789,15 +3867,17 @@ dump_block_stats(spa_t *spa) norm_alloc = metaslab_class_get_alloc(spa_normal_class(spa)); norm_space = metaslab_class_get_space(spa_normal_class(spa)); - total_alloc = norm_alloc + metaslab_class_get_alloc(spa_log_class(spa)); + total_alloc = norm_alloc + + metaslab_class_get_alloc(spa_log_class(spa)) + + metaslab_class_get_alloc(spa_special_class(spa)) + + metaslab_class_get_alloc(spa_dedup_class(spa)); total_found = tzb->zb_asize - zcb.zcb_dedup_asize + zcb.zcb_removing_size + zcb.zcb_checkpoint_size; - if (total_found == total_alloc) { - if (!dump_opt['L']) - (void) printf("\n\tNo leaks (block sum matches space" - " maps exactly)\n"); - } else { + if (total_found == total_alloc && !dump_opt['L']) { + (void) printf("\n\tNo leaks (block sum matches space" + " maps exactly)\n"); + } else if (!dump_opt['L']) { (void) printf("block traversal size %llu != alloc %llu " "(%s %lld)\n", (u_longlong_t)total_found, @@ -3811,31 +3891,50 @@ dump_block_stats(spa_t *spa) return (2); (void) printf("\n"); - (void) printf("\tbp count: %10llu\n", + (void) printf("\t%-16s %14llu\n", "bp count:", (u_longlong_t)tzb->zb_count); - (void) printf("\tganged count: %10llu\n", + (void) printf("\t%-16s %14llu\n", "ganged count:", (longlong_t)tzb->zb_gangs); - (void) printf("\tbp logical: %10llu avg: %6llu\n", + (void) printf("\t%-16s %14llu avg: %6llu\n", "bp logical:", (u_longlong_t)tzb->zb_lsize, (u_longlong_t)(tzb->zb_lsize / tzb->zb_count)); - (void) printf("\tbp physical: %10llu avg:" - " %6llu compression: %6.2f\n", - (u_longlong_t)tzb->zb_psize, + (void) printf("\t%-16s %14llu avg: %6llu compression: %6.2f\n", + "bp physical:", 
(u_longlong_t)tzb->zb_psize, (u_longlong_t)(tzb->zb_psize / tzb->zb_count), (double)tzb->zb_lsize / tzb->zb_psize); - (void) printf("\tbp allocated: %10llu avg:" - " %6llu compression: %6.2f\n", - (u_longlong_t)tzb->zb_asize, + (void) printf("\t%-16s %14llu avg: %6llu compression: %6.2f\n", + "bp allocated:", (u_longlong_t)tzb->zb_asize, (u_longlong_t)(tzb->zb_asize / tzb->zb_count), (double)tzb->zb_lsize / tzb->zb_asize); - (void) printf("\tbp deduped: %10llu ref>1:" - " %6llu deduplication: %6.2f\n", - (u_longlong_t)zcb.zcb_dedup_asize, + (void) printf("\t%-16s %14llu ref>1: %6llu deduplication: %6.2f\n", + "bp deduped:", (u_longlong_t)zcb.zcb_dedup_asize, (u_longlong_t)zcb.zcb_dedup_blocks, (double)zcb.zcb_dedup_asize / tzb->zb_asize + 1.0); - (void) printf("\tSPA allocated: %10llu used: %5.2f%%\n", + (void) printf("\t%-16s %14llu used: %5.2f%%\n", "Normal class:", (u_longlong_t)norm_alloc, 100.0 * norm_alloc / norm_space); + if (spa_special_class(spa)->mc_rotor != NULL) { + uint64_t alloc = metaslab_class_get_alloc( + spa_special_class(spa)); + uint64_t space = metaslab_class_get_space( + spa_special_class(spa)); + + (void) printf("\t%-16s %14llu used: %5.2f%%\n", + "Special class", (u_longlong_t)alloc, + 100.0 * alloc / space); + } + + if (spa_dedup_class(spa)->mc_rotor != NULL) { + uint64_t alloc = metaslab_class_get_alloc( + spa_dedup_class(spa)); + uint64_t space = metaslab_class_get_space( + spa_dedup_class(spa)); + + (void) printf("\t%-16s %14llu used: %5.2f%%\n", + "Dedup class", (u_longlong_t)alloc, + 100.0 * alloc / space); + } + for (bp_embedded_type_t i = 0; i < NUM_BP_EMBEDDED_TYPES; i++) { if (zcb.zcb_embedded_blocks[i] == 0) continue; @@ -3857,6 +3956,10 @@ dump_block_stats(spa_t *spa) (void) printf("\tDittoed blocks on same vdev: %llu\n", (longlong_t)tzb->zb_ditto_samevdev); } + if (tzb->zb_ditto_same_ms != 0) { + (void) printf("\tDittoed blocks in same metaslab: %llu\n", + (longlong_t)tzb->zb_ditto_same_ms); + } for (uint64_t v = 0; v < spa->spa_root_vdev->vdev_children; v++) { vdev_t *vd = spa->spa_root_vdev->vdev_child[v]; @@ -4114,7 +4217,6 @@ verify_device_removal_feature_counts(spa_t *spa) spa->spa_meta_objset, scip->scip_prev_obsolete_sm_object, 0, vd->vdev_asize, 0)); - space_map_update(prev_obsolete_sm); dump_spacemap(spa->spa_meta_objset, prev_obsolete_sm); (void) printf("\n"); space_map_close(prev_obsolete_sm); @@ -4320,7 +4422,8 @@ verify_checkpoint_sm_entry_cb(space_map_entry_t *sme, * their respective ms_allocateable trees should not contain them. 
*/ mutex_enter(&ms->ms_lock); - range_tree_verify(ms->ms_allocatable, sme->sme_offset, sme->sme_run); + range_tree_verify_not_present(ms->ms_allocatable, + sme->sme_offset, sme->sme_run); mutex_exit(&ms->ms_lock); return (0); @@ -4383,7 +4486,6 @@ verify_checkpoint_vdev_spacemaps(spa_t *checkpoint, sp VERIFY0(space_map_open(&checkpoint_sm, spa_meta_objset(current), checkpoint_sm_obj, 0, current_vd->vdev_asize, current_vd->vdev_ashift)); - space_map_update(checkpoint_sm); verify_checkpoint_sm_entry_cb_arg_t vcsec; vcsec.vcsec_vd = ckpoint_vd; @@ -4391,6 +4493,7 @@ verify_checkpoint_vdev_spacemaps(spa_t *checkpoint, sp vcsec.vcsec_num_entries = space_map_length(checkpoint_sm) / sizeof (uint64_t); VERIFY0(space_map_iterate(checkpoint_sm, + space_map_length(checkpoint_sm), verify_checkpoint_sm_entry_cb, &vcsec)); dump_spacemap(current->spa_meta_objset, checkpoint_sm); space_map_close(checkpoint_sm); @@ -4470,7 +4573,7 @@ verify_checkpoint_ms_spacemaps(spa_t *checkpoint, spa_ * are part of the checkpoint were freed by mistake. */ range_tree_walk(ckpoint_msp->ms_allocatable, - (range_tree_func_t *)range_tree_verify, + (range_tree_func_t *)range_tree_verify_not_present, current_msp->ms_allocatable); } } @@ -4482,6 +4585,8 @@ verify_checkpoint_ms_spacemaps(spa_t *checkpoint, spa_ static void verify_checkpoint_blocks(spa_t *spa) { + ASSERT(!dump_opt['L']); + spa_t *checkpoint_spa; char *checkpoint_pool; nvlist_t *config = NULL; @@ -4547,7 +4652,6 @@ dump_leftover_checkpoint_blocks(spa_t *spa) VERIFY0(space_map_open(&checkpoint_sm, spa_meta_objset(spa), checkpoint_sm_obj, 0, vd->vdev_asize, vd->vdev_ashift)); - space_map_update(checkpoint_sm); dump_spacemap(spa->spa_meta_objset, checkpoint_sm); space_map_close(checkpoint_sm); } Modified: stable/12/cddl/contrib/opensolaris/cmd/zfs/zfs.8 ============================================================================== --- stable/12/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Fri May 22 16:29:09 2020 (r361390) +++ stable/12/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Fri May 22 16:51:00 2020 (r361391) @@ -1134,8 +1134,23 @@ This feature must be enabled to be used .Po see .Xr zpool-features 7 .Pc . +.It Sy special_small_blocks Ns = Ns Ar size +This value represents the threshold block size for including small file +blocks into the special allocation class. +Blocks smaller than or equal to this value will be assigned to the special +allocation class while greater blocks will be assigned to the regular class. +Valid values are zero or a power of two from 512B up to 128K. +The default size is 0 which means no small file blocks will be allocated in +the special class. +.Pp +Before setting this property, a special class vdev must be added to the +pool. +See +.Xr zpool 8 +for more details on the special allocation class. .It Sy mountpoint Ns = Ns Ar path | Cm none | legacy -Controls the mount point used for this file system. See the +Controls the mount point used for this file system. +See the .Qq Sx Mount Points section for more information on how this property is used. .Pp @@ -3021,7 +3036,7 @@ property of the filesystem or volume which is received To use this flag, the storage pool must have the .Sy extensible_dataset feature enabled. See -.Xr zpool-features 5 +.Xr zpool-features 7 for details on ZFS feature flags. 
.El .It Xo Modified: stable/12/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7 ============================================================================== --- stable/12/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7 Fri May 22 16:29:09 2020 (r361390) +++ stable/12/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7 Fri May 22 16:51:00 2020 (r361391) @@ -632,6 +632,25 @@ and will return to being once all filesystems that have ever had their checksum set to .Sy skein are destroyed. +.It Sy allocation_classes +.Bl -column "READ\-ONLY COMPATIBLE" "com.intel:allocation_classes" +.It GUID Ta com.intel:allocation_classes +.It READ\-ONLY COMPATIBLE Ta yes +.It DEPENDENCIES Ta none +.El +.Pp +This feature enables support for separate allocation classes. +.Pp +This feature becomes +.Sy active +when a dedicated allocation class vdev +(dedup or special) is created with +.Dq zpool create +or +.Dq zpool add . +With device removal, it can be returned to the +.Sy enabled +state if all the top-level vdevs from an allocation class are removed. .El .Sh SEE ALSO .Xr zpool 8 Modified: stable/12/cddl/contrib/opensolaris/cmd/zpool/zpool.8 ============================================================================== --- stable/12/cddl/contrib/opensolaris/cmd/zpool/zpool.8 Fri May 22 16:29:09 2020 (r361390) +++ stable/12/cddl/contrib/opensolaris/cmd/zpool/zpool.8 Fri May 22 16:51:00 2020 (r361391) @@ -24,6 +24,8 @@ .\" Copyright (c) 2012, 2017 by Delphix. All Rights Reserved. .\" Copyright 2017 Nexenta Systems, Inc. .\" Copyright (c) 2017 Datto Inc. +.\" Copyright (c) 2017 George Melikov. All Rights Reserved. +.\" Copyright 2019 Joyent, Inc. .\" .\" $FreeBSD$ .\" @@ -38,7 +40,7 @@ .Op Fl \&? .Nm .Cm add -.Op Fl fn +.Op Fl fgLnP .Ar pool vdev ... .Nm .Cm attach @@ -127,17 +129,19 @@ .Op Ar device Ns ... .Nm .Cm iostat -.Op Fl T Cm d Ns | Ns Cm u .Op Fl v +.Op Fl T Cm d Ns | Ns Cm u +.Op Fl gLP .Op Ar pool .Ar ... +.Op Ar inverval Op Ar count .Nm .Cm labelclear .Op Fl f .Ar device .Nm .Cm list -.Op Fl Hpv +.Op Fl HgLpPv .Op Fl o Ar property Ns Op , Ns Ar ... .Op Fl T Cm d Ns | Ns Cm u .Op Ar pool @@ -179,7 +183,7 @@ .Ar property Ns = Ns Ar value pool .Nm .Cm split -.Op Fl n +.Op Fl gLnP .Op Fl R Ar altroot .Op Fl o Ar mntopts .Op Fl o Ar property Ns = Ns Ar value @@ -187,7 +191,7 @@ .Op Ar device ... .Nm .Cm status -.Op Fl Dvx +.Op Fl DgLPvx .Op Fl T Cm d Ns | Ns Cm u .Op Ar pool .Ar ... @@ -320,11 +324,27 @@ types are not supported for the intent log. For more i see the .Qq Sx Intent Log section. +.It Sy dedup +A device dedicated solely for allocating dedup data. +The redundancy of this device should match the redundancy of the other normal +devices in the pool. +If more than one dedup device is specified, then allocations are load-balanced +between devices. +.It Sy special +A device dedicated solely for allocating various kinds of internal metadata, +and optionally small file data. +The redundancy of this device should match the redundancy of the other normal +devices in the pool. +If more than one special device is specified, then allocations are +load-balanced between devices. +.Pp +For more information on special allocations, see the +.Sx Special Allocation Class +section. .It Sy cache -A device used to cache storage pool data. A cache device cannot be configured -as a mirror or -.No raidz -group. For more information, see the +A device used to cache storage pool data. +A cache device cannot be configured as a mirror or raidz group. +For more information, see the .Qq Sx Cache Devices section. 
.El @@ -602,6 +622,31 @@ zfs properties) may be unenforceable while a checkpoin checkpoint is allowed to consume the dataset's reservation. Finally, data that is part of the checkpoint but has been freed in the current state of the pool won't be scanned during a scrub. +.Ss Special Allocation Class +The allocations in the special class are dedicated to specific block types. +By default this includes all metadata, the indirect blocks of user data, and +any dedup data. +The class can also be provisioned to accept a limited percentage of small file +data blocks. +.Pp +A pool must always have at least one general (non-specified) vdev before +other devices can be assigned to the special class. +If the special class becomes full, then allocations intended for it will spill +back into the normal class. +.Pp +Dedup data can be excluded from the special class by setting the +.Sy vfs.zfs.ddt_data_is_special +sysctl to false (0). +.Pp +Inclusion of small file blocks in the special class is opt-in. +Each dataset can control the size of small file blocks allowed in the special +class by setting the +.Sy special_small_blocks +dataset property. +It defaults to zero so you must opt-in by setting it to a non-zero value. +See +.Xr zfs 1M +for more info on setting this property. .Ss Properties Each pool has several properties associated with it. Some properties are read-only statistics while others are configurable and change the behavior of @@ -872,7 +917,7 @@ Displays a help message. .It Xo .Nm .Cm add -.Op Fl fn +.Op Fl fgLnP .Ar pool vdev ... .Xc .Pp @@ -891,11 +936,30 @@ Forces use of .Ar vdev , even if they appear in use or specify a conflicting replication level. Not all devices can be overridden in this manner. +.It Fl g +Display +.Ar vdev , +GUIDs instead of the normal device names. +These GUIDs can be used in place of +device names for the zpool detach/offline/remove/replace commands. +.It Fl L +Display real paths for +.Ar vdev Ns s +resolving all symbolic links. +This can be used to look up the current block +device name regardless of the /dev/disk/ path used to open it. .It Fl n Displays the configuration that would be used without actually adding the .Ar vdev Ns s. -The actual pool creation can still fail due to insufficient privileges or device -sharing. +The actual pool creation can still fail due to insufficient privileges or +device sharing. +.It Fl P +Display real paths for +.Ar vdev Ns s +instead of only the last component of the path. +This can be used in conjunction with the +.Fl L +flag. .El .It Xo .Nm @@ -1512,7 +1576,7 @@ with no flags on the relevant target devices. .Nm .Cm iostat .Op Fl T Cm d Ns | Ns Cm u -.Op Fl v +.Op Fl gLPv .Op Ar pool .Ar ... .Op Ar interval Op Ar count @@ -1544,10 +1608,25 @@ Use modifier .Cm u for unixtime .Pq equals Qq Ic date +%s . +.It Fl g +Display vdev GUIDs instead of the normal device names. +These GUIDs can be used in place of device names for the zpool +detach/offline/remove/replace commands. +.It Fl L +Display real paths for vdevs resolving all symbolic links. +This can be used to look up the current block device name regardless of the +.Pa /dev/disk/ +path used to open it. +.It Fl P +Display full paths for vdevs instead of only the last component of +the path. +This can be used in conjunction with the +.Fl L +flag. .It Fl v -Verbose statistics. Reports usage statistics for individual -.No vdev Ns s -within the pool, in addition to the pool-wide statistics. +Verbose statistics. *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
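For readers who want to exercise the merged feature after updating, the commands
below sketch the workflow documented in the zpool.8 and zfs.8 changes above.
This is illustrative only and not part of the commit; the pool and dataset
names, the device names (da1 through da4), and the 32K threshold are placeholder
assumptions.

  # A pool needs at least one general vdev before a special class vdev can be
  # assigned; the special mirror then holds metadata (and, once
  # special_small_blocks is set, small file blocks).
  zpool create tank mirror da1 da2 special mirror da3 da4

  # Opt a dataset in to placing small blocks (<= 32K here) in the special
  # class; valid values are 0 or a power of two from 512B up to 128K.
  zfs create tank/data
  zfs set special_small_blocks=32K tank/data

  # The new -g flag shows vdev GUIDs instead of device names; -L and -P are
  # accepted but currently do nothing on FreeBSD (see the notes above).
  zpool list -v tank
  zpool iostat -g tank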