Date:      Thu, 12 Jul 2018 14:42:48 -0700
From:      Jim Long <list@museum.rain.com>
To:        Mike Tancsa <mike@sentex.net>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Disk/ZFS activity crash on 11.2-STABLE
Message-ID:  <20180712214248.GA98578@g5.umpquanet.com>
In-Reply-To: <a069a076-df1c-80b2-1116-787e0a948ed9@sentex.net>
References:  <20180711212959.GA81029@g5.umpquanet.com> <5ebd8573-1363-06c7-cbb2-8298b0894319@sentex.net> <20180712183512.GA75020@g5.umpquanet.com> <a069a076-df1c-80b2-1116-787e0a948ed9@sentex.net>

On Thu, Jul 12, 2018 at 02:49:53PM -0400, Mike Tancsa wrote:

--snip--

> I would try setting a ceiling. On
> RELENG_11 you don't need to reboot.
> 
> Try
> sysctl -w vfs.zfs.arc_max=77946198016
> 
> which shaves off 20G from what ARC can gobble up. Not sure if that's your
> issue, but it is an issue for some users.
> 
> If you are still hurting for caching, add an SSD or NVMe drive and make it
> a caching device for your pool.
> 
> and what does
> zpool status
> show ?

I set the limit to the value you suggested, and the next test ran for less
than three minutes before the machine rebooted, with no crash dump produced.
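
Since no dump got written, I'm also double-checking that crash dumps are
actually enabled before the next attempt; roughly this, using the standard
rc.conf knobs (nothing exotic):

# /etc/rc.conf -- write kernel dumps to the swap device and have
# savecore(8) collect them into /var/crash at boot
dumpdev="AUTO"
dumpdir="/var/crash"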

I further reduced the limit to 50G and it's been running for about 50 minutes
so far.  Fingers crossed.  I do have an L2ARC device I can add if need be.
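
For reference, the 50G cap works out to 53687091200 bytes (50 * 1024^3),
which matches the vfs.zfs.arc_max value in the tunables dump below.  What I
ran, and what I'd run later to add the cache device (the GPT label here is
just a placeholder), is roughly:

# cap ARC at 50 GiB on the running system
sysctl vfs.zfs.arc_max=53687091200

# make the cap persistent across reboots
echo 'vfs.zfs.arc_max="53687091200"' >> /boot/loader.conf

# if/when I add the SSD as L2ARC (placeholder GPT label):
zpool add mendeleev cache gpt/l2arc0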

I'll keep you posted on how this run goes.

Thank you,

Jim



# zpool list; echo; zpool status
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
mendeleev  21.6T  5.02T  16.6T        -         -     0%    23%  1.00x  ONLINE  -

  pool: mendeleev
 state: ONLINE
  scan: scrub repaired 0 in 3h39m with 0 errors on Wed Jun  6 18:24:08 2018
config:

        NAME          STATE     READ WRITE CKSUM
        mendeleev     ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            gpt/zfs0  ONLINE       0     0     0
            gpt/zfs1  ONLINE       0     0     0
            gpt/zfs2  ONLINE       0     0     0
            gpt/zfs3  ONLINE       0     0     0
            gpt/zfs4  ONLINE       0     0     0
            gpt/zfs5  ONLINE       0     0     0

errors: No known data errors


Here's zfs-stats after about 50 min of activity.

# zfs-stats -a

------------------------------------------------------------------------
ZFS Subsystem Report                            Thu Jul 12 14:38:04 2018
------------------------------------------------------------------------

System Information:

        Kernel Version:                         1102501 (osreldate)
        Hardware Platform:                      amd64
        Processor Architecture:                 amd64

        ZFS Storage pool Version:               5000
        ZFS Filesystem Version:                 5

FreeBSD 11.2-STABLE #0 r335674: Tue Jun 26 13:20:24 PDT 2018 root
 2:38PM  up 55 mins, 2 users, load averages: 3.44, 3.88, 3.83

------------------------------------------------------------------------

System Memory:

        0.02%   14.62   MiB Active,     0.02%   21.34   MiB Inact
        62.00%  57.18   GiB Wired,      0.00%   0 Cache
        37.96%  35.01   GiB Free,       -0.00%  -204800 Bytes Gap

        Real Installed:                         96.00   GiB
        Real Available:                 98.57%  94.63   GiB
        Real Managed:                   97.45%  92.22   GiB

        Logical Total:                          96.00   GiB
        Logical Used:                   63.51%  60.97   GiB
        Logical Free:                   36.49%  35.03   GiB

Kernel Memory:                                  968.90  MiB
        Data:                           96.26%  932.67  MiB
        Text:                           3.74%   36.24   MiB

Kernel Memory Map:                              92.22   GiB
        Size:                           1.70%   1.57    GiB
        Free:                           98.30%  90.65   GiB

------------------------------------------------------------------------

ARC Summary: (HEALTHY)
        Memory Throttle Count:                  0

ARC Misc:
        Deleted:                                17.53m
        Recycle Misses:                         0
        Mutex Misses:                           46.41k
        Evict Skips:                            26.28k

ARC Size:                               100.02% 50.01   GiB
        Target Size: (Adaptive)         100.00% 50.00   GiB
        Min Size (Hard Limit):          22.80%  11.40   GiB
        Max Size (High Water):          4:1     50.00   GiB

ARC Size Breakdown:
        Recently Used Cache Size:       99.19%  49.60   GiB
        Frequently Used Cache Size:     0.81%   416.12  MiB

ARC Hash Breakdown:
        Elements Max:                           3.96m
        Elements Current:               95.60%  3.79m
        Collisions:                             3.22m
        Chain Max:                              5
        Chains:                                 367.37k

------------------------------------------------------------------------

ARC Efficiency:                                 63.22m
        Cache Hit Ratio:                43.75%  27.66m
        Cache Miss Ratio:               56.25%  35.56m
        Actual Hit Ratio:               39.39%  24.90m

        Data Demand Efficiency:         69.68%  26.96m
        Data Prefetch Efficiency:       0.00%   18.77m

        CACHE HITS BY CACHE LIST:
          Anonymously Used:             9.95%   2.75m
          Most Recently Used:           87.72%  24.26m
          Most Frequently Used:         2.32%   640.73k
          Most Recently Used Ghost:     0.00%   0
          Most Frequently Used Ghost:   0.01%   2.43k

        CACHE HITS BY DATA TYPE:
          Demand Data:                  67.92%  18.79m
          Prefetch Data:                0.00%   8
          Demand Metadata:              22.12%  6.12m
          Prefetch Metadata:            9.96%   2.76m

        CACHE MISSES BY DATA TYPE:
          Demand Data:                  22.99%  8.18m
          Prefetch Data:                52.77%  18.77m
          Demand Metadata:              17.07%  6.07m
          Prefetch Metadata:            7.17%   2.55m

------------------------------------------------------------------------

L2ARC is disabled

------------------------------------------------------------------------

File-Level Prefetch: (HEALTHY)

DMU Efficiency:                                 137.36k
        Hit Ratio:                      0.34%   470
        Miss Ratio:                     99.66%  136.89k

        Colinear:                               0
          Hit Ratio:                    100.00% 0
          Miss Ratio:                   100.00% 0

        Stride:                                 0
          Hit Ratio:                    100.00% 0
          Miss Ratio:                   100.00% 0

DMU Misc:
        Reclaim:                                0
          Successes:                    100.00% 0
          Failures:                     100.00% 0

        Streams:                                0
          +Resets:                      100.00% 0
          -Resets:                      100.00% 0
          Bogus:                                0

------------------------------------------------------------------------

VDEV cache is disabled

------------------------------------------------------------------------

ZFS Tunables (sysctl):
        kern.maxusers                           6392
        vm.kmem_size                            99019939840
        vm.kmem_size_scale                      1
        vm.kmem_size_min                        0
        vm.kmem_size_max                        1319413950874
        vfs.zfs.trim.max_interval               1
        vfs.zfs.trim.timeout                    30
        vfs.zfs.trim.txg_delay                  32
        vfs.zfs.trim.enabled                    1
        vfs.zfs.vol.immediate_write_sz          32768
        vfs.zfs.vol.unmap_sync_enabled          0
        vfs.zfs.vol.unmap_enabled               1
        vfs.zfs.vol.recursive                   0
        vfs.zfs.vol.mode                        1
        vfs.zfs.version.zpl                     5
        vfs.zfs.version.spa                     5000
        vfs.zfs.version.acl                     1
        vfs.zfs.version.ioctl                   7
        vfs.zfs.debug                           0
        vfs.zfs.super_owner                     0
        vfs.zfs.immediate_write_sz              32768
        vfs.zfs.sync_pass_rewrite               2
        vfs.zfs.sync_pass_dont_compress         5
        vfs.zfs.sync_pass_deferred_free         2
        vfs.zfs.zio.dva_throttle_enabled        1
        vfs.zfs.zio.exclude_metadata            0
        vfs.zfs.zio.use_uma                     1
        vfs.zfs.zil_slog_bulk                   786432
        vfs.zfs.cache_flush_disable             0
        vfs.zfs.zil_replay_disable              0
        vfs.zfs.standard_sm_blksz               131072
        vfs.zfs.dtl_sm_blksz                    4096
        vfs.zfs.min_auto_ashift                 12
        vfs.zfs.max_auto_ashift                 13
        vfs.zfs.vdev.trim_max_pending           10000
        vfs.zfs.vdev.bio_delete_disable         0
        vfs.zfs.vdev.bio_flush_disable          0
        vfs.zfs.vdev.queue_depth_pct            1000
        vfs.zfs.vdev.write_gap_limit            4096
        vfs.zfs.vdev.read_gap_limit             32768
        vfs.zfs.vdev.aggregation_limit          131072
        vfs.zfs.vdev.trim_max_active            64
        vfs.zfs.vdev.trim_min_active            1
        vfs.zfs.vdev.scrub_max_active           2
        vfs.zfs.vdev.scrub_min_active           1
        vfs.zfs.vdev.async_write_max_active     10
        vfs.zfs.vdev.async_write_min_active     1
        vfs.zfs.vdev.async_read_max_active      3
        vfs.zfs.vdev.async_read_min_active      1
        vfs.zfs.vdev.sync_write_max_active      10
        vfs.zfs.vdev.sync_write_min_active      10
        vfs.zfs.vdev.sync_read_max_active       10
        vfs.zfs.vdev.sync_read_min_active       10
        vfs.zfs.vdev.max_active                 1000
        vfs.zfs.vdev.async_write_active_max_dirty_percent 60
        vfs.zfs.vdev.async_write_active_min_dirty_percent 30
        vfs.zfs.vdev.mirror.non_rotating_seek_inc 1
        vfs.zfs.vdev.mirror.non_rotating_inc    0
        vfs.zfs.vdev.mirror.rotating_seek_offset 1048576
        vfs.zfs.vdev.mirror.rotating_seek_inc   5
        vfs.zfs.vdev.mirror.rotating_inc        0
        vfs.zfs.vdev.trim_on_init               1
        vfs.zfs.vdev.cache.bshift               16
        vfs.zfs.vdev.cache.size                 0
        vfs.zfs.vdev.cache.max                  16384
        vfs.zfs.vdev.default_ms_shift           29
        vfs.zfs.vdev.min_ms_count               16
        vfs.zfs.vdev.max_ms_count               200
        vfs.zfs.txg.timeout                     5
        vfs.zfs.spa_min_slop                    134217728
        vfs.zfs.spa_slop_shift                  5
        vfs.zfs.spa_asize_inflation             24
        vfs.zfs.deadman_enabled                 0
        vfs.zfs.deadman_checktime_ms            5000
        vfs.zfs.deadman_synctime_ms             1000000
        vfs.zfs.debug_flags                     0
        vfs.zfs.debugflags                      0
        vfs.zfs.recover                         0
        vfs.zfs.spa_load_verify_data            1
        vfs.zfs.spa_load_verify_metadata        1
        vfs.zfs.spa_load_verify_maxinflight     10000
        vfs.zfs.max_missing_tvds_scan           0
        vfs.zfs.max_missing_tvds_cachefile      2
        vfs.zfs.max_missing_tvds                0
        vfs.zfs.spa_load_print_vdev_tree        0
        vfs.zfs.ccw_retry_interval              300
        vfs.zfs.check_hostid                    1
        vfs.zfs.mg_fragmentation_threshold      85
        vfs.zfs.mg_noalloc_threshold            0
        vfs.zfs.condense_pct                    200
        vfs.zfs.metaslab_sm_blksz               4096
        vfs.zfs.metaslab.bias_enabled           1
        vfs.zfs.metaslab.lba_weighting_enabled  1
        vfs.zfs.metaslab.fragmentation_factor_enabled 1
        vfs.zfs.metaslab.preload_enabled        1
        vfs.zfs.metaslab.preload_limit          3
        vfs.zfs.metaslab.unload_delay           8
        vfs.zfs.metaslab.load_pct               50
        vfs.zfs.metaslab.min_alloc_size         33554432
        vfs.zfs.metaslab.df_free_pct            4
        vfs.zfs.metaslab.df_alloc_threshold     131072
        vfs.zfs.metaslab.debug_unload           0
        vfs.zfs.metaslab.debug_load             0
        vfs.zfs.metaslab.fragmentation_threshold 70
        vfs.zfs.metaslab.force_ganging          16777217
        vfs.zfs.free_bpobj_enabled              1
        vfs.zfs.free_max_blocks                 -1
        vfs.zfs.no_scrub_prefetch               0
        vfs.zfs.no_scrub_io                     0
        vfs.zfs.resilver_min_time_ms            3000
        vfs.zfs.free_min_time_ms                1000
        vfs.zfs.scan_min_time_ms                1000
        vfs.zfs.scan_idle                       50
        vfs.zfs.scrub_delay                     4
        vfs.zfs.resilver_delay                  2
        vfs.zfs.top_maxinflight                 32
        vfs.zfs.zfetch.array_rd_sz              1048576
        vfs.zfs.zfetch.max_idistance            67108864
        vfs.zfs.zfetch.max_distance             8388608
        vfs.zfs.zfetch.min_sec_reap             2
        vfs.zfs.zfetch.max_streams              8
        vfs.zfs.prefetch_disable                0
        vfs.zfs.delay_scale                     500000
        vfs.zfs.delay_min_dirty_percent         60
        vfs.zfs.dirty_data_sync                 67108864
        vfs.zfs.dirty_data_max_percent          10
        vfs.zfs.dirty_data_max_max              4294967296
        vfs.zfs.dirty_data_max                  4294967296
        vfs.zfs.max_recordsize                  1048576
        vfs.zfs.default_ibs                     17
        vfs.zfs.default_bs                      9
        vfs.zfs.send_holes_without_birth_time   1
        vfs.zfs.mdcomp_disable                  0
        vfs.zfs.per_txg_dirty_frees_percent     30
        vfs.zfs.nopwrite_enabled                1
        vfs.zfs.dedup.prefetch                  1
        vfs.zfs.dbuf_cache_lowater_pct          10
        vfs.zfs.dbuf_cache_hiwater_pct          10
        vfs.zfs.dbuf_cache_shift                5
        vfs.zfs.dbuf_cache_max_bytes            3060818688
        vfs.zfs.l2c_only_size                   0
        vfs.zfs.mfu_ghost_data_esize            44354878976
        vfs.zfs.mfu_ghost_metadata_esize        8191800320
        vfs.zfs.mfu_ghost_size                  52546679296
        vfs.zfs.mfu_data_esize                  0
        vfs.zfs.mfu_metadata_esize              84992
        vfs.zfs.mfu_size                        64779776
        vfs.zfs.mru_ghost_data_esize            48840192
        vfs.zfs.mru_ghost_metadata_esize        926682624
        vfs.zfs.mru_ghost_size                  975522816
        vfs.zfs.mru_data_esize                  47824171520
        vfs.zfs.mru_metadata_esize              5086352896
        vfs.zfs.mru_size                        52926374400
        vfs.zfs.anon_data_esize                 0
        vfs.zfs.anon_metadata_esize             0
        vfs.zfs.anon_size                       69500928
        vfs.zfs.l2arc_norw                      1
        vfs.zfs.l2arc_feed_again                1
        vfs.zfs.l2arc_noprefetch                1
        vfs.zfs.l2arc_feed_min_ms               200
        vfs.zfs.l2arc_feed_secs                 1
        vfs.zfs.l2arc_headroom                  2
        vfs.zfs.l2arc_write_boost               8388608
        vfs.zfs.l2arc_write_max                 8388608
        vfs.zfs.arc_meta_limit                  13421772800
        vfs.zfs.arc_free_target                 167651
        vfs.zfs.compressed_arc_enabled          1
        vfs.zfs.arc_grow_retry                  60
        vfs.zfs.arc_shrink_shift                7
        vfs.zfs.arc_average_blocksize           8192
        vfs.zfs.arc_no_grow_shift               5
        vfs.zfs.arc_min                         12243274752
        vfs.zfs.arc_max                         53687091200
        vfs.zfs.abd_chunk_size                  4096

------------------------------------------------------------------------


A few iterations of 'zpool iostat -v':

                 capacity     operations    bandwidth
pool          alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
mendeleev     5.02T  16.6T  3.58K      0   389M      0
  raidz2      5.02T  16.6T  3.58K      0   389M      0
    gpt/zfs0      -      -    783      0  45.9M      0
    gpt/zfs1      -      -  1.29K      0   100M      0
    gpt/zfs2      -      -    711      0  54.9M      0
    gpt/zfs3      -      -    729      0  54.3M      0
    gpt/zfs4      -      -  1.23K      0  99.7M      0
    gpt/zfs5      -      -    752      0  46.3M      0
------------  -----  -----  -----  -----  -----  -----

                 capacity     operations    bandwidth
pool          alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
mendeleev     5.02T  16.6T  8.71K      0   286M      0
  raidz2      5.02T  16.6T  8.71K      0   286M      0
    gpt/zfs0      -      -    715      0  35.6M      0
    gpt/zfs1      -      -  2.15K      0  82.7M      0
    gpt/zfs2      -      -    791      0  34.4M      0
    gpt/zfs3      -      -    706      0  33.6M      0
    gpt/zfs4      -      -  2.29K      0  82.3M      0
    gpt/zfs5      -      -    693      0  37.0M      0
------------  -----  -----  -----  -----  -----  -----

                 capacity     operations    bandwidth
pool          alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
mendeleev     5.02T  16.6T  6.28K      0   775M      0
  raidz2      5.02T  16.6T  6.28K      0   775M      0
    gpt/zfs0      -      -  1.37K      0   112M      0
    gpt/zfs1      -      -  1.57K      0   194M      0
    gpt/zfs2      -      -    802      0  87.0M      0
    gpt/zfs3      -      -    851      0  86.9M      0
    gpt/zfs4      -      -  2.03K      0   199M      0
    gpt/zfs5      -      -  1.09K      0   112M      0
------------  -----  -----  -----  -----  -----  -----

                 capacity     operations    bandwidth
pool          alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
mendeleev     5.02T  16.6T  9.15K      0   272M      0
  raidz2      5.02T  16.6T  9.15K      0   272M      0
    gpt/zfs0      -      -    507      0  23.1M      0
    gpt/zfs1      -      -  2.77K      0  80.7M      0
    gpt/zfs2      -      -    727      0  41.8M      0
    gpt/zfs3      -      -    707      0  41.2M      0
    gpt/zfs4      -      -  2.58K      0  76.1M      0
    gpt/zfs5      -      -    509      0  23.6M      0
------------  -----  -----  -----  -----  -----  -----
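
For the next run I may just watch the ARC size directly with something like
this simple loop (interval is arbitrary):

# poll current ARC size and the configured ceiling once a minute
while true; do
        date
        sysctl kstat.zfs.misc.arcstats.size vfs.zfs.arc_max
        sleep 60
done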




