Date: Thu, 12 Jul 2018 14:42:48 -0700
From: Jim Long <list@museum.rain.com>
To: Mike Tancsa <mike@sentex.net>
Cc: freebsd-questions@freebsd.org
Subject: Re: Disk/ZFS activity crash on 11.2-STABLE
Message-ID: <20180712214248.GA98578@g5.umpquanet.com>
In-Reply-To: <a069a076-df1c-80b2-1116-787e0a948ed9@sentex.net>
References: <20180711212959.GA81029@g5.umpquanet.com> <5ebd8573-1363-06c7-cbb2-8298b0894319@sentex.net> <20180712183512.GA75020@g5.umpquanet.com> <a069a076-df1c-80b2-1116-787e0a948ed9@sentex.net>
On Thu, Jul 12, 2018 at 02:49:53PM -0400, Mike Tancsa wrote:
--snip--
> I would try and set a ceiling. On RELENG_11 you don't need to reboot.
>
> Try
> sysctl -w vfs.zfs.arc_max=77946198016
>
> which shaves off 20G from what ARC can gobble up. Not sure if that's
> your issue, but it is an issue for some users.
>
> If you are still hurting for caching, add an SSD or NVMe drive and make
> it a caching device for your pool.
>
> and what does
> zpool status
> show?

I set the limit to the value you suggested, and the next test ran less
than three minutes before the machine rebooted, with no crash dump
produced.  I further reduced the limit to 50G and it has been running for
about 50 minutes so far.  Fingers crossed.

I do have an L2ARC device I can add if need be; a sketch of attaching it,
and of persisting the ARC cap, follows the iostat output at the end of
this message.  I'll keep you posted on how this run goes.

Thank you,

Jim

# zpool list; echo; zpool status
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
mendeleev  21.6T  5.02T  16.6T        -         -     0%    23%  1.00x  ONLINE  -

  pool: mendeleev
 state: ONLINE
  scan: scrub repaired 0 in 3h39m with 0 errors on Wed Jun  6 18:24:08 2018
config:

        NAME          STATE     READ WRITE CKSUM
        mendeleev     ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            gpt/zfs0  ONLINE       0     0     0
            gpt/zfs1  ONLINE       0     0     0
            gpt/zfs2  ONLINE       0     0     0
            gpt/zfs3  ONLINE       0     0     0
            gpt/zfs4  ONLINE       0     0     0
            gpt/zfs5  ONLINE       0     0     0

errors: No known data errors

Here's zfs-stats output after about 50 minutes of activity:

# zfs-stats -a

------------------------------------------------------------------------
ZFS Subsystem Report                            Thu Jul 12 14:38:04 2018
------------------------------------------------------------------------

System Information:

        Kernel Version:                         1102501 (osreldate)
        Hardware Platform:                      amd64
        Processor Architecture:                 amd64

        ZFS Storage pool Version:               5000
        ZFS Filesystem Version:                 5

FreeBSD 11.2-STABLE #0 r335674: Tue Jun 26 13:20:24 PDT 2018 root
 2:38PM  up 55 mins, 2 users, load averages: 3.44, 3.88, 3.83

------------------------------------------------------------------------

System Memory:

        0.02%   14.62   MiB Active,     0.02%   21.34   MiB Inact
        62.00%  57.18   GiB Wired,      0.00%   0       Cache
        37.96%  35.01   GiB Free,       -0.00%  -204800 Bytes Gap

        Real Installed:                         96.00   GiB
        Real Available:                 98.57%  94.63   GiB
        Real Managed:                   97.45%  92.22   GiB

        Logical Total:                          96.00   GiB
        Logical Used:                   63.51%  60.97   GiB
        Logical Free:                   36.49%  35.03   GiB

Kernel Memory:                                  968.90  MiB
        Data:                           96.26%  932.67  MiB
        Text:                           3.74%   36.24   MiB

Kernel Memory Map:                              92.22   GiB
        Size:                           1.70%   1.57    GiB
        Free:                           98.30%  90.65   GiB

------------------------------------------------------------------------

ARC Summary: (HEALTHY)
        Memory Throttle Count:                  0

ARC Misc:
        Deleted:                                17.53m
        Recycle Misses:                         0
        Mutex Misses:                           46.41k
        Evict Skips:                            26.28k

ARC Size:                               100.02% 50.01   GiB
        Target Size: (Adaptive)         100.00% 50.00   GiB
        Min Size (Hard Limit):          22.80%  11.40   GiB
        Max Size (High Water):          4:1     50.00   GiB

ARC Size Breakdown:
        Recently Used Cache Size:       99.19%  49.60   GiB
        Frequently Used Cache Size:     0.81%   416.12  MiB

ARC Hash Breakdown:
        Elements Max:                           3.96m
        Elements Current:               95.60%  3.79m
        Collisions:                             3.22m
        Chain Max:                              5
        Chains:                                 367.37k

------------------------------------------------------------------------

ARC Efficiency:                                 63.22m
        Cache Hit Ratio:                43.75%  27.66m
        Cache Miss Ratio:               56.25%  35.56m
        Actual Hit Ratio:               39.39%  24.90m

        Data Demand Efficiency:         69.68%  26.96m
        Data Prefetch Efficiency:       0.00%   18.77m

        CACHE HITS BY CACHE LIST:
          Anonymously Used:             9.95%   2.75m
          Most Recently Used:           87.72%  24.26m
          Most Frequently Used:         2.32%   640.73k
          Most Recently Used Ghost:     0.00%   0
          Most Frequently Used Ghost:   0.01%   2.43k

        CACHE HITS BY DATA TYPE:
          Demand Data:                  67.92%  18.79m
          Prefetch Data:                0.00%   8
          Demand Metadata:              22.12%  6.12m
          Prefetch Metadata:            9.96%   2.76m

        CACHE MISSES BY DATA TYPE:
          Demand Data:                  22.99%  8.18m
          Prefetch Data:                52.77%  18.77m
          Demand Metadata:              17.07%  6.07m
          Prefetch Metadata:            7.17%   2.55m

------------------------------------------------------------------------

L2ARC is disabled

------------------------------------------------------------------------

File-Level Prefetch: (HEALTHY)

DMU Efficiency:                                 137.36k
        Hit Ratio:                      0.34%   470
        Miss Ratio:                     99.66%  136.89k

        Colinear:                               0
          Hit Ratio:                    100.00% 0
          Miss Ratio:                   100.00% 0

        Stride:                                 0
          Hit Ratio:                    100.00% 0
          Miss Ratio:                   100.00% 0

DMU Misc:
        Reclaim:                                0
          Successes:                    100.00% 0
          Failures:                     100.00% 0

        Streams:                                0
          +Resets:                      100.00% 0
          -Resets:                      100.00% 0
          Bogus:                                0

------------------------------------------------------------------------

VDEV cache is disabled

------------------------------------------------------------------------

ZFS Tunables (sysctl):
        kern.maxusers                           6392
        vm.kmem_size                            99019939840
        vm.kmem_size_scale                      1
        vm.kmem_size_min                        0
        vm.kmem_size_max                        1319413950874
        vfs.zfs.trim.max_interval               1
        vfs.zfs.trim.timeout                    30
        vfs.zfs.trim.txg_delay                  32
        vfs.zfs.trim.enabled                    1
        vfs.zfs.vol.immediate_write_sz          32768
        vfs.zfs.vol.unmap_sync_enabled          0
        vfs.zfs.vol.unmap_enabled               1
        vfs.zfs.vol.recursive                   0
        vfs.zfs.vol.mode                        1
        vfs.zfs.version.zpl                     5
        vfs.zfs.version.spa                     5000
        vfs.zfs.version.acl                     1
        vfs.zfs.version.ioctl                   7
        vfs.zfs.debug                           0
        vfs.zfs.super_owner                     0
        vfs.zfs.immediate_write_sz              32768
        vfs.zfs.sync_pass_rewrite               2
        vfs.zfs.sync_pass_dont_compress         5
        vfs.zfs.sync_pass_deferred_free         2
        vfs.zfs.zio.dva_throttle_enabled        1
        vfs.zfs.zio.exclude_metadata            0
        vfs.zfs.zio.use_uma                     1
        vfs.zfs.zil_slog_bulk                   786432
        vfs.zfs.cache_flush_disable             0
        vfs.zfs.zil_replay_disable              0
        vfs.zfs.standard_sm_blksz               131072
        vfs.zfs.dtl_sm_blksz                    4096
        vfs.zfs.min_auto_ashift                 12
        vfs.zfs.max_auto_ashift                 13
        vfs.zfs.vdev.trim_max_pending           10000
        vfs.zfs.vdev.bio_delete_disable         0
        vfs.zfs.vdev.bio_flush_disable          0
        vfs.zfs.vdev.queue_depth_pct            1000
        vfs.zfs.vdev.write_gap_limit            4096
        vfs.zfs.vdev.read_gap_limit             32768
        vfs.zfs.vdev.aggregation_limit          131072
        vfs.zfs.vdev.trim_max_active            64
        vfs.zfs.vdev.trim_min_active            1
        vfs.zfs.vdev.scrub_max_active           2
        vfs.zfs.vdev.scrub_min_active           1
        vfs.zfs.vdev.async_write_max_active     10
        vfs.zfs.vdev.async_write_min_active     1
        vfs.zfs.vdev.async_read_max_active      3
        vfs.zfs.vdev.async_read_min_active      1
        vfs.zfs.vdev.sync_write_max_active      10
        vfs.zfs.vdev.sync_write_min_active      10
        vfs.zfs.vdev.sync_read_max_active       10
        vfs.zfs.vdev.sync_read_min_active       10
        vfs.zfs.vdev.max_active                 1000
        vfs.zfs.vdev.async_write_active_max_dirty_percent  60
        vfs.zfs.vdev.async_write_active_min_dirty_percent  30
        vfs.zfs.vdev.mirror.non_rotating_seek_inc  1
        vfs.zfs.vdev.mirror.non_rotating_inc    0
        vfs.zfs.vdev.mirror.rotating_seek_offset  1048576
        vfs.zfs.vdev.mirror.rotating_seek_inc   5
        vfs.zfs.vdev.mirror.rotating_inc        0
        vfs.zfs.vdev.trim_on_init               1
        vfs.zfs.vdev.cache.bshift               16
        vfs.zfs.vdev.cache.size                 0
        vfs.zfs.vdev.cache.max                  16384
        vfs.zfs.vdev.default_ms_shift           29
        vfs.zfs.vdev.min_ms_count               16
        vfs.zfs.vdev.max_ms_count               200
        vfs.zfs.txg.timeout                     5
        vfs.zfs.spa_min_slop                    134217728
        vfs.zfs.spa_slop_shift                  5
        vfs.zfs.spa_asize_inflation             24
        vfs.zfs.deadman_enabled                 0
        vfs.zfs.deadman_checktime_ms            5000
        vfs.zfs.deadman_synctime_ms             1000000
        vfs.zfs.debug_flags                     0
        vfs.zfs.debugflags                      0
        vfs.zfs.recover                         0
        vfs.zfs.spa_load_verify_data            1
        vfs.zfs.spa_load_verify_metadata        1
        vfs.zfs.spa_load_verify_maxinflight     10000
        vfs.zfs.max_missing_tvds_scan           0
        vfs.zfs.max_missing_tvds_cachefile      2
        vfs.zfs.max_missing_tvds                0
        vfs.zfs.spa_load_print_vdev_tree        0
        vfs.zfs.ccw_retry_interval              300
        vfs.zfs.check_hostid                    1
        vfs.zfs.mg_fragmentation_threshold      85
        vfs.zfs.mg_noalloc_threshold            0
        vfs.zfs.condense_pct                    200
        vfs.zfs.metaslab_sm_blksz               4096
        vfs.zfs.metaslab.bias_enabled           1
        vfs.zfs.metaslab.lba_weighting_enabled  1
        vfs.zfs.metaslab.fragmentation_factor_enabled  1
        vfs.zfs.metaslab.preload_enabled        1
        vfs.zfs.metaslab.preload_limit          3
        vfs.zfs.metaslab.unload_delay           8
        vfs.zfs.metaslab.load_pct               50
        vfs.zfs.metaslab.min_alloc_size         33554432
        vfs.zfs.metaslab.df_free_pct            4
        vfs.zfs.metaslab.df_alloc_threshold     131072
        vfs.zfs.metaslab.debug_unload           0
        vfs.zfs.metaslab.debug_load             0
        vfs.zfs.metaslab.fragmentation_threshold  70
        vfs.zfs.metaslab.force_ganging          16777217
        vfs.zfs.free_bpobj_enabled              1
        vfs.zfs.free_max_blocks                 -1
        vfs.zfs.no_scrub_prefetch               0
        vfs.zfs.no_scrub_io                     0
        vfs.zfs.resilver_min_time_ms            3000
        vfs.zfs.free_min_time_ms                1000
        vfs.zfs.scan_min_time_ms                1000
        vfs.zfs.scan_idle                       50
        vfs.zfs.scrub_delay                     4
        vfs.zfs.resilver_delay                  2
        vfs.zfs.top_maxinflight                 32
        vfs.zfs.zfetch.array_rd_sz              1048576
        vfs.zfs.zfetch.max_idistance            67108864
        vfs.zfs.zfetch.max_distance             8388608
        vfs.zfs.zfetch.min_sec_reap             2
        vfs.zfs.zfetch.max_streams              8
        vfs.zfs.prefetch_disable                0
        vfs.zfs.delay_scale                     500000
        vfs.zfs.delay_min_dirty_percent         60
        vfs.zfs.dirty_data_sync                 67108864
        vfs.zfs.dirty_data_max_percent          10
        vfs.zfs.dirty_data_max_max              4294967296
        vfs.zfs.dirty_data_max                  4294967296
        vfs.zfs.max_recordsize                  1048576
        vfs.zfs.default_ibs                     17
        vfs.zfs.default_bs                      9
        vfs.zfs.send_holes_without_birth_time   1
        vfs.zfs.mdcomp_disable                  0
        vfs.zfs.per_txg_dirty_frees_percent     30
        vfs.zfs.nopwrite_enabled                1
        vfs.zfs.dedup.prefetch                  1
        vfs.zfs.dbuf_cache_lowater_pct          10
        vfs.zfs.dbuf_cache_hiwater_pct          10
        vfs.zfs.dbuf_cache_shift                5
        vfs.zfs.dbuf_cache_max_bytes            3060818688
        vfs.zfs.l2c_only_size                   0
        vfs.zfs.mfu_ghost_data_esize            44354878976
        vfs.zfs.mfu_ghost_metadata_esize        8191800320
        vfs.zfs.mfu_ghost_size                  52546679296
        vfs.zfs.mfu_data_esize                  0
        vfs.zfs.mfu_metadata_esize              84992
        vfs.zfs.mfu_size                        64779776
        vfs.zfs.mru_ghost_data_esize            48840192
        vfs.zfs.mru_ghost_metadata_esize        926682624
        vfs.zfs.mru_ghost_size                  975522816
        vfs.zfs.mru_data_esize                  47824171520
        vfs.zfs.mru_metadata_esize              5086352896
        vfs.zfs.mru_size                        52926374400
        vfs.zfs.anon_data_esize                 0
        vfs.zfs.anon_metadata_esize             0
        vfs.zfs.anon_size                       69500928
        vfs.zfs.l2arc_norw                      1
        vfs.zfs.l2arc_feed_again                1
        vfs.zfs.l2arc_noprefetch                1
        vfs.zfs.l2arc_feed_min_ms               200
        vfs.zfs.l2arc_feed_secs                 1
        vfs.zfs.l2arc_headroom                  2
        vfs.zfs.l2arc_write_boost               8388608
        vfs.zfs.l2arc_write_max                 8388608
        vfs.zfs.arc_meta_limit                  13421772800
        vfs.zfs.arc_free_target                 167651
        vfs.zfs.compressed_arc_enabled          1
        vfs.zfs.arc_grow_retry                  60
        vfs.zfs.arc_shrink_shift                7
        vfs.zfs.arc_average_blocksize           8192
        vfs.zfs.arc_no_grow_shift               5
        vfs.zfs.arc_min                         12243274752
        vfs.zfs.arc_max                         53687091200
        vfs.zfs.abd_chunk_size                  4096

------------------------------------------------------------------------

A few iterations of 'zpool iostat -v':

                 capacity     operations    bandwidth
pool          alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
mendeleev     5.02T  16.6T  3.58K      0   389M      0
  raidz2      5.02T  16.6T  3.58K      0   389M      0
    gpt/zfs0      -      -    783      0  45.9M      0
    gpt/zfs1      -      -  1.29K      0   100M      0
    gpt/zfs2      -      -    711      0  54.9M      0
    gpt/zfs3      -      -    729      0  54.3M      0
    gpt/zfs4      -      -  1.23K      0  99.7M      0
    gpt/zfs5      -      -    752      0  46.3M      0
------------  -----  -----  -----  -----  -----  -----

                 capacity     operations    bandwidth
pool          alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
mendeleev     5.02T  16.6T  8.71K      0   286M      0
  raidz2      5.02T  16.6T  8.71K      0   286M      0
    gpt/zfs0      -      -    715      0  35.6M      0
    gpt/zfs1      -      -  2.15K      0  82.7M      0
    gpt/zfs2      -      -    791      0  34.4M      0
    gpt/zfs3      -      -    706      0  33.6M      0
    gpt/zfs4      -      -  2.29K      0  82.3M      0
    gpt/zfs5      -      -    693      0  37.0M      0
------------  -----  -----  -----  -----  -----  -----

                 capacity     operations    bandwidth
pool          alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
mendeleev     5.02T  16.6T  6.28K      0   775M      0
  raidz2      5.02T  16.6T  6.28K      0   775M      0
    gpt/zfs0      -      -  1.37K      0   112M      0
    gpt/zfs1      -      -  1.57K      0   194M      0
    gpt/zfs2      -      -    802      0  87.0M      0
    gpt/zfs3      -      -    851      0  86.9M      0
    gpt/zfs4      -      -  2.03K      0   199M      0
    gpt/zfs5      -      -  1.09K      0   112M      0
------------  -----  -----  -----  -----  -----  -----

                 capacity     operations    bandwidth
pool          alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
mendeleev     5.02T  16.6T  9.15K      0   272M      0
  raidz2      5.02T  16.6T  9.15K      0   272M      0
    gpt/zfs0      -      -    507      0  23.1M      0
    gpt/zfs1      -      -  2.77K      0  80.7M      0
    gpt/zfs2      -      -    727      0  41.8M      0
    gpt/zfs3      -      -    707      0  41.2M      0
    gpt/zfs4      -      -  2.58K      0  76.1M      0
    gpt/zfs5      -      -    509      0  23.6M      0
------------  -----  -----  -----  -----  -----  -----
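
For reference, here is a rough sketch of what I expect to run if the 50G
cap holds up: the 53687091200 value is the same 50 GiB currently reported
as vfs.zfs.arc_max above, and the cache device name /dev/ada2 is purely
hypothetical since the SSD isn't installed yet.

# sysctl vfs.zfs.arc_max=53687091200
# echo 'vfs.zfs.arc_max="53687091200"' >> /boot/loader.conf
# zpool add mendeleev cache /dev/ada2

The first command sets the cap at runtime, the second persists it across
reboots as a loader tunable, and the third would attach the SSD as an
L2ARC cache vdev on the mendeleev pool.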