Date: Wed, 2 Dec 2015 12:45:24 +0100 From: InterNetX - Juergen Gotteswinter <jg@internetx.com> To: Zeus Panchenko <zeus@ibs.dn.ua>, FreeBSD Filesystems <freebsd-fs@freebsd.org> Subject: Re: advice needed: zpool of 10 x (raidz2 on (4+2) x 2T HDD) Message-ID: <565ED9D4.5050202@internetx.com> In-Reply-To: <20151202133428.35820@smtp.new-ukraine.org> References: <20151202133428.35820@smtp.new-ukraine.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Hi, 2 things i whould consider suspicious. probably 3 SATA Disks on SAS Controller, Dedup and probably the HBA Firmware Version Am 02.12.2015 um 12:34 schrieb Zeus Panchenko: > greetings, > > we deployed storage, and as it was filling until now, I see I need > an advice regarding the configuration and optimization/s ... > > the main cause I decided to ask for an advice is this: > > once per month (or even more frequently, depends on the load I > suggest) host hangs and only power reset helps, nothing helpful in log > files though ... just the fact of restart logged and usual ctld activity > > after reboot, `zpool import' lasts 40min and more, and during this time > no resource of the host is used much ... neither CPU nor memory ... top > and systat shows no load (I need to export pool first since I need to > attach geli first, and if I attach geli with zpool still imported, I > receive in the end a lot of "absent/damaged" disks in zpool which > disappears after export/import) > > > so, I'm wondering what can I do to trace the cause of hangs? what to monitore to > understand what to expect and how to prevent ... > > > so, please, advise > > > > ---------------------------------------------------------------------------------- > bellow the details are: > ---------------------------------------------------------------------------------- > > the box is Supermicro X9DRD-7LN4F with: > > CPU: Intel(R) Xeon(R) CPU E5-2630L (2 package(s) x 6 core(s) x 2 SMT threads) > RAM: 128Gb > STOR: 3 x LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (jbod) > 60 x HDD 2T (ATA WDC WD20EFRX-68A 0A80, Fixed Direct Access SCSI-6 device 600.000MB/s) > > OS: FreeBSD 10.1-RELEASE #0 r274401 amd64 > > to avoid OS memory shortage sysctl vfs.zfs.arc_max is set to 120275861504 > > to clients, storage is provided via iSCSI by ctld (each target is file backed) > > zpool created of 10 x raidz2, each raidz2 consists of 6 geli devices and > now looks so (yes, deduplication is on): > >> zpool list storage > NAME SIZE ALLOC FREE FRAG EXPANDSZ CAP DEDUP HEALTH ALTROOT > storage 109T 33.5T 75.2T - - 30% 1.57x ONLINE - > > >> zpool history storage > 2013-10-21.01:31:14 zpool create storage > raidz2 gpt/c0s00 gpt/c0s01 gpt/c1s00 gpt/c1s01 gpt/c2s00 gpt/c2s01 > raidz2 gpt/c0s02 gpt/c0s03 gpt/c1s02 gpt/c1s03 gpt/c2s02 gpt/c2s03 > ... > raidz2 gpt/c0s18 gpt/c0s19 gpt/c1s18 gpt/c1s19 gpt/c2s18 gpt/c2s19 > log mirror gpt/log0 gpt/log1 > cache gpt/cache0 gpt/cache1 > > >> zdb storage > Cached configuration: > version: 5000 > name: 'storage' > state: 0 > txg: 13340514 > pool_guid: 11994995707440773547 > hostid: 1519855013 > hostname: 'storage.foo.bar' > vdev_children: 11 > vdev_tree: > type: 'root' > id: 0 > guid: 11994995707440773547 > children[0]: > type: 'raidz' > id: 0 > guid: 12290021428260525074 > nparity: 2 > metaslab_array: 46 > metaslab_shift: 36 > ashift: 12 > asize: 12002364751872 > is_log: 0 > create_txg: 4 > children[0]: > type: 'disk' > id: 0 > guid: 3897093815971447961 > path: '/dev/gpt/c0s00' > phys_path: '/dev/gpt/c0s00' > whole_disk: 1 > DTL: 9133 > create_txg: 4 > children[1]: > type: 'disk' > id: 1 > guid: 1036685341766239763 > path: '/dev/gpt/c0s01' > phys_path: '/dev/gpt/c0s01' > whole_disk: 1 > DTL: 9132 > create_txg: 4 > ... > > > each geli is created on one HDD >> geli list da50.eli > Geom name: da50.eli > State: ACTIVE > EncryptionAlgorithm: AES-XTS > KeyLength: 256 > Crypto: hardware > Version: 6 > UsedKey: 0 > Flags: (null) > KeysAllocated: 466 > KeysTotal: 466 > Providers: > 1. Name: da50.eli > Mediasize: 2000398929920 (1.8T) > Sectorsize: 4096 > Mode: r1w1e3 > Consumers: > 1. Name: da50 > Mediasize: 2000398934016 (1.8T) > Sectorsize: 512 > Stripesize: 4096 > Stripeoffset: 0 > Mode: r1w1e1 > > > > each raidz2 disk configured as: >> gpart show da50.eli > => 6 488378634 da50.eli GPT (1.8T) > 6 488378634 1 freebsd-zfs (1.8T) > > >> zfs-stats -a > -------------------------------------------------------------------------- > ZFS Subsystem Report Wed Dec 2 09:59:27 2015 > -------------------------------------------------------------------------- > System Information: > > Kernel Version: 1001000 (osreldate) > Hardware Platform: amd64 > Processor Architecture: amd64 > > FreeBSD 10.1-RELEASE #0 r274401: Tue Nov 11 21:02:49 UTC 2014 root > 9:59AM up 1 day, 46 mins, 10 users, load averages: 1.03, 0.46, 0.75 > -------------------------------------------------------------------------- > System Memory Statistics: > Physical Memory: 131012.88M > Kernel Memory: 1915.37M > DATA: 98.62% 1888.90M > TEXT: 1.38% 26.47M > -------------------------------------------------------------------------- > ZFS pool information: > Storage pool Version (spa): 5000 > Filesystem Version (zpl): 5 > -------------------------------------------------------------------------- > ARC Misc: > Deleted: 1961248 > Recycle Misses: 127014 > Mutex Misses: 5973 > Evict Skips: 5973 > > ARC Size: > Current Size (arcsize): 100.00% 114703.88M > Target Size (Adaptive, c): 100.00% 114704.00M > Min Size (Hard Limit, c_min): 12.50% 14338.00M > Max Size (High Water, c_max): ~8:1 114704.00M > > ARC Size Breakdown: > Recently Used Cache Size (p): 93.75% 107535.69M > Freq. Used Cache Size (c-p): 6.25% 7168.31M > > ARC Hash Breakdown: > Elements Max: 6746532 > Elements Current: 100.00% 6746313 > Collisions: 9651654 > Chain Max: 0 > Chains: 1050203 > > ARC Eviction Statistics: > Evicts Total: 194298918912 > Evicts Eligible for L2: 81.00% 157373345280 > Evicts Ineligible for L2: 19.00% 36925573632 > Evicts Cached to L2: 97939090944 > > ARC Efficiency > Cache Access Total: 109810376 > Cache Hit Ratio: 91.57% 100555148 > Cache Miss Ratio: 8.43% 9255228 > Actual Hit Ratio: 90.54% 99423922 > > Data Demand Efficiency: 76.64% > Data Prefetch Efficiency: 48.46% > > CACHE HITS BY CACHE LIST: > Anonymously Used: 0.88% 881966 > Most Recently Used (mru): 23.11% 23236902 > Most Frequently Used (mfu): 75.77% 76187020 > MRU Ghost (mru_ghost): 0.03% 26449 > MFU Ghost (mfu_ghost): 0.22% 222811 > > CACHE HITS BY DATA TYPE: > Demand Data: 10.17% 10227867 > Prefetch Data: 0.45% 455126 > Demand Metadata: 88.69% 89184329 > Prefetch Metadata: 0.68% 687826 > > CACHE MISSES BY DATA TYPE: > Demand Data: 33.69% 3117808 > Prefetch Data: 5.23% 484140 > Demand Metadata: 56.55% 5233984 > Prefetch Metadata: 4.53% 419296 > -------------------------------------------------------------------------- > L2 ARC Summary: > Low Memory Aborts: 77 > R/W Clashes: 13 > Free on Write: 523 > > L2 ARC Size: > Current Size: (Adaptive) 91988.13M > Header Size: 0.13% 120.08M > > L2 ARC Read/Write Activity: > Bytes Written: 97783.99M > Bytes Read: 2464.81M > > L2 ARC Breakdown: > Access Total: 8110124 > Hit Ratio: 2.89% 234616 > Miss Ratio: 97.11% 7875508 > Feeds: 85129 > > WRITES: > Sent Total: 100.00% 18448 > -------------------------------------------------------------------------- > VDEV Cache Summary: > Access Total: 0 > Hits Ratio: 0.00% 0 > Miss Ratio: 0.00% 0 > Delegations: 0 > -------------------------------------------------------------------------- > File-Level Prefetch Stats (DMU): > > DMU Efficiency: > Access Total: 162279162 > Hit Ratio: 91.69% 148788486 > Miss Ratio: 8.31% 13490676 > > Colinear Access Total: 13490676 > Colinear Hit Ratio: 0.06% 8166 > Colinear Miss Ratio: 99.94% 13482510 > > Stride Access Total: 146863482 > Stride Hit Ratio: 99.31% 145846806 > Stride Miss Ratio: 0.69% 1016676 > > DMU misc: > Reclaim successes: 124372 > Reclaim failures: 13358138 > Stream resets: 618 > Stream noresets: 2938602 > Bogus streams: 0 > -------------------------------------------------------------------------- > ZFS Tunable (sysctl): > kern.maxusers=8524 > vfs.zfs.arc_max=120275861504 > vfs.zfs.arc_min=15034482688 > vfs.zfs.arc_average_blocksize=8192 > vfs.zfs.arc_meta_used=24838283936 > vfs.zfs.arc_meta_limit=30068965376 > vfs.zfs.l2arc_write_max=8388608 > vfs.zfs.l2arc_write_boost=8388608 > vfs.zfs.l2arc_headroom=2 > vfs.zfs.l2arc_feed_secs=1 > vfs.zfs.l2arc_feed_min_ms=200 > vfs.zfs.l2arc_noprefetch=1 > vfs.zfs.l2arc_feed_again=1 > vfs.zfs.l2arc_norw=1 > vfs.zfs.anon_size=27974656 > vfs.zfs.anon_metadata_lsize=0 > vfs.zfs.anon_data_lsize=0 > vfs.zfs.mru_size=112732930560 > vfs.zfs.mru_metadata_lsize=18147921408 > vfs.zfs.mru_data_lsize=92690379776 > vfs.zfs.mru_ghost_size=7542758400 > vfs.zfs.mru_ghost_metadata_lsize=1262705664 > vfs.zfs.mru_ghost_data_lsize=6280052736 > vfs.zfs.mfu_size=3748620800 > vfs.zfs.mfu_metadata_lsize=1014886912 > vfs.zfs.mfu_data_lsize=2723481600 > vfs.zfs.mfu_ghost_size=24582345728 > vfs.zfs.mfu_ghost_metadata_lsize=682512384 > vfs.zfs.mfu_ghost_data_lsize=23899833344 > vfs.zfs.l2c_only_size=66548531200 > vfs.zfs.dedup.prefetch=1 > vfs.zfs.nopwrite_enabled=1 > vfs.zfs.mdcomp_disable=0 > vfs.zfs.dirty_data_max=4294967296 > vfs.zfs.dirty_data_max_max=4294967296 > vfs.zfs.dirty_data_max_percent=10 > vfs.zfs.dirty_data_sync=67108864 > vfs.zfs.delay_min_dirty_percent=60 > vfs.zfs.delay_scale=500000 > vfs.zfs.prefetch_disable=0 > vfs.zfs.zfetch.max_streams=8 > vfs.zfs.zfetch.min_sec_reap=2 > vfs.zfs.zfetch.block_cap=256 > vfs.zfs.zfetch.array_rd_sz=1048576 > vfs.zfs.top_maxinflight=32 > vfs.zfs.resilver_delay=2 > vfs.zfs.scrub_delay=4 > vfs.zfs.scan_idle=50 > vfs.zfs.scan_min_time_ms=1000 > vfs.zfs.free_min_time_ms=1000 > vfs.zfs.resilver_min_time_ms=3000 > vfs.zfs.no_scrub_io=0 > vfs.zfs.no_scrub_prefetch=0 > vfs.zfs.metaslab.gang_bang=131073 > vfs.zfs.metaslab.fragmentation_threshold=70 > vfs.zfs.metaslab.debug_load=0 > vfs.zfs.metaslab.debug_unload=0 > vfs.zfs.metaslab.df_alloc_threshold=131072 > vfs.zfs.metaslab.df_free_pct=4 > vfs.zfs.metaslab.min_alloc_size=10485760 > vfs.zfs.metaslab.load_pct=50 > vfs.zfs.metaslab.unload_delay=8 > vfs.zfs.metaslab.preload_limit=3 > vfs.zfs.metaslab.preload_enabled=1 > vfs.zfs.metaslab.fragmentation_factor_enabled=1 > vfs.zfs.metaslab.lba_weighting_enabled=1 > vfs.zfs.metaslab.bias_enabled=1 > vfs.zfs.condense_pct=200 > vfs.zfs.mg_noalloc_threshold=0 > vfs.zfs.mg_fragmentation_threshold=85 > vfs.zfs.check_hostid=1 > vfs.zfs.spa_load_verify_maxinflight=10000 > vfs.zfs.spa_load_verify_metadata=1 > vfs.zfs.spa_load_verify_data=1 > vfs.zfs.recover=0 > vfs.zfs.deadman_synctime_ms=1000000 > vfs.zfs.deadman_checktime_ms=5000 > vfs.zfs.deadman_enabled=1 > vfs.zfs.spa_asize_inflation=24 > vfs.zfs.txg.timeout=5 > vfs.zfs.vdev.cache.max=16384 > vfs.zfs.vdev.cache.size=0 > vfs.zfs.vdev.cache.bshift=16 > vfs.zfs.vdev.trim_on_init=1 > vfs.zfs.vdev.mirror.rotating_inc=0 > vfs.zfs.vdev.mirror.rotating_seek_inc=5 > vfs.zfs.vdev.mirror.rotating_seek_offset=1048576 > vfs.zfs.vdev.mirror.non_rotating_inc=0 > vfs.zfs.vdev.mirror.non_rotating_seek_inc=1 > vfs.zfs.vdev.max_active=1000 > vfs.zfs.vdev.sync_read_min_active=10 > vfs.zfs.vdev.sync_read_max_active=10 > vfs.zfs.vdev.sync_write_min_active=10 > vfs.zfs.vdev.sync_write_max_active=10 > vfs.zfs.vdev.async_read_min_active=1 > vfs.zfs.vdev.async_read_max_active=3 > vfs.zfs.vdev.async_write_min_active=1 > vfs.zfs.vdev.async_write_max_active=10 > vfs.zfs.vdev.scrub_min_active=1 > vfs.zfs.vdev.scrub_max_active=2 > vfs.zfs.vdev.trim_min_active=1 > vfs.zfs.vdev.trim_max_active=64 > vfs.zfs.vdev.aggregation_limit=131072 > vfs.zfs.vdev.read_gap_limit=32768 > vfs.zfs.vdev.write_gap_limit=4096 > vfs.zfs.vdev.bio_flush_disable=0 > vfs.zfs.vdev.bio_delete_disable=0 > vfs.zfs.vdev.trim_max_bytes=2147483648 > vfs.zfs.vdev.trim_max_pending=64 > vfs.zfs.max_auto_ashift=13 > vfs.zfs.min_auto_ashift=9 > vfs.zfs.zil_replay_disable=0 > vfs.zfs.cache_flush_disable=0 > vfs.zfs.zio.use_uma=1 > vfs.zfs.zio.exclude_metadata=0 > vfs.zfs.sync_pass_deferred_free=2 > vfs.zfs.sync_pass_dont_compress=5 > vfs.zfs.sync_pass_rewrite=2 > vfs.zfs.snapshot_list_prefetch=0 > vfs.zfs.super_owner=0 > vfs.zfs.debug=0 > vfs.zfs.version.ioctl=4 > vfs.zfs.version.acl=1 > vfs.zfs.version.spa=5000 > vfs.zfs.version.zpl=5 > vfs.zfs.vol.mode=1 > vfs.zfs.trim.enabled=1 > vfs.zfs.trim.txg_delay=32 > vfs.zfs.trim.timeout=30 > vfs.zfs.trim.max_interval=1 > vm.kmem_size=133823901696 > vm.kmem_size_scale=1 > vm.kmem_size_min=0 > vm.kmem_size_max=1319413950874 > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?565ED9D4.5050202>