Date: Wed, 02 Dec 2015 13:34:28 +0200
From: "Zeus Panchenko" <zeus@ibs.dn.ua>
To: "FreeBSD Filesystems" <freebsd-fs@freebsd.org>
Subject: advice needed: zpool of 10 x (raidz2 on (4+2) x 2T HDD)
Message-ID: <20151202133428.35820@smtp.new-ukraine.org>
greetings,

we deployed a storage server and, as it has been filling up, I see I need
advice regarding its configuration and possible optimizations.

The main reason I am asking is this: about once per month (or even more
frequently, depending on the load, I suspect) the host hangs and only a
power reset helps. There is nothing helpful in the log files, just the
fact of the restart and the usual ctld activity after reboot.

After such a reboot, `zpool import' takes 40 minutes or more, and during
that time no host resource is heavily used: neither CPU nor memory; top
and systat show no load. (I have to export the pool first, since I need
to attach the geli providers first; if I attach geli with the pool still
imported, I end up with a lot of "absent/damaged" disks in the pool,
which disappear after an export/import.)

So I am wondering: what can I do to trace the cause of the hangs? What
should I monitor to understand what to expect and how to prevent it?

So, please, advise.

- ----------------------------------------------------------------------
The details are below:
- ----------------------------------------------------------------------

The box is a Supermicro X9DRD-7LN4F with:

  CPU:  Intel(R) Xeon(R) CPU E5-2630L (2 packages x 6 cores x 2 SMT threads)
  RAM:  128 GB
  STOR: 3 x LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (JBOD)
        60 x 2T HDD (ATA WDC WD20EFRX-68A 0A80, Fixed Direct Access SCSI-6 device, 600.000MB/s)
  OS:   FreeBSD 10.1-RELEASE #0 r274401 amd64

To avoid an OS memory shortage, sysctl vfs.zfs.arc_max is set to
120275861504.

Storage is provided to clients via iSCSI by ctld (each target is file
backed).

The zpool is built of 10 x raidz2; each raidz2 consists of 6 geli
devices. It now looks like this (yes, deduplication is on):

> zpool list storage
NAME      SIZE  ALLOC   FREE  FRAG  EXPANDSZ   CAP  DEDUP  HEALTH  ALTROOT
storage   109T  33.5T  75.2T     -         -    30%  1.57x  ONLINE  -

> zpool history storage
2013-10-21.01:31:14 zpool create storage
  raidz2 gpt/c0s00
         gpt/c0s01 gpt/c1s00 gpt/c1s01 gpt/c2s00 gpt/c2s01
  raidz2 gpt/c0s02 gpt/c0s03 gpt/c1s02 gpt/c1s03 gpt/c2s02 gpt/c2s03
  ...
  raidz2 gpt/c0s18 gpt/c0s19 gpt/c1s18 gpt/c1s19 gpt/c2s18 gpt/c2s19
  log mirror gpt/log0 gpt/log1
  cache gpt/cache0 gpt/cache1

> zdb storage
Cached configuration:
        version: 5000
        name: 'storage'
        state: 0
        txg: 13340514
        pool_guid: 11994995707440773547
        hostid: 1519855013
        hostname: 'storage.foo.bar'
        vdev_children: 11
        vdev_tree:
            type: 'root'
            id: 0
            guid: 11994995707440773547
            children[0]:
                type: 'raidz'
                id: 0
                guid: 12290021428260525074
                nparity: 2
                metaslab_array: 46
                metaslab_shift: 36
                ashift: 12
                asize: 12002364751872
                is_log: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 3897093815971447961
                    path: '/dev/gpt/c0s00'
                    phys_path: '/dev/gpt/c0s00'
                    whole_disk: 1
                    DTL: 9133
                    create_txg: 4
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 1036685341766239763
                    path: '/dev/gpt/c0s01'
                    phys_path: '/dev/gpt/c0s01'
                    whole_disk: 1
                    DTL: 9132
                    create_txg: 4
        ...

Each geli is created on one HDD:

> geli list da50.eli
Geom name: da50.eli
State: ACTIVE
EncryptionAlgorithm: AES-XTS
KeyLength: 256
Crypto: hardware
Version: 6
UsedKey: 0
Flags: (null)
KeysAllocated: 466
KeysTotal: 466
Providers:
1. Name: da50.eli
   Mediasize: 2000398929920 (1.8T)
   Sectorsize: 4096
   Mode: r1w1e3
Consumers:
1.
   Name: da50
   Mediasize: 2000398934016 (1.8T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e1

Each raidz2 disk is configured as:

> gpart show da50.eli
=>        6  488378634  da50.eli  GPT  (1.8T)
          6  488378634         1  freebsd-zfs  (1.8T)

> zfs-stats -a
------------------------------------------------------------------------
ZFS Subsystem Report                            Wed Dec  2 09:59:27 2015
------------------------------------------------------------------------
System Information:
        Kernel Version:                         1001000 (osreldate)
        Hardware Platform:                      amd64
        Processor Architecture:                 amd64
FreeBSD 10.1-RELEASE #0 r274401: Tue Nov 11 21:02:49 UTC 2014 root
 9:59AM  up 1 day, 46 mins, 10 users, load averages: 1.03, 0.46, 0.75
------------------------------------------------------------------------
System Memory Statistics:
        Physical Memory:                        131012.88M
        Kernel Memory:                          1915.37M
        DATA:                           98.62%  1888.90M
        TEXT:                           1.38%   26.47M
------------------------------------------------------------------------
ZFS pool information:
        Storage pool Version (spa):             5000
        Filesystem Version (zpl):               5
------------------------------------------------------------------------
ARC Misc:
        Deleted:                                1961248
        Recycle Misses:                         127014
        Mutex Misses:                           5973
        Evict Skips:                            5973
ARC Size:
        Current Size (arcsize):         100.00% 114703.88M
        Target Size (Adaptive, c):      100.00% 114704.00M
        Min Size (Hard Limit, c_min):   12.50%  14338.00M
        Max Size (High Water, c_max):   ~8:1    114704.00M
ARC Size Breakdown:
        Recently Used Cache Size (p):   93.75%  107535.69M
        Freq.
        Used Cache Size (c-p):          6.25%   7168.31M
ARC Hash Breakdown:
        Elements Max:                           6746532
        Elements Current:               100.00% 6746313
        Collisions:                             9651654
        Chain Max:                              0
        Chains:                                 1050203
ARC Eviction Statistics:
        Evicts Total:                           194298918912
        Evicts Eligible for L2:         81.00%  157373345280
        Evicts Ineligible for L2:       19.00%  36925573632
        Evicts Cached to L2:                    97939090944
ARC Efficiency:
        Cache Access Total:                     109810376
        Cache Hit Ratio:                91.57%  100555148
        Cache Miss Ratio:               8.43%   9255228
        Actual Hit Ratio:               90.54%  99423922
        Data Demand Efficiency:                 76.64%
        Data Prefetch Efficiency:               48.46%

        CACHE HITS BY CACHE LIST:
          Anonymously Used:             0.88%   881966
          Most Recently Used (mru):     23.11%  23236902
          Most Frequently Used (mfu):   75.77%  76187020
          MRU Ghost (mru_ghost):        0.03%   26449
          MFU Ghost (mfu_ghost):        0.22%   222811
        CACHE HITS BY DATA TYPE:
          Demand Data:                  10.17%  10227867
          Prefetch Data:                0.45%   455126
          Demand Metadata:              88.69%  89184329
          Prefetch Metadata:            0.68%   687826
        CACHE MISSES BY DATA TYPE:
          Demand Data:                  33.69%  3117808
          Prefetch Data:                5.23%   484140
          Demand Metadata:              56.55%  5233984
          Prefetch Metadata:            4.53%   419296
------------------------------------------------------------------------
L2 ARC Summary:
        Low Memory Aborts:                      77
        R/W Clashes:                            13
        Free on Write:                          523
L2 ARC Size:
        Current Size: (Adaptive)                91988.13M
        Header Size:                    0.13%   120.08M
L2 ARC Read/Write Activity:
        Bytes Written:                          97783.99M
        Bytes Read:                             2464.81M
L2 ARC Breakdown:
        Access Total:                           8110124
        Hit Ratio:                      2.89%   234616
        Miss Ratio:                     97.11%  7875508
        Feeds:                                  85129
WRITES:
        Sent Total:                     100.00% 18448
------------------------------------------------------------------------
VDEV Cache Summary:
        Access Total:                           0
        Hits Ratio:                     0.00%   0
        Miss Ratio:                     0.00%   0
        Delegations:                            0
------------------------------------------------------------------------
File-Level Prefetch Stats (DMU):
DMU Efficiency:
        Access Total:                           162279162
        Hit Ratio:                      91.69%  148788486
        Miss Ratio:                     8.31%   13490676
        Colinear Access Total:                  13490676
        Colinear Hit Ratio:             0.06%   8166
        Colinear Miss Ratio:            99.94%  13482510
        Stride Access Total:
                                                146863482
        Stride Hit Ratio:               99.31%  145846806
        Stride Miss Ratio:              0.69%   1016676
DMU misc:
        Reclaim successes:                      124372
        Reclaim failures:                       13358138
        Stream resets:                          618
        Stream noresets:                        2938602
        Bogus streams:                          0
------------------------------------------------------------------------
ZFS Tunable (sysctl):
        kern.maxusers=8524
        vfs.zfs.arc_max=120275861504
        vfs.zfs.arc_min=15034482688
        vfs.zfs.arc_average_blocksize=8192
        vfs.zfs.arc_meta_used=24838283936
        vfs.zfs.arc_meta_limit=30068965376
        vfs.zfs.l2arc_write_max=8388608
        vfs.zfs.l2arc_write_boost=8388608
        vfs.zfs.l2arc_headroom=2
        vfs.zfs.l2arc_feed_secs=1
        vfs.zfs.l2arc_feed_min_ms=200
        vfs.zfs.l2arc_noprefetch=1
        vfs.zfs.l2arc_feed_again=1
        vfs.zfs.l2arc_norw=1
        vfs.zfs.anon_size=27974656
        vfs.zfs.anon_metadata_lsize=0
        vfs.zfs.anon_data_lsize=0
        vfs.zfs.mru_size=112732930560
        vfs.zfs.mru_metadata_lsize=18147921408
        vfs.zfs.mru_data_lsize=92690379776
        vfs.zfs.mru_ghost_size=7542758400
        vfs.zfs.mru_ghost_metadata_lsize=1262705664
        vfs.zfs.mru_ghost_data_lsize=6280052736
        vfs.zfs.mfu_size=3748620800
        vfs.zfs.mfu_metadata_lsize=1014886912
        vfs.zfs.mfu_data_lsize=2723481600
        vfs.zfs.mfu_ghost_size=24582345728
        vfs.zfs.mfu_ghost_metadata_lsize=682512384
        vfs.zfs.mfu_ghost_data_lsize=23899833344
        vfs.zfs.l2c_only_size=66548531200
        vfs.zfs.dedup.prefetch=1
        vfs.zfs.nopwrite_enabled=1
        vfs.zfs.mdcomp_disable=0
        vfs.zfs.dirty_data_max=4294967296
        vfs.zfs.dirty_data_max_max=4294967296
        vfs.zfs.dirty_data_max_percent=10
        vfs.zfs.dirty_data_sync=67108864
        vfs.zfs.delay_min_dirty_percent=60
        vfs.zfs.delay_scale=500000
        vfs.zfs.prefetch_disable=0
        vfs.zfs.zfetch.max_streams=8
        vfs.zfs.zfetch.min_sec_reap=2
        vfs.zfs.zfetch.block_cap=256
        vfs.zfs.zfetch.array_rd_sz=1048576
        vfs.zfs.top_maxinflight=32
        vfs.zfs.resilver_delay=2
        vfs.zfs.scrub_delay=4
        vfs.zfs.scan_idle=50
        vfs.zfs.scan_min_time_ms=1000
        vfs.zfs.free_min_time_ms=1000
        vfs.zfs.resilver_min_time_ms=3000
        vfs.zfs.no_scrub_io=0
        vfs.zfs.no_scrub_prefetch=0
        vfs.zfs.metaslab.gang_bang=131073
        vfs.zfs.metaslab.fragmentation_threshold=70
        vfs.zfs.metaslab.debug_load=0
        vfs.zfs.metaslab.debug_unload=0
        vfs.zfs.metaslab.df_alloc_threshold=131072
        vfs.zfs.metaslab.df_free_pct=4
        vfs.zfs.metaslab.min_alloc_size=10485760
        vfs.zfs.metaslab.load_pct=50
        vfs.zfs.metaslab.unload_delay=8
        vfs.zfs.metaslab.preload_limit=3
        vfs.zfs.metaslab.preload_enabled=1
        vfs.zfs.metaslab.fragmentation_factor_enabled=1
        vfs.zfs.metaslab.lba_weighting_enabled=1
        vfs.zfs.metaslab.bias_enabled=1
        vfs.zfs.condense_pct=200
        vfs.zfs.mg_noalloc_threshold=0
        vfs.zfs.mg_fragmentation_threshold=85
        vfs.zfs.check_hostid=1
        vfs.zfs.spa_load_verify_maxinflight=10000
        vfs.zfs.spa_load_verify_metadata=1
        vfs.zfs.spa_load_verify_data=1
        vfs.zfs.recover=0
        vfs.zfs.deadman_synctime_ms=1000000
        vfs.zfs.deadman_checktime_ms=5000
        vfs.zfs.deadman_enabled=1
        vfs.zfs.spa_asize_inflation=24
        vfs.zfs.txg.timeout=5
        vfs.zfs.vdev.cache.max=16384
        vfs.zfs.vdev.cache.size=0
        vfs.zfs.vdev.cache.bshift=16
        vfs.zfs.vdev.trim_on_init=1
        vfs.zfs.vdev.mirror.rotating_inc=0
        vfs.zfs.vdev.mirror.rotating_seek_inc=5
        vfs.zfs.vdev.mirror.rotating_seek_offset=1048576
        vfs.zfs.vdev.mirror.non_rotating_inc=0
        vfs.zfs.vdev.mirror.non_rotating_seek_inc=1
        vfs.zfs.vdev.max_active=1000
        vfs.zfs.vdev.sync_read_min_active=10
        vfs.zfs.vdev.sync_read_max_active=10
        vfs.zfs.vdev.sync_write_min_active=10
        vfs.zfs.vdev.sync_write_max_active=10
        vfs.zfs.vdev.async_read_min_active=1
        vfs.zfs.vdev.async_read_max_active=3
        vfs.zfs.vdev.async_write_min_active=1
        vfs.zfs.vdev.async_write_max_active=10
        vfs.zfs.vdev.scrub_min_active=1
        vfs.zfs.vdev.scrub_max_active=2
        vfs.zfs.vdev.trim_min_active=1
        vfs.zfs.vdev.trim_max_active=64
        vfs.zfs.vdev.aggregation_limit=131072
        vfs.zfs.vdev.read_gap_limit=32768
        vfs.zfs.vdev.write_gap_limit=4096
        vfs.zfs.vdev.bio_flush_disable=0
        vfs.zfs.vdev.bio_delete_disable=0
        vfs.zfs.vdev.trim_max_bytes=2147483648
        vfs.zfs.vdev.trim_max_pending=64
        vfs.zfs.max_auto_ashift=13
        vfs.zfs.min_auto_ashift=9
        vfs.zfs.zil_replay_disable=0
        vfs.zfs.cache_flush_disable=0
        vfs.zfs.zio.use_uma=1
        vfs.zfs.zio.exclude_metadata=0
        vfs.zfs.sync_pass_deferred_free=2
        vfs.zfs.sync_pass_dont_compress=5
        vfs.zfs.sync_pass_rewrite=2
        vfs.zfs.snapshot_list_prefetch=0
        vfs.zfs.super_owner=0
        vfs.zfs.debug=0
        vfs.zfs.version.ioctl=4
        vfs.zfs.version.acl=1
        vfs.zfs.version.spa=5000
        vfs.zfs.version.zpl=5
        vfs.zfs.vol.mode=1
        vfs.zfs.trim.enabled=1
        vfs.zfs.trim.txg_delay=32
        vfs.zfs.trim.timeout=30
        vfs.zfs.trim.max_interval=1
        vm.kmem_size=133823901696
        vm.kmem_size_scale=1
        vm.kmem_size_min=0
        vm.kmem_size_max=1319413950874

-- 
Zeus V. Panchenko                              jid:zeus@im.ibs.dn.ua
IT Dpt., I.B.S. LLC                                       GMT+2 (EET)
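[Editor's note: the export -> geli attach -> import ordering described in the
post can be sketched as a small shell script. This is only a sketch of the
workflow the author describes, not his actual setup; the key file path, the
`-p` (keyfile-only, no passphrase) mode, and the da0..da59 device naming are
assumptions.]

```shell
#!/bin/sh
# Sketch: attach all geli providers BEFORE importing the pool, since
# attaching geli while the pool is imported leaves "absent/damaged" disks
# in the pool until the next export/import cycle.

POOL=storage
KEYFILE=/root/geli.key    # hypothetical key location

# 1. Make sure the pool is not imported while the providers appear
#    (no-op if it is already exported or was never imported).
zpool export "${POOL}" 2>/dev/null || true

# 2. Attach every geli provider backing the pool (da0 .. da59 assumed).
for disk in /dev/da[0-9] /dev/da[1-5][0-9]; do
    geli attach -p -k "${KEYFILE}" "${disk}" || echo "attach failed: ${disk}"
done

# 3. Import only once all *.eli providers exist.
zpool import "${POOL}"
```

If available, `zpool import -N` (import without mounting datasets) can help
separate how much of the 40 minutes goes into pool discovery and metadata
verification versus mounting.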
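[Editor's note: since deduplication is enabled (DEDUP 1.57x in the `zpool
list` output) and a slow import with little CPU or memory load is consistent
with heavy metadata I/O, one concrete thing to check is how large the dedup
table (DDT) has grown relative to RAM. A sketch, using the pool name from
the post:]

```shell
# Print DDT statistics, including the histogram of entries and the
# reported on-disk / in-core size per entry, for the imported pool.
zdb -DD storage

# Multiply the total number of unique DDT entries by the reported in-core
# size per entry (often on the order of a few hundred bytes) to estimate
# how much of the 128 GB of RAM, ARC, and L2ARC the DDT competes for.
```

If the DDT no longer fits in the ARC, every write and free turns into random
metadata reads across all 60 disks, which would also stretch out `zpool
import`.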