Date: Wed, 02 Dec 2015 13:34:28 +0200
From: "Zeus Panchenko" <zeus@ibs.dn.ua>
To: "FreeBSD Filesystems" <freebsd-fs@freebsd.org>
Subject: advice needed: zpool of 10 x (raidz2 on (4+2) x 2T HDD)
Message-ID: <20151202133428.35820@smtp.new-ukraine.org>
greetings,
we deployed this storage box and, as it has been filling up, I see I need
advice regarding its configuration and possible optimizations ...
the main reason I decided to ask for advice is this:
roughly once per month (or even more frequently, depending on the load, I
suppose) the host hangs and only a power reset helps; nothing helpful in the
log files though ... just the fact of the restart and the usual ctld activity.
after reboot, `zpool import' takes 40 minutes or more, and during this time
no resource of the host is used much ... neither CPU nor memory ... top
and systat show no load (I need to export the pool first since I must
attach the geli providers first; if I attach geli with the zpool still
imported, I end up with a lot of "absent/damaged" disks in the zpool,
which disappear after an export/import)
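the attach/import ordering described above can be sketched like this (the key
file locations are hypothetical and passphrase handling is omitted; adjust to
the real setup):

```shell
#!/bin/sh
# Sketch of the ordering above: bring up every geli provider first,
# then import the pool.  Key file paths are hypothetical.
attach_and_import() {
    for disk in /dev/da*; do
        # skip providers that are already attached
        [ -e "${disk}.eli" ] && continue
        geli attach -k "/root/keys/$(basename "$disk").key" "$disk" || return 1
    done
    zpool import storage
}
```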
so, I am wondering: what can I do to trace the cause of these hangs? what
should I monitor to understand what to expect and how to prevent it ...
so, please, advise
- ------------------------------------------------------------------------------
below are the details:
- ------------------------------------------------------------------------------
the box is Supermicro X9DRD-7LN4F with:
CPU: Intel(R) Xeon(R) CPU E5-2630L (2 package(s) x 6 core(s) x 2 SMT threads)
RAM: 128GB
STOR: 3 x LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (JBOD)
      60 x HDD 2T (ATA WDC WD20EFRX-68A 0A80, Fixed Direct Access SCSI-6 device, 600.000MB/s)
OS: FreeBSD 10.1-RELEASE #0 r274401 amd64
to avoid an OS memory shortage, the vfs.zfs.arc_max tunable is set to 120275861504
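(on FreeBSD 10.x this is a boot-time tunable, so it is set in
/boot/loader.conf rather than at runtime; the fragment looks like this:)

```
vfs.zfs.arc_max="120275861504"
```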
to clients, storage is provided via iSCSI by ctld (each target is file backed)
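each file-backed target is defined in /etc/ctl.conf along these lines (the
target name, image path and size here are made up for illustration):

```
target iqn.2013-10.bar.foo:target0 {
	portal-group default
	lun 0 {
		path /storage/targets/target0.img
		size 1T
	}
}
```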
the zpool is built from 10 x raidz2 vdevs, each raidz2 consisting of 6 geli
devices, and now looks like this (yes, deduplication is on):
> zpool list storage
NAME      SIZE  ALLOC   FREE  FRAG  EXPANDSZ  CAP  DEDUP  HEALTH  ALTROOT
storage   109T  33.5T  75.2T     -         -  30%  1.57x  ONLINE  -
> zpool history storage
2013-10-21.01:31:14 zpool create storage
raidz2 gpt/c0s00 gpt/c0s01 gpt/c1s00 gpt/c1s01 gpt/c2s00 gpt/c2s01
raidz2 gpt/c0s02 gpt/c0s03 gpt/c1s02 gpt/c1s03 gpt/c2s02 gpt/c2s03
...
raidz2 gpt/c0s18 gpt/c0s19 gpt/c1s18 gpt/c1s19 gpt/c2s18 gpt/c2s19
log mirror gpt/log0 gpt/log1
cache gpt/cache0 gpt/cache1
> zdb storage
Cached configuration:
version: 5000
name: 'storage'
state: 0
txg: 13340514
pool_guid: 11994995707440773547
hostid: 1519855013
hostname: 'storage.foo.bar'
vdev_children: 11
vdev_tree:
type: 'root'
id: 0
guid: 11994995707440773547
children[0]:
type: 'raidz'
id: 0
guid: 12290021428260525074
nparity: 2
metaslab_array: 46
metaslab_shift: 36
ashift: 12
asize: 12002364751872
is_log: 0
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 3897093815971447961
path: '/dev/gpt/c0s00'
phys_path: '/dev/gpt/c0s00'
whole_disk: 1
DTL: 9133
create_txg: 4
children[1]:
type: 'disk'
id: 1
guid: 1036685341766239763
path: '/dev/gpt/c0s01'
phys_path: '/dev/gpt/c0s01'
whole_disk: 1
DTL: 9132
create_txg: 4
...
each geli provider is created on a single HDD:
> geli list da50.eli
Geom name: da50.eli
State: ACTIVE
EncryptionAlgorithm: AES-XTS
KeyLength: 256
Crypto: hardware
Version: 6
UsedKey: 0
Flags: (null)
KeysAllocated: 466
KeysTotal: 466
Providers:
1. Name: da50.eli
Mediasize: 2000398929920 (1.8T)
Sectorsize: 4096
Mode: r1w1e3
Consumers:
1. Name: da50
Mediasize: 2000398934016 (1.8T)
Sectorsize: 512
Stripesize: 4096
Stripeoffset: 0
Mode: r1w1e1
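each provider was initialized with parameters matching the output above
(AES-XTS, 256-bit key, 4K sectors); roughly like this, assuming a per-disk
key file (the path is hypothetical):

```shell
#!/bin/sh
# Sketch of initializing one provider, matching the geli list output
# above (AES-XTS, 256-bit key, 4K sectors).  The key file path is
# hypothetical; wrapped in a function so nothing runs by accident.
init_provider() {
    disk="$1"    # e.g. da50
    geli init -e AES-XTS -l 256 -s 4096 \
        -K "/root/keys/${disk}.key" "/dev/${disk}" &&
    geli attach -k "/root/keys/${disk}.key" "/dev/${disk}"
}
```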
each raidz2 member disk is partitioned as:
> gpart show da50.eli
=>       6  488378634  da50.eli  GPT  (1.8T)
         6  488378634         1  freebsd-zfs  (1.8T)
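the partitioning itself was done along these lines (GPT inside the attached
geli device; the label shown is just an example of the gpt/cXsYY naming
scheme used in the pool):

```shell
#!/bin/sh
# Sketch of partitioning one attached provider.  The label c2s10 is an
# illustrative example of the gpt/cXsYY naming scheme.
partition_provider() {
    dev="$1"      # e.g. da50.eli
    label="$2"    # e.g. c2s10
    gpart create -s gpt "$dev" &&
    gpart add -t freebsd-zfs -l "$label" "$dev"
}
```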
> zfs-stats -a
- ------------------------------------------------------------------------------
ZFS Subsystem Report                            Wed Dec  2 09:59:27 2015
- ------------------------------------------------------------------------------
System Information:
Kernel Version: 1001000 (osreldate)
Hardware Platform: amd64
Processor Architecture: amd64
FreeBSD 10.1-RELEASE #0 r274401: Tue Nov 11 21:02:49 UTC 2014 root
9:59AM up 1 day, 46 mins, 10 users, load averages: 1.03, 0.46, 0.75
- ------------------------------------------------------------------------------
System Memory Statistics:
Physical Memory: 131012.88M
Kernel Memory: 1915.37M
DATA: 98.62% 1888.90M
TEXT: 1.38% 26.47M
- ------------------------------------------------------------------------------
ZFS pool information:
Storage pool Version (spa): 5000
Filesystem Version (zpl): 5
- ------------------------------------------------------------------------------
ARC Misc:
Deleted: 1961248
Recycle Misses: 127014
Mutex Misses: 5973
Evict Skips: 5973
ARC Size:
Current Size (arcsize): 100.00% 114703.88M
Target Size (Adaptive, c): 100.00% 114704.00M
Min Size (Hard Limit, c_min): 12.50% 14338.00M
Max Size (High Water, c_max): ~8:1 114704.00M
ARC Size Breakdown:
Recently Used Cache Size (p): 93.75% 107535.69M
Freq. Used Cache Size (c-p): 6.25% 7168.31M
ARC Hash Breakdown:
Elements Max: 6746532
Elements Current: 100.00% 6746313
Collisions: 9651654
Chain Max: 0
Chains: 1050203
ARC Eviction Statistics:
Evicts Total: 194298918912
Evicts Eligible for L2: 81.00% 157373345280
Evicts Ineligible for L2: 19.00% 36925573632
Evicts Cached to L2: 97939090944
ARC Efficiency
Cache Access Total: 109810376
Cache Hit Ratio: 91.57% 100555148
Cache Miss Ratio: 8.43% 9255228
Actual Hit Ratio: 90.54% 99423922
Data Demand Efficiency: 76.64%
Data Prefetch Efficiency: 48.46%
CACHE HITS BY CACHE LIST:
Anonymously Used: 0.88% 881966
Most Recently Used (mru): 23.11% 23236902
Most Frequently Used (mfu): 75.77% 76187020
MRU Ghost (mru_ghost): 0.03% 26449
MFU Ghost (mfu_ghost): 0.22% 222811
CACHE HITS BY DATA TYPE:
Demand Data: 10.17% 10227867
Prefetch Data: 0.45% 455126
Demand Metadata: 88.69% 89184329
Prefetch Metadata: 0.68% 687826
CACHE MISSES BY DATA TYPE:
Demand Data: 33.69% 3117808
Prefetch Data: 5.23% 484140
Demand Metadata: 56.55% 5233984
Prefetch Metadata: 4.53% 419296
- ------------------------------------------------------------------------------
L2 ARC Summary:
Low Memory Aborts: 77
R/W Clashes: 13
Free on Write: 523
L2 ARC Size:
Current Size: (Adaptive) 91988.13M
Header Size: 0.13% 120.08M
L2 ARC Read/Write Activity:
Bytes Written: 97783.99M
Bytes Read: 2464.81M
L2 ARC Breakdown:
Access Total: 8110124
Hit Ratio: 2.89% 234616
Miss Ratio: 97.11% 7875508
Feeds: 85129
WRITES:
Sent Total: 100.00% 18448
- ------------------------------------------------------------------------------
VDEV Cache Summary:
Access Total: 0
Hits Ratio: 0.00% 0
Miss Ratio: 0.00% 0
Delegations: 0
- ------------------------------------------------------------------------------
File-Level Prefetch Stats (DMU):
DMU Efficiency:
Access Total: 162279162
Hit Ratio: 91.69% 148788486
Miss Ratio: 8.31% 13490676
Colinear Access Total: 13490676
Colinear Hit Ratio: 0.06% 8166
Colinear Miss Ratio: 99.94% 13482510
Stride Access Total: 146863482
Stride Hit Ratio: 99.31% 145846806
Stride Miss Ratio: 0.69% 1016676
DMU misc:
Reclaim successes: 124372
Reclaim failures: 13358138
Stream resets: 618
Stream noresets: 2938602
Bogus streams: 0
- ------------------------------------------------------------------------------
ZFS Tunable (sysctl):
kern.maxusers=8524
vfs.zfs.arc_max=120275861504
vfs.zfs.arc_min=15034482688
vfs.zfs.arc_average_blocksize=8192
vfs.zfs.arc_meta_used=24838283936
vfs.zfs.arc_meta_limit=30068965376
vfs.zfs.l2arc_write_max=8388608
vfs.zfs.l2arc_write_boost=8388608
vfs.zfs.l2arc_headroom=2
vfs.zfs.l2arc_feed_secs=1
vfs.zfs.l2arc_feed_min_ms=200
vfs.zfs.l2arc_noprefetch=1
vfs.zfs.l2arc_feed_again=1
vfs.zfs.l2arc_norw=1
vfs.zfs.anon_size=27974656
vfs.zfs.anon_metadata_lsize=0
vfs.zfs.anon_data_lsize=0
vfs.zfs.mru_size=112732930560
vfs.zfs.mru_metadata_lsize=18147921408
vfs.zfs.mru_data_lsize=92690379776
vfs.zfs.mru_ghost_size=7542758400
vfs.zfs.mru_ghost_metadata_lsize=1262705664
vfs.zfs.mru_ghost_data_lsize=6280052736
vfs.zfs.mfu_size=3748620800
vfs.zfs.mfu_metadata_lsize=1014886912
vfs.zfs.mfu_data_lsize=2723481600
vfs.zfs.mfu_ghost_size=24582345728
vfs.zfs.mfu_ghost_metadata_lsize=682512384
vfs.zfs.mfu_ghost_data_lsize=23899833344
vfs.zfs.l2c_only_size=66548531200
vfs.zfs.dedup.prefetch=1
vfs.zfs.nopwrite_enabled=1
vfs.zfs.mdcomp_disable=0
vfs.zfs.dirty_data_max=4294967296
vfs.zfs.dirty_data_max_max=4294967296
vfs.zfs.dirty_data_max_percent=10
vfs.zfs.dirty_data_sync=67108864
vfs.zfs.delay_min_dirty_percent=60
vfs.zfs.delay_scale=500000
vfs.zfs.prefetch_disable=0
vfs.zfs.zfetch.max_streams=8
vfs.zfs.zfetch.min_sec_reap=2
vfs.zfs.zfetch.block_cap=256
vfs.zfs.zfetch.array_rd_sz=1048576
vfs.zfs.top_maxinflight=32
vfs.zfs.resilver_delay=2
vfs.zfs.scrub_delay=4
vfs.zfs.scan_idle=50
vfs.zfs.scan_min_time_ms=1000
vfs.zfs.free_min_time_ms=1000
vfs.zfs.resilver_min_time_ms=3000
vfs.zfs.no_scrub_io=0
vfs.zfs.no_scrub_prefetch=0
vfs.zfs.metaslab.gang_bang=131073
vfs.zfs.metaslab.fragmentation_threshold=70
vfs.zfs.metaslab.debug_load=0
vfs.zfs.metaslab.debug_unload=0
vfs.zfs.metaslab.df_alloc_threshold=131072
vfs.zfs.metaslab.df_free_pct=4
vfs.zfs.metaslab.min_alloc_size=10485760
vfs.zfs.metaslab.load_pct=50
vfs.zfs.metaslab.unload_delay=8
vfs.zfs.metaslab.preload_limit=3
vfs.zfs.metaslab.preload_enabled=1
vfs.zfs.metaslab.fragmentation_factor_enabled=1
vfs.zfs.metaslab.lba_weighting_enabled=1
vfs.zfs.metaslab.bias_enabled=1
vfs.zfs.condense_pct=200
vfs.zfs.mg_noalloc_threshold=0
vfs.zfs.mg_fragmentation_threshold=85
vfs.zfs.check_hostid=1
vfs.zfs.spa_load_verify_maxinflight=10000
vfs.zfs.spa_load_verify_metadata=1
vfs.zfs.spa_load_verify_data=1
vfs.zfs.recover=0
vfs.zfs.deadman_synctime_ms=1000000
vfs.zfs.deadman_checktime_ms=5000
vfs.zfs.deadman_enabled=1
vfs.zfs.spa_asize_inflation=24
vfs.zfs.txg.timeout=5
vfs.zfs.vdev.cache.max=16384
vfs.zfs.vdev.cache.size=0
vfs.zfs.vdev.cache.bshift=16
vfs.zfs.vdev.trim_on_init=1
vfs.zfs.vdev.mirror.rotating_inc=0
vfs.zfs.vdev.mirror.rotating_seek_inc=5
vfs.zfs.vdev.mirror.rotating_seek_offset=1048576
vfs.zfs.vdev.mirror.non_rotating_inc=0
vfs.zfs.vdev.mirror.non_rotating_seek_inc=1
vfs.zfs.vdev.max_active=1000
vfs.zfs.vdev.sync_read_min_active=10
vfs.zfs.vdev.sync_read_max_active=10
vfs.zfs.vdev.sync_write_min_active=10
vfs.zfs.vdev.sync_write_max_active=10
vfs.zfs.vdev.async_read_min_active=1
vfs.zfs.vdev.async_read_max_active=3
vfs.zfs.vdev.async_write_min_active=1
vfs.zfs.vdev.async_write_max_active=10
vfs.zfs.vdev.scrub_min_active=1
vfs.zfs.vdev.scrub_max_active=2
vfs.zfs.vdev.trim_min_active=1
vfs.zfs.vdev.trim_max_active=64
vfs.zfs.vdev.aggregation_limit=131072
vfs.zfs.vdev.read_gap_limit=32768
vfs.zfs.vdev.write_gap_limit=4096
vfs.zfs.vdev.bio_flush_disable=0
vfs.zfs.vdev.bio_delete_disable=0
vfs.zfs.vdev.trim_max_bytes=2147483648
vfs.zfs.vdev.trim_max_pending=64
vfs.zfs.max_auto_ashift=13
vfs.zfs.min_auto_ashift=9
vfs.zfs.zil_replay_disable=0
vfs.zfs.cache_flush_disable=0
vfs.zfs.zio.use_uma=1
vfs.zfs.zio.exclude_metadata=0
vfs.zfs.sync_pass_deferred_free=2
vfs.zfs.sync_pass_dont_compress=5
vfs.zfs.sync_pass_rewrite=2
vfs.zfs.snapshot_list_prefetch=0
vfs.zfs.super_owner=0
vfs.zfs.debug=0
vfs.zfs.version.ioctl=4
vfs.zfs.version.acl=1
vfs.zfs.version.spa=5000
vfs.zfs.version.zpl=5
vfs.zfs.vol.mode=1
vfs.zfs.trim.enabled=1
vfs.zfs.trim.txg_delay=32
vfs.zfs.trim.timeout=30
vfs.zfs.trim.max_interval=1
vm.kmem_size=133823901696
vm.kmem_size_scale=1
vm.kmem_size_min=0
vm.kmem_size_max=1319413950874
- --
Zeus V. Panchenko jid:zeus@im.ibs.dn.ua
IT Dpt., I.B.S. LLC GMT+2 (EET)
