Date:      Mon, 30 Mar 2015 18:30:57 -0500
From:      Dustin Wenz <dustinwenz@ebureau.com>
To:        Karl Denninger <karl@denninger.net>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: All available memory used when deleting files from ZFS
Message-ID:  <923828D6-503B-4FC3-89E8-1DC6DF0C9B6B@ebureau.com>
In-Reply-To: <5519C329.3090001@denninger.net>
References:  <FD30147A-C7F7-4138-9F96-10024A6FE061@ebureau.com> <5519C329.3090001@denninger.net>

Unfortunately, I just spent the day recovering from this, so I have no way
to easily get new memory stats now. I'm planning to run a test with
additional data to understand more about the issue, but it will take time
to set something up.
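
When I do, one way to capture the zio UMA usage that Karl suggests below
would be a loop along these lines (an untested sketch; the 10-second
interval and the grep pattern are just guesses at what's useful):

	# sample the zio UMA zones periodically while the delete/import runs
	while true; do
		date
		vmstat -z | grep zio
		sleep 10
	done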

In the meantime, I'd advise anyone running ZFS on FreeBSD 10.x to be
mindful when freeing up lots of space all at once.
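
If a large delete can't be avoided, one possible workaround is to remove
the files in smaller batches with a pause between batches, so ZFS gets a
chance to process the frees gradually instead of all at once. A rough,
untested sketch (the path, name pattern, batch size, and pause are only
placeholders):

	# remove roughly 100 files per batch, then pause for a minute
	find /pool/dataset -type f -name '*.old' -print0 | \
	    xargs -0 -n 100 sh -c 'rm -f -- "$@"; sleep 60' sh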

	- .Dustin

> On Mar 30, 2015, at 4:42 PM, Karl Denninger <karl@denninger.net> wrote:
> 
> What does the UMA memory use look like on that machine when the remove is
> initiated and progresses?  Look with vmstat -z and see what the used and
> free counts look like for the zio allocations...
> 
> On 3/30/2015 4:14 PM, Dustin Wenz wrote:
>> I had several systems panic or hang over the weekend while deleting some
>> data off of their local ZFS filesystems. It looks like they ran out of
>> physical memory (32GB), and hung when paging to swap-on-zfs (which is not
>> surprising, given that ZFS was likely using the memory). They were running
>> 10.1-STABLE r277139M, which I built in the middle of January. The pools
>> were about 35TB in size, and were concatenations of 3TB mirrors. They were
>> maybe 95% full. I deleted just over 1000 files, totaling 25TB on each
>> system.
>> 
>> It took roughly 10 minutes to remove that 25TB of data per host using a
>> remote rsync, and immediately after that everything seemed fine. However,
>> after several more minutes, every machine that had data removed became
>> unresponsive. Some had numerous "swap_pager: indefinite wait buffer"
>> errors followed by a panic, and some just died with no console messages.
>> The same thing would happen after a reboot, when FreeBSD attempted to
>> mount the local filesystem again.
>> 
>> I was able to boot these systems after exporting the affected pool, but
>> the problem would recur several minutes after initiating a "zpool import".
>> Watching ZFS statistics didn't seem to reveal where the memory was going;
>> the ARC would only climb to about 4GB, but free memory would decline
>> rapidly. Eventually, after enough export/reboot/import cycles, the pool
>> would import successfully and everything would be fine from then on. Note
>> that there is no L2ARC or compression being used.
>> 
>> Has anyone else run into this when deleting files on ZFS? It seems to be
>> a consistent problem under the versions of 10.1 I'm running.
>> 
>> For reference, I've appended a zstat dump below that was taken 5 minutes
>> after starting a zpool import, and was about three minutes before the
>> machine became unresponsive. You can see that the ARC is only 4GB, but
>> free memory was down to 471MB (and continued to drop).
>> 
>> 	- .Dustin
>> 
>> 
>> ------------------------------------------------------------------------
>> ZFS Subsystem Report				Mon Mar 30 12:35:27 2015
>> ------------------------------------------------------------------------
>> 
>> System Information:
>> 
>> 	Kernel Version:				1001506 (osreldate)
>> 	Hardware Platform:			amd64
>> 	Processor Architecture:			amd64
>> 
>> 	ZFS Storage pool Version:		5000
>> 	ZFS Filesystem Version:			5
>> 
>> FreeBSD 10.1-STABLE #11 r277139M: Tue Jan 13 14:59:55 CST 2015 root
>> 12:35PM  up 8 mins, 3 users, load averages: 7.23, 8.96, 4.87
>> 
>> ------------------------------------------------------------------------
>> 
>> System Memory:
>> 
>> 	0.17%	55.40	MiB Active,	0.14%	46.11	MiB Inact
>> 	98.34%	30.56	GiB Wired,	0.00%	0 Cache
>> 	1.34%	425.46	MiB Free,	0.00%	4.00	KiB Gap
>> 
>> 	Real Installed:				32.00	GiB
>> 	Real Available:			99.82%	31.94	GiB
>> 	Real Managed:			97.29%	31.08	GiB
>> 
>> 	Logical Total:				32.00	GiB
>> 	Logical Used:			98.56%	31.54	GiB
>> 	Logical Free:			1.44%	471.57	MiB
>> 
>> Kernel Memory:					3.17	GiB
>> 	Data:				99.18%	3.14	GiB
>> 	Text:				0.82%	26.68	MiB
>> 
>> Kernel Memory Map:				31.08	GiB
>> 	Size:				14.18%	4.41	GiB
>> 	Free:				85.82%	26.67	GiB
>> 
>> ------------------------------------------------------------------------
>> 
>> ARC Summary: (HEALTHY)
>> 	Memory Throttle Count:			0
>> 
>> ARC Misc:
>> 	Deleted:				145
>> 	Recycle Misses:				0
>> 	Mutex Misses:				0
>> 	Evict Skips:				0
>> 
>> ARC Size:				14.17%	4.26	GiB
>> 	Target Size: (Adaptive)		100.00%	30.08	GiB
>> 	Min Size (Hard Limit):		12.50%	3.76	GiB
>> 	Max Size (High Water):		8:1	30.08	GiB
>> 
>> ARC Size Breakdown:
>> 	Recently Used Cache Size:	50.00%	15.04	GiB
>> 	Frequently Used Cache Size:	50.00%	15.04	GiB
>> 
>> ARC Hash Breakdown:
>> 	Elements Max:				270.56k
>> 	Elements Current:		100.00%	270.56k
>> 	Collisions:				23.66k
>> 	Chain Max:				3
>> 	Chains:					8.28k
>> 
>> ------------------------------------------------------------------------
>> 
>> ARC Efficiency:					2.93m
>> 	Cache Hit Ratio:		70.44%	2.06m
>> 	Cache Miss Ratio:		29.56%	866.05k
>> 	Actual Hit Ratio:		70.40%	2.06m
>> 
>> 	Data Demand Efficiency:		97.47%	24.58k
>> 	Data Prefetch Efficiency:	1.88%	479
>> 
>> 	CACHE HITS BY CACHE LIST:
>> 	  Anonymously Used:		0.05%	1.07k
>> 	  Most Recently Used:		71.82%	1.48m
>> 	  Most Frequently Used:		28.13%	580.49k
>> 	  Most Recently Used Ghost:	0.00%	0
>> 	  Most Frequently Used Ghost:	0.00%	0
>> 
>> 	CACHE HITS BY DATA TYPE:
>> 	  Demand Data:			1.16%	23.96k
>> 	  Prefetch Data:		0.00%	9
>> 	  Demand Metadata:		98.79%	2.04m
>> 	  Prefetch Metadata:		0.05%	1.08k
>> 
>> 	CACHE MISSES BY DATA TYPE:
>> 	  Demand Data:			0.07%	621
>> 	  Prefetch Data:		0.05%	470
>> 	  Demand Metadata:		99.69%	863.35k
>> 	  Prefetch Metadata:		0.19%	1.61k
>> 
>> ------------------------------------------------------------------------
>> 
>> L2ARC is disabled
>> 
>> ------------------------------------------------------------------------
>> 
>> File-Level Prefetch: (HEALTHY)
>> 
>> DMU Efficiency:					72.95k
>> 	Hit Ratio:			70.83%	51.66k
>> 	Miss Ratio:			29.17%	21.28k
>> 
>> 	Colinear:				21.28k
>> 	  Hit Ratio:			0.01%	2
>> 	  Miss Ratio:			99.99%	21.28k
>> 
>> 	Stride:					50.45k
>> 	  Hit Ratio:			99.98%	50.44k
>> 	  Miss Ratio:			0.02%	9
>> 
>> DMU Misc:
>> 	Reclaim:				21.28k
>> 	  Successes:			1.73%	368
>> 	  Failures:			98.27%	20.91k
>> 
>> 	Streams:				1.23k
>> 	  +Resets:			0.16%	2
>> 	  -Resets:			99.84%	1.23k
>> 	  Bogus:				0
>> 
>> ------------------------------------------------------------------------
>> 
>> VDEV cache is disabled
>> 
>> ------------------------------------------------------------------------
>> 
>> ZFS Tunables (sysctl):
>> 	kern.maxusers                           2380
>> 	vm.kmem_size                            33367830528
>> 	vm.kmem_size_scale                      1
>> 	vm.kmem_size_min                        0
>> 	vm.kmem_size_max                        1319413950874
>> 	vfs.zfs.arc_max                         32294088704
>> 	vfs.zfs.arc_min                         4036761088
>> 	vfs.zfs.arc_average_blocksize           8192
>> 	vfs.zfs.arc_shrink_shift                5
>> 	vfs.zfs.arc_free_target                 56518
>> 	vfs.zfs.arc_meta_used                   4534349216
>> 	vfs.zfs.arc_meta_limit                  8073522176
>> 	vfs.zfs.l2arc_write_max                 8388608
>> 	vfs.zfs.l2arc_write_boost               8388608
>> 	vfs.zfs.l2arc_headroom                  2
>> 	vfs.zfs.l2arc_feed_secs                 1
>> 	vfs.zfs.l2arc_feed_min_ms               200
>> 	vfs.zfs.l2arc_noprefetch                1
>> 	vfs.zfs.l2arc_feed_again                1
>> 	vfs.zfs.l2arc_norw                      1
>> 	vfs.zfs.anon_size                       1786368
>> 	vfs.zfs.anon_metadata_lsize             0
>> 	vfs.zfs.anon_data_lsize                 0
>> 	vfs.zfs.mru_size                        504812032
>> 	vfs.zfs.mru_metadata_lsize              415273472
>> 	vfs.zfs.mru_data_lsize                  35227648
>> 	vfs.zfs.mru_ghost_size                  0
>> 	vfs.zfs.mru_ghost_metadata_lsize        0
>> 	vfs.zfs.mru_ghost_data_lsize            0
>> 	vfs.zfs.mfu_size                        3925990912
>> 	vfs.zfs.mfu_metadata_lsize              3901947392
>> 	vfs.zfs.mfu_data_lsize                  7000064
>> 	vfs.zfs.mfu_ghost_size                  0
>> 	vfs.zfs.mfu_ghost_metadata_lsize        0
>> 	vfs.zfs.mfu_ghost_data_lsize            0
>> 	vfs.zfs.l2c_only_size                   0
>> 	vfs.zfs.dedup.prefetch                  1
>> 	vfs.zfs.nopwrite_enabled                1
>> 	vfs.zfs.mdcomp_disable                  0
>> 	vfs.zfs.max_recordsize                  1048576
>> 	vfs.zfs.dirty_data_max                  3429735628
>> 	vfs.zfs.dirty_data_max_max              4294967296
>> 	vfs.zfs.dirty_data_max_percent          10
>> 	vfs.zfs.dirty_data_sync                 67108864
>> 	vfs.zfs.delay_min_dirty_percent         60
>> 	vfs.zfs.delay_scale                     500000
>> 	vfs.zfs.prefetch_disable                0
>> 	vfs.zfs.zfetch.max_streams              8
>> 	vfs.zfs.zfetch.min_sec_reap             2
>> 	vfs.zfs.zfetch.block_cap                256
>> 	vfs.zfs.zfetch.array_rd_sz              1048576
>> 	vfs.zfs.top_maxinflight                 32
>> 	vfs.zfs.resilver_delay                  2
>> 	vfs.zfs.scrub_delay                     4
>> 	vfs.zfs.scan_idle                       50
>> 	vfs.zfs.scan_min_time_ms                1000
>> 	vfs.zfs.free_min_time_ms                1000
>> 	vfs.zfs.resilver_min_time_ms            3000
>> 	vfs.zfs.no_scrub_io                     0
>> 	vfs.zfs.no_scrub_prefetch               0
>> 	vfs.zfs.free_max_blocks                 -1
>> 	vfs.zfs.metaslab.gang_bang              16777217
>> 	vfs.zfs.metaslab.fragmentation_threshold 70
>> 	vfs.zfs.metaslab.debug_load             0
>> 	vfs.zfs.metaslab.debug_unload           0
>> 	vfs.zfs.metaslab.df_alloc_threshold     131072
>> 	vfs.zfs.metaslab.df_free_pct            4
>> 	vfs.zfs.metaslab.min_alloc_size         33554432
>> 	vfs.zfs.metaslab.load_pct               50
>> 	vfs.zfs.metaslab.unload_delay           8
>> 	vfs.zfs.metaslab.preload_limit          3
>> 	vfs.zfs.metaslab.preload_enabled        1
>> 	vfs.zfs.metaslab.fragmentation_factor_enabled 1
>> 	vfs.zfs.metaslab.lba_weighting_enabled  1
>> 	vfs.zfs.metaslab.bias_enabled           1
>> 	vfs.zfs.condense_pct                    200
>> 	vfs.zfs.mg_noalloc_threshold            0
>> 	vfs.zfs.mg_fragmentation_threshold      85
>> 	vfs.zfs.check_hostid                    1
>> 	vfs.zfs.spa_load_verify_maxinflight     10000
>> 	vfs.zfs.spa_load_verify_metadata        1
>> 	vfs.zfs.spa_load_verify_data            1
>> 	vfs.zfs.recover                         0
>> 	vfs.zfs.deadman_synctime_ms             1000000
>> 	vfs.zfs.deadman_checktime_ms            5000
>> 	vfs.zfs.deadman_enabled                 1
>> 	vfs.zfs.spa_asize_inflation             24
>> 	vfs.zfs.spa_slop_shift                  5
>> 	vfs.zfs.space_map_blksz                 4096
>> 	vfs.zfs.txg.timeout                     5
>> 	vfs.zfs.vdev.metaslabs_per_vdev         200
>> 	vfs.zfs.vdev.cache.max                  16384
>> 	vfs.zfs.vdev.cache.size                 0
>> 	vfs.zfs.vdev.cache.bshift               16
>> 	vfs.zfs.vdev.trim_on_init               1
>> 	vfs.zfs.vdev.mirror.rotating_inc        0
>> 	vfs.zfs.vdev.mirror.rotating_seek_inc   5
>> 	vfs.zfs.vdev.mirror.rotating_seek_offset 1048576
>> 	vfs.zfs.vdev.mirror.non_rotating_inc    0
>> 	vfs.zfs.vdev.mirror.non_rotating_seek_inc 1
>> 	vfs.zfs.vdev.async_write_active_min_dirty_percent 30
>> 	vfs.zfs.vdev.async_write_active_max_dirty_percent 60
>> 	vfs.zfs.vdev.max_active                 1000
>> 	vfs.zfs.vdev.sync_read_min_active       10
>> 	vfs.zfs.vdev.sync_read_max_active       10
>> 	vfs.zfs.vdev.sync_write_min_active      10
>> 	vfs.zfs.vdev.sync_write_max_active      10
>> 	vfs.zfs.vdev.async_read_min_active      1
>> 	vfs.zfs.vdev.async_read_max_active      3
>> 	vfs.zfs.vdev.async_write_min_active     1
>> 	vfs.zfs.vdev.async_write_max_active     10
>> 	vfs.zfs.vdev.scrub_min_active           1
>> 	vfs.zfs.vdev.scrub_max_active           2
>> 	vfs.zfs.vdev.trim_min_active            1
>> 	vfs.zfs.vdev.trim_max_active            64
>> 	vfs.zfs.vdev.aggregation_limit          131072
>> 	vfs.zfs.vdev.read_gap_limit             32768
>> 	vfs.zfs.vdev.write_gap_limit            4096
>> 	vfs.zfs.vdev.bio_flush_disable          0
>> 	vfs.zfs.vdev.bio_delete_disable         0
>> 	vfs.zfs.vdev.trim_max_bytes             2147483648
>> 	vfs.zfs.vdev.trim_max_pending           64
>> 	vfs.zfs.max_auto_ashift                 13
>> 	vfs.zfs.min_auto_ashift                 9
>> 	vfs.zfs.zil_replay_disable              0
>> 	vfs.zfs.cache_flush_disable             0
>> 	vfs.zfs.zio.use_uma                     1
>> 	vfs.zfs.zio.exclude_metadata            0
>> 	vfs.zfs.sync_pass_deferred_free         2
>> 	vfs.zfs.sync_pass_dont_compress         5
>> 	vfs.zfs.sync_pass_rewrite               2
>> 	vfs.zfs.snapshot_list_prefetch          0
>> 	vfs.zfs.super_owner                     0
>> 	vfs.zfs.debug                           0
>> 	vfs.zfs.version.ioctl                   4
>> 	vfs.zfs.version.acl                     1
>> 	vfs.zfs.version.spa                     5000
>> 	vfs.zfs.version.zpl                     5
>> 	vfs.zfs.vol.mode                        1
>> 	vfs.zfs.vol.unmap_enabled               1
>> 	vfs.zfs.trim.enabled                    1
>> 	vfs.zfs.trim.txg_delay                  32
>> 	vfs.zfs.trim.timeout                    30
>> 	vfs.zfs.trim.max_interval               1
>> 
>> ------------------------------------------------------------------------
>> 
>> 
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>> 
>> 
> 
> -- 
> Karl Denninger
> karl@denninger.net
> /The Market Ticker/
> 
> 



