Date: Tue, 31 Mar 2015 16:52:33 -0500
From: Dustin Wenz <dustinwenz@ebureau.com>
To: <freebsd-fs@freebsd.org>
Subject: Re: All available memory used when deleting files from ZFS
Message-ID: <712A53CA-7A54-420F-9721-592A39D9A717@ebureau.com>
In-Reply-To: <923828D6-503B-4FC3-89E8-1DC6DF0C9B6B@ebureau.com>
References: <FD30147A-C7F7-4138-9F96-10024A6FE061@ebureau.com> <5519C329.3090001@denninger.net> <923828D6-503B-4FC3-89E8-1DC6DF0C9B6B@ebureau.com>
I was able to do a little regression testing on this, since I still had about 10 hosts remaining on FreeBSD 9.2. They are the same hardware and disk configuration, and had the same data files and zpool configurations (mirrors of 3TB mechanical disks) as the machines that blew up over the weekend. The only difference is that they ran FreeBSD 9.2.

Using the same rsync procedure as before, I was able to delete the 25TB of data on all of the remaining hosts with no issues whatsoever. I never saw any reduction in free memory (if anything, it increased, since ARC was being freed up as well), no paging, and no hangs. One other difference was that it took about twice as long to delete the files on 9.2 as on 10.1 (20 minutes instead of 10).

So it would appear that there is some ZFS behavior in FreeBSD 10.1 that was not present in 9.2, and it's causing problems when freeing up space. If I knew why it takes twice as long to delete files on 9.2, that might shed some light on this. There is also the recent background-destroy feature that might be suspect, but I'm not destroying filesystems here. What other recent ZFS changes might apply to deleting files?

	- .Dustin

> On Mar 30, 2015, at 6:30 PM, Dustin Wenz <DustinWenz@ebureau.com> wrote:
> 
> Unfortunately, I just spent the day recovering from this, so I have no way to easily get new memory stats now. I'm planning on doing a test with additional data in an effort to understand more about the issue, but it will take time to set something up.
> 
> In the meantime, I'd advise anyone running ZFS on FreeBSD 10.x to be mindful when freeing up lots of space all at once.
> 
> - .Dustin
> 
>> On Mar 30, 2015, at 4:42 PM, Karl Denninger <karl@denninger.net> wrote:
>> 
>> What's the UMA memory use look like on that machine when the remove is
>> initiated and progresses?
>> Look with vmstat -z and see what the used and free counts look like for the zio allocations......
>> 
>> On 3/30/2015 4:14 PM, Dustin Wenz wrote:
>>> I had several systems panic or hang over the weekend while deleting some data off of their local ZFS filesystems. It looks like they ran out of physical memory (32GB), and hung when paging to swap-on-ZFS (which is not surprising, given that ZFS was likely using the memory). They were running 10.1-STABLE r277139M, which I built in the middle of January. The pools were about 35TB in size, and are a concatenation of 3TB mirrors. They were maybe 95% full. I deleted just over 1000 files, totaling 25TB, on each system.
>>> 
>>> It took roughly 10 minutes to remove that 25TB of data per host using a remote rsync, and immediately after that everything seemed fine. However, after several more minutes, every machine that had data removed became unresponsive. Some had numerous "swap_pager: indefinite wait buffer" errors followed by a panic, and some just died with no console messages. The same thing would happen after a reboot, when FreeBSD attempted to mount the local filesystem again.
>>> 
>>> I was able to boot these systems after exporting the affected pool, but the problem would recur several minutes after initiating a "zpool import". Watching ZFS statistics didn't seem to reveal where the memory was going; the ARC would only climb to about 4GB, but free memory would decline rapidly. Eventually, after enough export/reboot/import cycles, the pool would import successfully and everything would be fine from then on. Note that there is no L2ARC or compression being used.
>>> 
>>> Has anyone else run into this when deleting files on ZFS? It seems to be a consistent problem under the versions of 10.1 I'm running.
>>> 
>>> For reference, I've appended a zstat dump below that was taken 5 minutes after starting a zpool import, and was about three minutes before the machine became unresponsive. You can see that the ARC is only 4GB, but free memory was down to 471MB (and continued to drop).
>>> 
>>> - .Dustin
>>> 
>>> 
>>> ------------------------------------------------------------------------
>>> ZFS Subsystem Report                            Mon Mar 30 12:35:27 2015
>>> ------------------------------------------------------------------------
>>> 
>>> System Information:
>>> 
>>>         Kernel Version:                         1001506 (osreldate)
>>>         Hardware Platform:                      amd64
>>>         Processor Architecture:                 amd64
>>> 
>>>         ZFS Storage pool Version:               5000
>>>         ZFS Filesystem Version:                 5
>>> 
>>> FreeBSD 10.1-STABLE #11 r277139M: Tue Jan 13 14:59:55 CST 2015 root
>>> 12:35PM up 8 mins, 3 users, load averages: 7.23, 8.96, 4.87
>>> 
>>> ------------------------------------------------------------------------
>>> 
>>> System Memory:
>>> 
>>>         0.17%   55.40   MiB Active,     0.14%   46.11   MiB Inact
>>>         98.34%  30.56   GiB Wired,      0.00%   0       Cache
>>>         1.34%   425.46  MiB Free,       0.00%   4.00    KiB Gap
>>> 
>>>         Real Installed:                         32.00   GiB
>>>         Real Available:                 99.82%  31.94   GiB
>>>         Real Managed:                   97.29%  31.08   GiB
>>> 
>>>         Logical Total:                          32.00   GiB
>>>         Logical Used:                   98.56%  31.54   GiB
>>>         Logical Free:                   1.44%   471.57  MiB
>>> 
>>> Kernel Memory:                                  3.17    GiB
>>>         Data:                           99.18%  3.14    GiB
>>>         Text:                           0.82%   26.68   MiB
>>> 
>>> Kernel Memory Map:                              31.08   GiB
>>>         Size:                           14.18%  4.41    GiB
>>>         Free:                           85.82%  26.67   GiB
>>> 
>>> ------------------------------------------------------------------------
>>> 
>>> ARC Summary: (HEALTHY)
>>>         Memory Throttle Count:                  0
>>> 
>>> ARC Misc:
>>>         Deleted:                                145
>>>         Recycle Misses:                         0
>>>         Mutex Misses:                           0
>>>         Evict Skips:                            0
>>> 
>>> ARC Size:                               14.17%  4.26    GiB
>>>         Target Size: (Adaptive)         100.00% 30.08   GiB
>>>         Min Size (Hard Limit):          12.50%  3.76    GiB
>>>         Max Size (High Water):          8:1     30.08   GiB
>>> 
>>> ARC Size Breakdown:
>>>         Recently Used Cache Size:       50.00%  15.04   GiB
>>>         Frequently Used Cache Size:     50.00%  15.04   GiB
>>> 
>>> ARC Hash Breakdown:
>>>         Elements Max:                           270.56k
>>>         Elements Current:               100.00% 270.56k
>>>         Collisions:                             23.66k
>>>         Chain Max:                              3
>>>         Chains:                                 8.28k
>>> 
>>> ------------------------------------------------------------------------
>>> 
>>> ARC Efficiency:                                 2.93m
>>>         Cache Hit Ratio:                70.44%  2.06m
>>>         Cache Miss Ratio:               29.56%  866.05k
>>>         Actual Hit Ratio:               70.40%  2.06m
>>> 
>>>         Data Demand Efficiency:         97.47%  24.58k
>>>         Data Prefetch Efficiency:       1.88%   479
>>> 
>>>         CACHE HITS BY CACHE LIST:
>>>           Anonymously Used:             0.05%   1.07k
>>>           Most Recently Used:           71.82%  1.48m
>>>           Most Frequently Used:         28.13%  580.49k
>>>           Most Recently Used Ghost:     0.00%   0
>>>           Most Frequently Used Ghost:   0.00%   0
>>> 
>>>         CACHE HITS BY DATA TYPE:
>>>           Demand Data:                  1.16%   23.96k
>>>           Prefetch Data:                0.00%   9
>>>           Demand Metadata:              98.79%  2.04m
>>>           Prefetch Metadata:            0.05%   1.08k
>>> 
>>>         CACHE MISSES BY DATA TYPE:
>>>           Demand Data:                  0.07%   621
>>>           Prefetch Data:                0.05%   470
>>>           Demand Metadata:              99.69%  863.35k
>>>           Prefetch Metadata:            0.19%   1.61k
>>> 
>>> ------------------------------------------------------------------------
>>> 
>>> L2ARC is disabled
>>> 
>>> ------------------------------------------------------------------------
>>> 
>>> File-Level Prefetch: (HEALTHY)
>>> 
>>> DMU Efficiency:                                 72.95k
>>>         Hit Ratio:                      70.83%  51.66k
>>>         Miss Ratio:                     29.17%  21.28k
>>> 
>>>         Colinear:                               21.28k
>>>           Hit Ratio:                    0.01%   2
>>>           Miss Ratio:                   99.99%  21.28k
>>> 
>>>         Stride:                                 50.45k
>>>           Hit Ratio:                    99.98%  50.44k
>>>           Miss Ratio:                   0.02%   9
>>> 
>>> DMU Misc:
>>>         Reclaim:                                21.28k
>>>           Successes:                    1.73%   368
>>>           Failures:                     98.27%  20.91k
>>> 
>>>         Streams:                                1.23k
>>>           +Resets:                      0.16%   2
>>>           -Resets:                      99.84%  1.23k
>>>           Bogus:                                0
>>> 
>>> ------------------------------------------------------------------------
>>> 
>>> VDEV cache is disabled
>>> 
>>> ------------------------------------------------------------------------
>>> 
>>> ZFS Tunables (sysctl):
>>>         kern.maxusers                           2380
>>>         vm.kmem_size                            33367830528
>>>         vm.kmem_size_scale                      1
>>>         vm.kmem_size_min                        0
>>>         vm.kmem_size_max                        1319413950874
>>>         vfs.zfs.arc_max                         32294088704
>>>         vfs.zfs.arc_min                         4036761088
>>>         vfs.zfs.arc_average_blocksize           8192
>>>         vfs.zfs.arc_shrink_shift                5
>>>         vfs.zfs.arc_free_target                 56518
>>>         vfs.zfs.arc_meta_used                   4534349216
>>>         vfs.zfs.arc_meta_limit                  8073522176
>>>         vfs.zfs.l2arc_write_max                 8388608
>>>         vfs.zfs.l2arc_write_boost               8388608
>>>         vfs.zfs.l2arc_headroom                  2
>>>         vfs.zfs.l2arc_feed_secs                 1
>>>         vfs.zfs.l2arc_feed_min_ms               200
>>>         vfs.zfs.l2arc_noprefetch                1
>>>         vfs.zfs.l2arc_feed_again                1
>>>         vfs.zfs.l2arc_norw                      1
>>>         vfs.zfs.anon_size                       1786368
>>>         vfs.zfs.anon_metadata_lsize             0
>>>         vfs.zfs.anon_data_lsize                 0
>>>         vfs.zfs.mru_size                        504812032
>>>         vfs.zfs.mru_metadata_lsize              415273472
>>>         vfs.zfs.mru_data_lsize                  35227648
>>>         vfs.zfs.mru_ghost_size                  0
>>>         vfs.zfs.mru_ghost_metadata_lsize        0
>>>         vfs.zfs.mru_ghost_data_lsize            0
>>>         vfs.zfs.mfu_size                        3925990912
>>>         vfs.zfs.mfu_metadata_lsize              3901947392
>>>         vfs.zfs.mfu_data_lsize                  7000064
>>>         vfs.zfs.mfu_ghost_size                  0
>>>         vfs.zfs.mfu_ghost_metadata_lsize        0
>>>         vfs.zfs.mfu_ghost_data_lsize            0
>>>         vfs.zfs.l2c_only_size                   0
>>>         vfs.zfs.dedup.prefetch                  1
>>>         vfs.zfs.nopwrite_enabled                1
>>>         vfs.zfs.mdcomp_disable                  0
>>>         vfs.zfs.max_recordsize                  1048576
>>>         vfs.zfs.dirty_data_max                  3429735628
>>>         vfs.zfs.dirty_data_max_max              4294967296
>>>         vfs.zfs.dirty_data_max_percent          10
>>>         vfs.zfs.dirty_data_sync                 67108864
>>>         vfs.zfs.delay_min_dirty_percent         60
>>>         vfs.zfs.delay_scale                     500000
>>>         vfs.zfs.prefetch_disable                0
>>>         vfs.zfs.zfetch.max_streams              8
>>>         vfs.zfs.zfetch.min_sec_reap             2
>>>         vfs.zfs.zfetch.block_cap                256
>>>         vfs.zfs.zfetch.array_rd_sz              1048576
>>>         vfs.zfs.top_maxinflight                 32
>>>         vfs.zfs.resilver_delay                  2
>>>         vfs.zfs.scrub_delay                     4
>>>         vfs.zfs.scan_idle                       50
>>>         vfs.zfs.scan_min_time_ms                1000
>>>         vfs.zfs.free_min_time_ms                1000
>>>         vfs.zfs.resilver_min_time_ms            3000
>>>         vfs.zfs.no_scrub_io                     0
>>>         vfs.zfs.no_scrub_prefetch               0
>>>         vfs.zfs.free_max_blocks                 -1
>>>         vfs.zfs.metaslab.gang_bang              16777217
>>>         vfs.zfs.metaslab.fragmentation_threshold        70
>>>         vfs.zfs.metaslab.debug_load             0
>>>         vfs.zfs.metaslab.debug_unload           0
>>>         vfs.zfs.metaslab.df_alloc_threshold     131072
>>>         vfs.zfs.metaslab.df_free_pct            4
>>>         vfs.zfs.metaslab.min_alloc_size         33554432
>>>         vfs.zfs.metaslab.load_pct               50
>>>         vfs.zfs.metaslab.unload_delay           8
>>>         vfs.zfs.metaslab.preload_limit          3
>>>         vfs.zfs.metaslab.preload_enabled        1
>>>         vfs.zfs.metaslab.fragmentation_factor_enabled   1
>>>         vfs.zfs.metaslab.lba_weighting_enabled  1
>>>         vfs.zfs.metaslab.bias_enabled           1
>>>         vfs.zfs.condense_pct                    200
>>>         vfs.zfs.mg_noalloc_threshold            0
>>>         vfs.zfs.mg_fragmentation_threshold      85
>>>         vfs.zfs.check_hostid                    1
>>>         vfs.zfs.spa_load_verify_maxinflight     10000
>>>         vfs.zfs.spa_load_verify_metadata        1
>>>         vfs.zfs.spa_load_verify_data            1
>>>         vfs.zfs.recover                         0
>>>         vfs.zfs.deadman_synctime_ms             1000000
>>>         vfs.zfs.deadman_checktime_ms            5000
>>>         vfs.zfs.deadman_enabled                 1
>>>         vfs.zfs.spa_asize_inflation             24
>>>         vfs.zfs.spa_slop_shift                  5
>>>         vfs.zfs.space_map_blksz                 4096
>>>         vfs.zfs.txg.timeout                     5
>>>         vfs.zfs.vdev.metaslabs_per_vdev         200
>>>         vfs.zfs.vdev.cache.max                  16384
>>>         vfs.zfs.vdev.cache.size                 0
>>>         vfs.zfs.vdev.cache.bshift               16
>>>         vfs.zfs.vdev.trim_on_init               1
>>>         vfs.zfs.vdev.mirror.rotating_inc        0
>>>         vfs.zfs.vdev.mirror.rotating_seek_inc   5
>>>         vfs.zfs.vdev.mirror.rotating_seek_offset        1048576
>>>         vfs.zfs.vdev.mirror.non_rotating_inc    0
>>>         vfs.zfs.vdev.mirror.non_rotating_seek_inc       1
>>>         vfs.zfs.vdev.async_write_active_min_dirty_percent       30
>>>         vfs.zfs.vdev.async_write_active_max_dirty_percent       60
>>>         vfs.zfs.vdev.max_active                 1000
>>>         vfs.zfs.vdev.sync_read_min_active       10
>>>         vfs.zfs.vdev.sync_read_max_active       10
>>>         vfs.zfs.vdev.sync_write_min_active      10
>>>         vfs.zfs.vdev.sync_write_max_active      10
>>>         vfs.zfs.vdev.async_read_min_active      1
>>>         vfs.zfs.vdev.async_read_max_active      3
>>>         vfs.zfs.vdev.async_write_min_active     1
>>>         vfs.zfs.vdev.async_write_max_active     10
>>>         vfs.zfs.vdev.scrub_min_active           1
>>>         vfs.zfs.vdev.scrub_max_active           2
>>>         vfs.zfs.vdev.trim_min_active            1
>>>         vfs.zfs.vdev.trim_max_active            64
>>>         vfs.zfs.vdev.aggregation_limit          131072
>>>         vfs.zfs.vdev.read_gap_limit             32768
>>>         vfs.zfs.vdev.write_gap_limit            4096
>>>         vfs.zfs.vdev.bio_flush_disable          0
>>>         vfs.zfs.vdev.bio_delete_disable         0
>>>         vfs.zfs.vdev.trim_max_bytes             2147483648
>>>         vfs.zfs.vdev.trim_max_pending           64
>>>         vfs.zfs.max_auto_ashift                 13
>>>         vfs.zfs.min_auto_ashift                 9
>>>         vfs.zfs.zil_replay_disable              0
>>>         vfs.zfs.cache_flush_disable             0
>>>         vfs.zfs.zio.use_uma                     1
>>>         vfs.zfs.zio.exclude_metadata            0
>>>         vfs.zfs.sync_pass_deferred_free         2
>>>         vfs.zfs.sync_pass_dont_compress         5
>>>         vfs.zfs.sync_pass_rewrite               2
>>>         vfs.zfs.snapshot_list_prefetch          0
>>>         vfs.zfs.super_owner                     0
>>>         vfs.zfs.debug                           0
>>>         vfs.zfs.version.ioctl                   4
>>>         vfs.zfs.version.acl                     1
>>>         vfs.zfs.version.spa                     5000
>>>         vfs.zfs.version.zpl                     5
>>>         vfs.zfs.vol.mode                        1
>>>         vfs.zfs.vol.unmap_enabled               1
>>>         vfs.zfs.trim.enabled                    1
>>>         vfs.zfs.trim.txg_delay                  32
>>>         vfs.zfs.trim.timeout                    30
>>>         vfs.zfs.trim.max_interval               1
>>> 
>>> ------------------------------------------------------------------------
>>> 
>>> 
>>> _______________________________________________
>>> freebsd-fs@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>> 
>> -- 
>> Karl Denninger
>> karl@denninger.net
>> /The Market Ticker/
>> 
>> 
> 
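[Editor's note] Karl's `vmstat -z` suggestion earlier in the thread can be scripted so the zio-related UMA zone counts are easy to watch while a delete runs. The sketch below is illustrative only: the embedded sample listing and its numbers are made up, and on a live FreeBSD host you would pipe in real `vmstat -z` output instead.

```shell
#!/bin/sh
# Sum the USED and FREE columns for zio-related UMA zones from a
# `vmstat -z`-style listing (FreeBSD 10 column order assumed:
# ITEM, SIZE, LIMIT, USED, FREE, ...). The sample data below is
# fabricated for illustration.
sample='ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
zio_cache:              944,      0,   12345,     678,  9999999,   0,   0
zio_buf_512:            512,      0,    1000,     200,    50000,   0,   0
mbuf:                   256,      0,      50,      10,      100,   0,   0'

result=$(printf '%s\n' "$sample" | awk '
  /^zio/ { gsub(/[,:]/, ""); used += $4; free += $5 }
  END    { printf "zio zones: used=%d free=%d", used, free }')
echo "$result"
```

On a live system, the same awk filter would run against `vmstat -z` directly, repeated periodically while the removal progresses to see whether the zio zones are where the wired memory is going.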
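[Editor's note] The advice above, to be mindful when freeing lots of space all at once on 10.x, can be followed mechanically by deleting in batches and pausing between batches so the pool's deferred freeing can drain. A rough sketch, assuming POSIX sh plus the common BSD/GNU `find -print0` / `xargs -0` extensions; the function name and all parameters are placeholders, not anything from this thread.

```shell
#!/bin/sh
# batch_rm DIR BATCH PAUSE: remove regular files under DIR in batches
# of BATCH files, sleeping PAUSE seconds between batches so that ZFS
# background freeing can catch up before more space is queued.
batch_rm() {
    dir=$1
    batch=$2
    export PAUSE=$3
    find "$dir" -type f -print0 |
        xargs -0 -n "$batch" sh -c 'rm -f -- "$@"; sleep "$PAUSE"' rm-batch
}
```

Something like `batch_rm /pool/data/old 100 60` would pace a multi-terabyte removal over a longer window instead of handing the pool 25TB of frees in one burst.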