Date: Wed, 8 Feb 2012 20:05:24 -0500
From: Charles Sprickman <spork@bway.net>
To: Miroslav Lachman <000.fbsd@quip.cz>
Cc: "Eugene M. Zheganin" <emz@norma.perm.ru>, freebsd-stable <freebsd-stable@FreeBSD.org>, Andriy Gapon <avg@FreeBSD.org>
Subject: Re: zfs arc and amount of wired memory
Message-ID: <E251C55F-9A35-457C-B876-E0D1029DD6C0@bway.net>
In-Reply-To: <4F330F38.3010806@quip.cz>
References: <4F30E284.8080905@norma.perm.ru> <4F310115.3070507@FreeBSD.org> <4F310C5A.6070400@norma.perm.ru> <4F310E75.7090301@FreeBSD.org> <4F3144A9.2000505@norma.perm.ru> <4F314892.50806@FreeBSD.org> <4F314B5B.100@norma.perm.ru> <4F3186C6.8000904@FreeBSD.org> <4F324F10.2060508@norma.perm.ru> <4F32DB30.6020600@FreeBSD.org> <4F330F38.3010806@quip.cz>
On Feb 8, 2012, at 7:11 PM, Miroslav Lachman wrote:

> Andriy Gapon wrote:
>> on 08/02/2012 12:31 Eugene M. Zheganin said the following:
>>> Hi.
>>>
>>> On 08.02.2012 02:17, Andriy Gapon wrote:
>>>> [output snipped]
>>>>
>>>> Thank you. I don't see anything suspicious/unusual there.
>>>> Just in case, do you have ZFS dedup enabled by any chance?
>>>>
>>>> I think that examination of vmstat -m and vmstat -z outputs may provide
>>>> some clues as to what got all that memory wired.
>>>>
>>> Nope, I don't have the deduplication feature enabled.
>>
>> OK. So, did you have a chance to inspect vmstat -m and vmstat -z?
>>
>>> By the way, today, after eating another 100M of wired memory, this server
>>> hung with multiple non-stopping messages:
>>>
>>> swap_pager: indefinite wait buffer
>>>
>>> Since it's swapping on a zvol, it looks to me like it could be the
>>> resource starvation issue mentioned in another thread here ("Swap on
>>> zvol - recommendable?"); maybe it happens faster when the ARC isn't
>>> limited.
>>
>> It could very well be that swap on zvol doesn't work well when the kernel
>> itself is starved for memory.
>>
>>> So I want to ask - how do I report it, and what should I include in such
>>> a PR?
>>
>> I am leaving the swap-on-zvol issue aside. Your original problem doesn't
>> seem to be ZFS-related. I suspect that you might be running into some
>> kernel memory leak. If you manage to reproduce the high wired value again,
>> then vmstat -m and vmstat -z may provide some useful information.
>>
>> In this vein, do you use any out-of-tree kernel modules?
>> Also, can you try to monitor your system to see when the wired count grows?
>
> I am seeing something similar on one of our machines. This is an old 7.3
> with ZFS v13; that's why I did not report it.
>
> The machine is used as storage for backups made by rsync. Everything runs
> fine for about 107 days.
> Then backups get slower and slower because of some strange memory
> situation.
>
> Mem: 15M Active, 17M Inact, 3620M Wired, 420K Cache, 48M Buf, 1166M Free
>
> ARC Size:
>     Current Size:             1769 MB (arcsize)
>     Target Size (Adaptive):   512 MB (c)
>     Min Size (Hard Limit):    512 MB (zfs_arc_min)
>     Max Size (Hard Limit):    3584 MB (zfs_arc_max)
>
> The target size goes down to the min size, and after a few more days the
> system is so slow that I must reboot the machine. Then it runs fine for
> about 107 days and it all repeats again.
>
> You can see more on the MRTG graphs:
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
> There are links to other useful information at the top of the page
> (arc_summary, top, dmesg, fs usage, loader.conf).
>
> There you can see the nightly backups (higher CPU load starting at 01:13);
> otherwise the machine is idle.
>
> It corresponds with the ARC target size lowering over the last 5 days:
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html
>
> And with the ARC metadata cache overflowing its limit over the last 5 days:
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html

I'm not having luck finding it, but there's a known issue that exists even in
8.2 where some 32-bit counter overflows, or something along those lines. I
don't remember the exact logic, but when you hit it, it's at around 110 days
of uptime. Before it gets really bad (to the point where you either reboot or
get a memory exhaustion panic), you can see the ZFS "evict skips" counter
incrementing rapidly. Looking at that graph, that would be my guess as to
what's happening to you. It's easy to check - run one of the ARC stats
scripts, look for "evict_skips", note the number, and then run it again a few
minutes later. If it increases by more than a few hundred, you've hit the bug.
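The check described above can be sketched as a small script. This is a sketch,
not from the thread: it assumes the counter is exposed as the
kstat.zfs.misc.arcstats.evict_skip sysctl (as on 8.x; the name may differ on
your release), and the 500 threshold and 5-minute wait are stand-ins for "a
few hundred" and "a few minutes":

```shell
#!/bin/sh
# Sample the ARC evict-skip counter twice, a few minutes apart, and flag
# growth of more than a few hundred as a sign of the suspected eviction bug.

# True if the counter grew by more than ~a few hundred between two samples.
# Threshold of 500 is an assumption standing in for "a few hundred".
hit_bug() {
    [ $(( $2 - $1 )) -gt 500 ]
}

if [ "${1:-}" = "check" ]; then  # guard: only sample when asked to
    # FreeBSD-specific sysctl; adjust the name for your release.
    before=$(sysctl -n kstat.zfs.misc.arcstats.evict_skip)
    sleep 300                    # wait ~5 minutes between samples
    after=$(sysctl -n kstat.zfs.misc.arcstats.evict_skip)
    echo "evict_skip: $before -> $after (delta $(( after - before )))"
    if hit_bug "$before" "$after"; then
        echo "rapid evict_skip growth: likely the known eviction bug"
    fi
fi
```

Run it as "sh evict-check.sh check" on the affected host; with no argument it
only defines the helper and does nothing.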
You'll find at that point that the kernel is no longer evicting from the ARC,
and it will just continue to grow until bad things happen.

Charles

> I don't know what's going on, and I don't know if it is something known or
> fixed in newer releases. We are running a few more ZFS systems on 8.2
> without this issue, but those systems are in different roles.
>
> Miroslav Lachman
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
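For the periodic monitoring Andriy suggests above (watching when the wired
count grows, alongside vmstat -m and vmstat -z), a minimal sketch follows.
The log path and 10-minute interval are arbitrary choices, and it assumes
FreeBSD's vm.stats.vm.v_wire_count sysctl and a 4 KiB page size (check
hw.pagesize):

```shell
#!/bin/sh
# Append a timestamped snapshot of wired memory plus vmstat -m / vmstat -z
# every INTERVAL seconds, so the log shows *when* wired starts to grow.
LOG=/var/log/wired-monitor.log   # arbitrary path, not from the thread
INTERVAL=600                     # sample every 10 minutes

# Convert a page count to MB, assuming 4 KiB pages.
pages_to_mb() {
    echo $(( $1 * 4096 / 1048576 ))
}

if [ "${1:-}" = "run" ]; then    # guard: only sample when asked to
    while :; do
        {
            date
            wired=$(sysctl -n vm.stats.vm.v_wire_count)
            echo "wired: $wired pages ($(pages_to_mb "$wired") MB)"
            vmstat -m            # kernel malloc(9) statistics
            vmstat -z            # uma(9) zone statistics
            echo "----"
        } >> "$LOG"
        sleep "$INTERVAL"
    done
fi
```

Comparing successive snapshots against the backup schedule should show
whether the wired growth tracks the nightly rsync runs or something else.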