Date:      Wed, 8 Feb 2012 20:05:24 -0500
From:      Charles Sprickman <spork@bway.net>
To:        Miroslav Lachman <000.fbsd@quip.cz>
Cc:        "Eugene M. Zheganin" <emz@norma.perm.ru>, freebsd-stable <freebsd-stable@FreeBSD.org>, Andriy Gapon <avg@FreeBSD.org>
Subject:   Re: zfs arc and amount of wired memory
Message-ID:  <E251C55F-9A35-457C-B876-E0D1029DD6C0@bway.net>
In-Reply-To: <4F330F38.3010806@quip.cz>
References:  <4F30E284.8080905@norma.perm.ru> <4F310115.3070507@FreeBSD.org>	<4F310C5A.6070400@norma.perm.ru> <4F310E75.7090301@FreeBSD.org>	<4F3144A9.2000505@norma.perm.ru> <4F314892.50806@FreeBSD.org>	<4F314B5B.100@norma.perm.ru> <4F3186C6.8000904@FreeBSD.org>	<4F324F10.2060508@norma.perm.ru> <4F32DB30.6020600@FreeBSD.org> <4F330F38.3010806@quip.cz>


On Feb 8, 2012, at 7:11 PM, Miroslav Lachman wrote:

> Andriy Gapon wrote:
>> on 08/02/2012 12:31 Eugene M. Zheganin said the following:
>>> Hi.
>>>
>>> On 08.02.2012 02:17, Andriy Gapon wrote:
>>>> [output snipped]
>>>>
>>>> Thank you.  I don't see anything suspicious/unusual there.
>>>> Just in case, do you have ZFS dedup enabled by any chance?
>>>>
>>>> I think that examination of vmstat -m and vmstat -z outputs may
>>>> provide some clues as to what got all that memory wired.
>>>>
>>> Nope, I don't have the deduplication feature enabled.
>>
>> OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?
>>
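(A quick way to spot the biggest consumers in those two outputs, by the
way, is to sort them.  The column positions below are what I see on my
8.x boxes and may differ on other releases, and malloc type names
containing spaces will sort on the wrong column:

  # kernel malloc types, largest MemUse first (MemUse is the 3rd column)
  vmstat -m | sort -rn -k 3 | head -n 20
  # UMA zones, sorted by USED (fields after the zone name are comma-separated);
  # USED is an item count, so multiply by SIZE to get bytes
  vmstat -z | sort -t , -rn -k 3 | head -n 20
)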
>>> By the way, today, after eating another 100M of wired memory, this
>>> server hung with a non-stop stream of messages
>>>
>>> swap_pager: indefinite wait buffer
>>>
>>> Since it's swapping on a zvol, it looks to me like it could be the
>>> resource starvation issue mentioned in another thread here ("Swap on
>>> zvol - recommendable?"); maybe it happens faster when the ARC isn't
>>> limited.
>>
>> It could very well be that swap on a zvol doesn't work well when the
>> kernel itself is starved of memory.
>>
>>> So I want to ask - how do I report it, and what should I include in
>>> such a PR?
>>
>> I am leaving the swap-on-zvol issue aside.  Your original problem
>> doesn't seem to be ZFS-related.  I suspect that you might be running
>> into some kernel memory leak.  If you manage to reproduce the high
>> wired value again, then vmstat -m and vmstat -z may provide some
>> useful information.
>>
>> In this vein, do you use any out-of-tree kernel modules?
>> Also, can you try to monitor your system to see when the wired count
>> grows?
>
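(For the monitoring part, even a dumb loop appended to a log file is
usually enough to see when the jump happens; the path and interval
below are just examples:

  # sample the wired page count every 10 minutes; bytes = count * page size
  while :; do
          date
          sysctl vm.stats.vm.v_wire_count hw.pagesize
          sleep 600
  done >> /var/log/wired.log
)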
> I am seeing something similar on one of our machines. It is an old 7.3
> with ZFS v13, which is why I did not report it.
>
> The machine is used as storage for backups made by rsync. Everything
> runs fine for about 107 days. Then backups get slower and slower
> because of some strange memory situation.
>
> Mem: 15M Active, 17M Inact, 3620M Wired, 420K Cache, 48M Buf, 1166M Free
>
> ARC Size:
>         Current Size:             1769 MB (arcsize)
>         Target Size (Adaptive):   512 MB (c)
>         Min Size (Hard Limit):    512 MB (zfs_arc_min)
>         Max Size (Hard Limit):    3584 MB (zfs_arc_max)
>
> The target size goes down to the min size, and after a few more days
> the system is so slow that I must reboot the machine. Then it runs
> fine for about 107 days and then it all repeats again.
>
> You can see more in the MRTG graphs:
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
> You can see links to other useful information at the top of the page
> (arc_summary, top, dmesg, fs usage, loader.conf)
>
> There you can see the nightly backups (higher CPU load starting at
> 01:13); otherwise the machine is idle.
>
> It corresponds with the ARC target size decreasing over the last 5 days:
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html
>
> And with the ARC metadata cache overflowing its limit over the last 5 days:
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html
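Those arc_summary numbers come straight from sysctl, so you can watch
them between backup runs without the script.  The names below are what
I see on 8.x and may be spelled slightly differently on an old 7.3:

  # current ARC size, target, min/max, and metadata usage vs. its limit
  sysctl kstat.zfs.misc.arcstats.size \
         kstat.zfs.misc.arcstats.c \
         kstat.zfs.misc.arcstats.c_min \
         kstat.zfs.misc.arcstats.c_max \
         vfs.zfs.arc_meta_used \
         vfs.zfs.arc_meta_limit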

I'm not having any luck finding it, but there's a known issue that
exists even in 8.2 where some 32-bit counter overflows or something.  I
don't really remember the logic behind it, but when you hit it, it's at
around 110 days of uptime or so.  Before it gets really bad (to the
point where you either reboot or get a memory exhaustion panic), you
can see the ZFS "evict skips" counter incrementing rapidly.  Looking at
that graph, that would be my guess as to what's happening to you.  It's
easy to check: run one of the ARC stats scripts, look for
"evict_skips", note the number, and then run it again a few minutes
later.  If it has increased by more than a few hundred, you've hit the
bug.  At that point the kernel is no longer "evicting" ARC memory, and
it will just continue to grow until bad things happen.
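If it helps, the same check without the script looks roughly like this
on my boxes (the kstat itself is spelled evict_skip; adjust the name if
your version reports it differently):

  a=$(sysctl -n kstat.zfs.misc.arcstats.evict_skip)
  sleep 300
  b=$(sysctl -n kstat.zfs.misc.arcstats.evict_skip)
  # growth of more than a few hundred in a few minutes means you're hitting it
  echo "evict_skip grew by $((b - a)) in 5 minutes"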

Charles

>
> I don't know what's going on, and I don't know if it is something
> known / fixed in newer releases. We are running a few more ZFS systems
> on 8.2 without this issue, but those systems are in different roles.
>
> Miroslav Lachman
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"



