Date:      Tue, 28 Sep 2010 19:30:01 +0300
From:      Andriy Gapon <avg@icyb.net.ua>
To:        Ben Kelly <ben@wanderview.com>
Cc:        stable@freebsd.org, fs@freebsd.org
Subject:   Re: Still getting kmem exhausted panic
Message-ID:  <4CA21809.7090504@icyb.net.ua>
In-Reply-To: <FE116FEC-714D-4BF5-86D8-E29BFA713C69@wanderview.com>
References:  <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> <FE116FEC-714D-4BF5-86D8-E29BFA713C69@wanderview.com>

on 28/09/2010 18:50 Ben Kelly said the following:
> 
> On Sep 28, 2010, at 9:36 AM, Andriy Gapon wrote:
>> Well, no time for me to dig through all that history. arc_max should be a
>> hard limit and it is now. If it ever wasn't then it was a bug.
> 
> I believe the size of the arc could exceed the limit if your working set was
> larger than arc_max.  The arc can't (couldn't then, anyway) evict data that is
> still referenced.

I think that you are correct and I was wrong.
ARC would still allocate a new buffer even if it's at or above arc_max and cannot
re-use any existing buffer.
But I think that this is more likely to happen with a "tiny" ARC size.  I have a
hard time imagining a workload in which gigabytes of data would be simultaneously
and continuously used (see below for the definition of "used").

> A contributing factor at the time was that the page daemon did not take into
> account back pressure from the arc when deciding which pages to move from
> active to inactive, etc.  So data was more likely to be referenced and
> therefore forced to remain in the arc.

I don't think that this is what happened and I don't think that pagedaemon has
anything to do with the discussed issue.
I think that ARC buffers exist independently of pagedaemon and page cache.
I think that they are held (that is, "used") only while I/O is happening to or
from them.

> I'm not sure if this is still the current state.  I seem to remember some
> changesets mentioning arc back pressure at some point, but I don't know the
> details.

I think that backpressure has nothing to do with it.
If ZFS is truly doing I/O on all existing buffers and it needs a new buffer, then
the choices are limited: either block and wait, or go over the limit.
Apparently the ZFS designers went with the latter option.
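
To make the trade-off concrete, here is a minimal toy model of that policy. It
is only a sketch: ToyCache, get_buf() and release() are hypothetical names made
up for illustration, and none of this is the real arc_get_data_buf() logic. It
only shows the "go over the limit instead of blocking" choice.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCache:
    """Toy model only; not the real ARC data structures."""
    limit: int                 # stands in for arc_max (bytes)
    size: int = 0              # bytes currently allocated
    idle: list = field(default_factory=list)  # sizes of buffers with no I/O in flight
    over_limit_allocs: int = 0

    def get_buf(self, nbytes):
        # Below the limit: simply allocate a new buffer.
        if self.size + nbytes <= self.limit:
            self.size += nbytes
            return "allocated"
        # At or above the limit: try to re-use an idle buffer of the same size.
        if nbytes in self.idle:
            self.idle.remove(nbytes)
            return "recycled"
        # Every buffer is busy: allocate anyway rather than block and wait.
        self.size += nbytes
        self.over_limit_allocs += 1
        return "over-limit"

    def release(self, nbytes):
        # I/O finished; the buffer becomes a candidate for re-use.
        self.idle.append(nbytes)
```

In this model the cache only grows past the limit when every buffer is busy at
the moment of the request, which is why a small limit is exceeded far more
easily than a large one.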

But as I've said, for non-tiny ARC sizes it's hard to imagine such an amount of
parallel I/O that it would tie up all ARC buffers.  Given the adaptive nature of
ARC I can still see it happening, but only when the ARC size is near its minimum,
not when it is at its maximum.

It seems that kstat.zfs.misc.arcstats.recycle_miss is a counter of allocations
for which ARC refused to grow and no existing buffer could be recycled, but this
is not the same as going above the ARC maximum size.
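
One way to check the second condition directly is to compare the current ARC
size with its maximum instead of looking at recycle_miss. The sketch below
parses text in the "name: value" form printed by sysctl(8) for
kstat.zfs.misc.arcstats. The helper names and the sample numbers are made up
for illustration; only the counter names (recycle_miss, size, c_max) are real.

```python
PREFIX = "kstat.zfs.misc.arcstats."


def parse_arcstats(text):
    """Turn 'kstat.zfs.misc.arcstats.NAME: VALUE' lines into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.startswith(PREFIX):
            stats[name[len(PREFIX):]] = int(value.strip())
    return stats


def bytes_over_max(stats):
    """How far the ARC size is above c_max (0 if it is not above it)."""
    return max(0, stats["size"] - stats["c_max"])


# Illustrative values only, not taken from a real system.
SAMPLE = """\
kstat.zfs.misc.arcstats.recycle_miss: 1200
kstat.zfs.misc.arcstats.size: 900
kstat.zfs.misc.arcstats.c_max: 1000
"""

stats = parse_arcstats(SAMPLE)
print(stats["recycle_miss"], bytes_over_max(stats))
```

With these sample numbers recycle_miss is non-zero while the size is still below
c_max, which is the distinction drawn above.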

BTW, such an allocation over the limit could be considered a form of memory
pressure from ARC on the rest of the system.

P.S.
The code is in arc_get_data_buf().

-- 
Andriy Gapon
