Date:      Sun, 22 Aug 2010 16:52:00 -0700
From:      Artem Belevich <fbsdlist@src.cx>
To:        Andriy Gapon <avg@freebsd.org>
Cc:        freebsd-hackers@freebsd.org, zfs-devel@freebsd.org
Subject:   Re: ZFS arc_reclaim_needed: better cooperation with pagedaemon
Message-ID:  <AANLkTinreSt_Dk_J5vpZ6xrs=snqYu8zKfO0X6H-x_n3@mail.gmail.com>
In-Reply-To: <4C719AB9.9020006@freebsd.org>
References:  <4C719AB9.9020006@freebsd.org>

Do you by any chance have a graph showing kstat.zfs.misc.arcstats.size
behavior in addition to what is already plotted on your graphs?  All I
can tell from your graphs is that v_free_count+v_cache_count shifted a
bit lower relative to v_free_target+v_cache_min. It would be
interesting to see what effect your patch has on ARC itself,
especially when ARC starts giving up memory and when it stops
shrinking.

--Artem



On Sun, Aug 22, 2010 at 2:46 PM, Andriy Gapon <avg@freebsd.org> wrote:
>
> I propose that the following code in arc_reclaim_needed
> (sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c)
> /*
>  * If pages are needed or we're within 2048 pages
>  * of needing to page need to reclaim
>  */
> if (vm_pages_needed || (vm_paging_target() > -2048))
>
> be changed to
>
> if (vm_paging_needed())
>
> Rationale.
>
> 1. Why not current checks.
>
> ARC sizing should cooperate with pagedaemon in freeing pages.
> If ARC starts shrinking "prematurely", before pagedaemon is woken up, then no
> potentially eligible inactive pages will be recycled and no potentially
> eligible active pages will be deactivated (subject to v_inactive_target).
> This would lead to ARC size going to its minimum value (which could hurt ZFS
> performance).  Only after that is there a chance that pagedaemon would be
> woken up to do its cleaning.
> And conversely, if ARC doesn't shrink in time, then pagedaemon would have to
> recycle pages with data that could be needed again soon, and that would lead
> to excessive swapping and disk I/O.
>
> vm_paging_target() is used only by pagedaemon internally; it effectively sets
> an _upper_ limit on how many pages pagedaemon would free when it's activated.
> It is no indication of whether pagedaemon should be scanning/freeing pages.
> Thus the check of vm_paging_target() leads to premature ARC shrinking.
> I believe that many people observe this behavior on sufficiently active
> systems (not dedicated file servers) with a few GB of RAM (1-8).
>
> The vm_pages_needed check is redundant, because this is a flag that is used
> to wake up pagedaemon.  So when it is set vm_paging_needed() is true and
> vm_paging_target() is "way" above zero.  And this flag is reset to zero when
> vm_page_count_min() becomes false, which corresponds to even fewer free pages
> than when vm_paging_needed() is true.
>
>
> 2. Why the new check.
>
> vm_paging_needed() is the (earliest) condition that wakes up pagedaemon (see
> vm_page_alloc).  pagedaemon would first of all run the vm_lowmem event, for
> which ARC already has a handler and which would cause ARC size to shrink.
> It would seem that having the vm_paging_needed() check would be redundant
> then.
> Almost - if memory pressure is significant, then vm_paging_needed() may stay
> true for a while and that would cause additional ARC reduction by
> arc_reclaim_thread.
>
>
> Final notes.
>
> I think that
> vm_paging_target() > -2048
> check was modeled after the check in the original OpenSolaris code:
> freemem < lotsfree + needfree + extra
>
> The issue is that in my understanding OpenSolaris pagedaemon works
> differently from FreeBSD pagedaemon.
>
> OpenSolaris pagedaemon is activated when freemem (equivalent of our free +
> cache) falls down to a certain higher mark (lotsfree).  Initially it scans
> pages at a slow rate.  If freemem falls further the rate linearly increases
> until it reaches its maximum when freemem goes to or below a certain lower
> mark.
>
> Our pagedaemon is activated when free + cache falls down to a value at which
> vm_paging_needed() is true (see definition of this function).  When it is
> activated it makes a scan pass through inactive and active pages, setting a
> certain target for free+cache, but that target is "soft" and actually is an
> upper limit of how many pages could be freed during the pass.  pagedaemon
> would make the second (or subsequent) pass only if free+cache falls to a
> value that is even lower than the threshold in vm_paging_needed(), which
> means significant (severe even) memory pressure/shortage.
> So on a sufficiently active system free+cache would typically oscillate
> between v_free_reserved+v_cache_min at the bottom and some semi-random values
> "near" v_free_target+v_cache_min at the tops.  That's when excluding ARC from
> the picture.
>
> And about pictures :-)
> Behavior of free+cache with current arc_reclaim_needed code:
> http://people.freebsd.org/~avg/avail-mem-before.png
> and its behavior after the patch:
> http://people.freebsd.org/~avg/avail-mem-after.png
>
> The legends on the pictures are incorrect, sorry, my mastery of drraw is not
> good yet.
> Correct legends:
> "aqua" color - v_free_target+v_cache_min (vm_paging_target() == 0)
> "fuchsia" color - v_free_reserved+v_cache_min (vm_paging_needed() threshold)
> "lime" color - v_free_count+v_cache_count indeed :)
> Y axis - % of total page count.
>
> I think the graphs speak for themselves.
>
> --
> Andriy Gapon
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>
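
For readers following the thread, the two conditions being compared in the
quoted proposal can be modeled in userspace. This is only a rough sketch, not
kernel code: the struct and the model_* / reclaim_needed_* names are
hypothetical, and the two threshold expressions are assumed from the legend in
the quoted message (vm_paging_target() == 0 at v_free_target+v_cache_min, and
the vm_paging_needed() threshold at v_free_reserved+v_cache_min).

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the page counters named in the thread. */
struct vm_counters {
	int v_free_count;
	int v_cache_count;
	int v_free_target;
	int v_free_reserved;
	int v_cache_min;
	bool pages_needed;	/* models the vm_pages_needed flag */
};

/* Assumed model: zero when free+cache equals v_free_target+v_cache_min. */
static int
model_paging_target(const struct vm_counters *c)
{
	return (c->v_free_target + c->v_cache_min) -
	    (c->v_free_count + c->v_cache_count);
}

/* Assumed model: true once free+cache drops below v_free_reserved+v_cache_min. */
static bool
model_paging_needed(const struct vm_counters *c)
{
	return (c->v_free_reserved + c->v_cache_min) >
	    (c->v_free_count + c->v_cache_count);
}

/* The check currently in arc_reclaim_needed, as quoted in the proposal. */
static bool
reclaim_needed_current(const struct vm_counters *c)
{
	return c->pages_needed || model_paging_target(c) > -2048;
}

/* The proposed replacement check. */
static bool
reclaim_needed_proposed(const struct vm_counters *c)
{
	return model_paging_needed(c);
}
```

With made-up counters where free+cache sits 1000 pages above
v_free_target+v_cache_min, the current check already asks ARC to shrink while
the proposed one does not; both agree once free+cache falls below
v_free_reserved+v_cache_min.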


