Date: Tue, 28 Sep 2010 18:01:21 -0400
From: Ben Kelly <ben@wanderview.com>
To: Andriy Gapon <avg@icyb.net.ua>
Cc: stable@freebsd.org, fs@freebsd.org
Subject: Re: Still getting kmem exhausted panic
Message-ID: <5BD33772-C0EA-48A9-BE9A-C8FBAF0008D7@wanderview.com>
In-Reply-To: <4CA25E92.4060904@icyb.net.ua>
References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> <FE116FEC-714D-4BF5-86D8-E29BFA713C69@wanderview.com> <4CA21809.7090504@icyb.net.ua> <71D54408-4B97-4F7A-BD83-692D8D23461A@wanderview.com> <4CA22337.2010900@icyb.net.ua> <F244BA6D-3347-4D76-BAFB-D8B975783877@wanderview.com> <4CA25E92.4060904@icyb.net.ua>
On Sep 28, 2010, at 5:30 PM, Andriy Gapon wrote:

<< snipped lots of good info here... probably won't have time to look at it in detail until the weekend >>

>> there seems to be a layering violation in that the buffer cache signals
>> directly to the upper page daemon layer to trigger page reclamation.)
>
> Umm, not sure if that is a fact.

I was referring to the code in vfs_bio.c that used to twiddle vm_pageout_deficit directly. That seems to have been replaced with a call to vm_page_grab().

>> The old (ancient) patch I tried previously to help reduce the arc working set
>> and allow it to shrink is here:
>>
>> http://www.wanderview.com/svn/public/misc/zfs/zfs_kmem_limit.diff
>>
>> Unfortunately, there are a couple of ideas on fighting fragmentation mixed into
>> that patch. See the part about arc_reclaim_pages(). This patch did seem to
>> allow my arc to stay under the target maximum even under load that previously
>> caused the system to exceed the maximum. When I update this weekend I'll try a
>> stripped-down version of the patch to see whether it helps with the latest zfs.
>>
>> Thanks for your help in understanding this stuff!
>
> The patch seems good, especially the part about taking into account the kmem
> fragmentation. But it also seems to be heavily tuned towards "tiny ARC" systems
> like yours, so I am not sure yet how suitable it is for "mainstream" systems.

Thanks. Yeah, there is a lot of aggressive tuning there. In particular, the slow-growth algorithm is somewhat dubious. What I found, though, was that fragmentation jumped whenever the arc was reduced in size, so the algorithm was an attempt to make the size slowly approach peak load without overshooting.

A better long-term solution would probably be to enhance UMA to support custom slab sizes on a zone-by-zone basis. That way all zfs/arc allocations could use slabs of 128k (at a memory efficiency penalty, of course).
I prototyped this with a dumbed-down block pool allocator at one point and was able to avoid most, if not all, of the fragmentation. Adding that support to UMA seemed non-trivial, though.

Thanks again for the information. I hope to get a chance to look at the code this weekend.

- Ben
