Message-ID: <4CA21809.7090504@icyb.net.ua>
Date: Tue, 28 Sep 2010 19:30:01 +0300
From: Andriy Gapon
To: Ben Kelly
Cc: stable@freebsd.org, fs@freebsd.org
Subject: Re: Still getting kmem exhausted panic

on 28/09/2010 18:50 Ben Kelly said the following:
>
> On Sep 28, 2010, at 9:36 AM, Andriy Gapon wrote:
>> Well, no time for me to dig through all that history. arc_max should be a
>> hard limit and it is now. If it ever wasn't then it was a bug.
>
> I believe the size of the arc could exceed the limit if your working set was
> larger than arc_max. The arc can't (couldn't then, anyway) evict data that is
> still referenced.

I think that you are correct and I was wrong. ARC would still allocate a new
buffer even if it's at or above arc_max and cannot re-use any existing buffer.
But I think that this is more likely to happen with a "tiny" ARC size. I have a
hard time imagining a workload in which gigabytes of data would be
simultaneously and continuously used (see below for the definition of "used").

> A contributing factor at the time was that the page daemon did not take into
> account back pressure from the arc when deciding which pages to move from
> active to inactive, etc. So data was more likely to be referenced and
> therefore forced to remain in the arc.

I don't think that this is what happened and I don't think that pagedaemon has
anything to do with the discussed issue.
I think that ARC buffers exist independently of pagedaemon and page cache.
I think that they are held only during the time when I/O is happening to or
from them.

> I'm not sure if this is still the current state. I seem to remember some
> changesets mentioning arc back pressure at some point, but I don't know the
> details.

I think that backpressure has nothing to do with it.
If ZFS truly does I/O with all existing buffers and it needs a new buffer, then
the choices are limited: either block and wait, or go over the limit.
Apparently ZFS designers went with the latter option.
But as I've said, for non-tiny ARC sizes it's hard to imagine such an amount of
parallel I/O that would tie up all ARC buffers.
Given the adaptive nature of ARC I can still see it happening, but only when
ARC size is near its minimum, not when it is at maximum.

It seems that kstat.zfs.misc.arcstats.recycle_miss is a counter of allocations
when ARC refused to grow and no existing buffer could be recycled, but this is
not the same as going above ARC maximum size.
BTW, such allocation over the limit could be considered as a form of memory
pressure from ARC on the rest of the system.
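
To make the choice above concrete, here is a toy model in C of the decision as
I understand it. It is only a sketch under my own assumptions, not the actual
ZFS code: every name in it (toy_arc, toy_buf, toy_get_data_buf, over_max) is
hypothetical, sizes are counted in buffers rather than bytes, and "busy" simply
means that I/O is in flight on a buffer. It shows why a recycle miss and an
allocation above the maximum are two different events.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names come from the ZFS sources. */
struct toy_buf {
    bool in_io;                 /* true while I/O is in flight on this buffer */
};

struct toy_arc {
    struct toy_buf *bufs;       /* caller-provided storage */
    size_t capacity;            /* slots available in bufs[] */
    size_t nbufs;               /* buffers currently in the cache */
    size_t target;              /* current adaptive target size, in buffers */
    size_t max;                 /* analogue of arc_max, in buffers */
    unsigned long recycle_miss; /* cache would not grow, nothing to recycle */
    unsigned long over_max;     /* allocations made at or above max */
};

/*
 * Hand out a buffer for new data and mark it busy.
 * Returns its index, or (size_t)-1 if the toy storage itself is full.
 */
size_t toy_get_data_buf(struct toy_arc *arc)
{
    assert(arc != NULL && arc->bufs != NULL);

    if (arc->nbufs >= arc->target) {
        /* The cache does not want to grow: try to re-use an idle buffer. */
        for (size_t i = 0; i < arc->nbufs; i++) {
            if (!arc->bufs[i].in_io) {
                arc->bufs[i].in_io = true;
                return i;
            }
        }
        /* Every buffer is busy: count the miss, then allocate anyway. */
        arc->recycle_miss++;
    }

    if (arc->nbufs == arc->capacity)
        return (size_t)-1;

    /* Going over the limit instead of blocking until a buffer is free. */
    if (arc->nbufs >= arc->max)
        arc->over_max++;

    arc->bufs[arc->nbufs].in_io = true;
    return arc->nbufs++;
}
```

With target below max, the miss counter in this model goes up without the size
ever exceeding max; the two counters move together only when target has already
reached max and every buffer is busy.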

P.S.
The code is in arc_get_data_buf().

-- 
Andriy Gapon