Date: Fri, 19 Aug 2016 23:18:40 +0300
From: Slawa Olhovchenkov
To: Karl Denninger, freebsd-fs@freebsd.org
Subject: Re: ZFS ARC under memory pressure
Message-ID: <20160819201840.GA12519@zxy.spb.ru>

On Thu, Aug 18, 2016 at 03:31:26PM -0500, Karl Denninger wrote:

> On 8/18/2016 15:26, Slawa Olhovchenkov wrote:
> > On Thu, Aug 18, 2016 at 11:00:28PM +0300, Andriy Gapon wrote:
> >
> >> On 16/08/2016 22:34, Slawa Olhovchenkov wrote:
> >>> I see issues with ZFS ARC under memory pressure.
> >>> ZFS ARC size can be dramatically reduced, down to arc_min.
> >>>
> >>> As I see it, a memory pressure event causes a call to arc_lowmem,
> >>> which sets needfree:
> >>>
> >>> arc.c:arc_lowmem
> >>>
> >>>     needfree = btoc(arc_c >> arc_shrink_shift);
> >>>
> >>> After this, arc_available_memory returns negative values (PAGESIZE *
> >>> (-needfree)) until needfree is zero, independent of how much memory
> >>> has been freed. needfree is set to 0 in arc_reclaim_thread() when
> >>> arc_size <= arc_c, so not until arc_size drops below arc_c (and
> >>> arc_c is decreased at every loop iteration).
> >>>
> >>> arc_c drops to its minimum value if arc_size drops fast enough.
> >>>
> >>> There is no check of current against initial memory allocation.
> >>>
> >>> As a result, I see needless ARC reclaim, by a factor of 10x to 100x.
> >>>
> >>> Can someone check my analysis and comment on this?
> >> You might have found a real problem here, but I am short of time right now to
> >> properly analyze the issue. I think that on illumos 'needfree' is a variable
> >> that's managed by the virtual memory system and it is akin to our
> >> vm_pageout_deficit. But during the porting it became an artificial value and
> >> its handling might be sub-optimal.
> > As I see it, totally not optimal.
> > I have created a patch for the sub-optimal handling and am testing it now.
>
> You might want to look at the code contained in here:
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594

In my case the arc.c issue is caused by revision r286625 in HEAD (and
r288562 in STABLE), both from 2015; this code was not touched in 2014.
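
To make the needfree behaviour quoted above easier to check, here is a
toy model in C. It is only a sketch under stated assumptions, not the
arc.c code: the struct, the model_* function names and the evict_per_pass
parameter are hypothetical, sizes are plain numbers rather than bytes, and
needfree is kept in the same unit instead of pages. It models only what
the quoted description says: needfree is set from arc_c >> arc_shrink_shift,
the available-memory value stays negative while needfree is pending, arc_c
is lowered on every pass, and needfree clears once arc_size <= arc_c.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: field names mirror the variables named in the thread. */
struct arc_model {
	uint64_t arc_c;        /* target cache size */
	uint64_t arc_size;     /* current cache size */
	uint64_t arc_c_min;    /* floor for arc_c */
	uint64_t needfree;     /* non-zero while the low-memory request is pending */
	unsigned shrink_shift; /* stands in for arc_shrink_shift */
};

/* Low-memory event: request a fraction of the target, as in the quoted line. */
static void
model_lowmem(struct arc_model *m)
{
	m->needfree = m->arc_c >> m->shrink_shift;
}

/* Negative while needfree is pending, no matter how much was already freed. */
static int64_t
model_available_memory(const struct arc_model *m)
{
	return (m->needfree != 0 ? -(int64_t)m->needfree : 0);
}

/*
 * One pass of the modelled reclaim loop. evict_per_pass (hypothetical) caps
 * how far arc_size can fall in a single pass. Returns 1 while needfree is
 * still pending.
 */
static int
model_reclaim_pass(struct arc_model *m, uint64_t evict_per_pass)
{
	assert(m->arc_c >= m->arc_c_min);
	if (model_available_memory(m) < 0) {
		uint64_t step = m->arc_c >> m->shrink_shift;

		/* Lower the target again, but never below arc_c_min. */
		if (m->arc_c - m->arc_c_min > step)
			m->arc_c -= step;
		else
			m->arc_c = m->arc_c_min;
	}
	if (m->arc_size > m->arc_c) {
		uint64_t over = m->arc_size - m->arc_c;

		m->arc_size -= (over < evict_per_pass ? over : evict_per_pass);
	}
	/* The pending request clears only once the size reaches the target. */
	if (m->arc_size <= m->arc_c)
		m->needfree = 0;
	return (m->needfree != 0);
}

/* Trigger one low-memory event and count passes until needfree clears. */
static unsigned
model_run(struct arc_model *m, uint64_t evict_per_pass, unsigned max_passes)
{
	unsigned passes = 0;

	model_lowmem(m);
	while (passes < max_passes) {
		passes++;
		if (!model_reclaim_pass(m, evict_per_pass))
			break;
	}
	return (passes);
}
```

Starting from arc_c = arc_size = 1024, arc_c_min = 128 and shrink_shift = 5,
a call such as model_run(&m, 64, 10000) clears the request after one pass
with arc_c at 992. With evict_per_pass = 16 the request stays pending after
the first pass, so arc_c keeps being lowered on later passes even though only
one low-memory event was raised.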

> There are some ugly interactions with the VM system you can run into if
> you're not careful; I've chased this issue before and while I haven't
> yet done the work to integrate it into 11.x (and the underlying code
> *has* changed since the 10.x patches I developed) if you wind up driving
> the VM system to evict pages to swap rather than pare back ARC you're
> probably making the wrong choice.
>
> In addition UMA can come into the picture too and (at least previously)
> was a severe contributor to pathological behavior.

I am only making the shrink of the ARC size less aggressive (and more
controlled). Right now the ARC just collapses.

The PR you point to is really BIG; I can't read and understand all of
it. r286625 changed the behavior of the interaction between ARC and VM.
Does your problem still exist? Can you explain (on the list)?

-- 
Slawa Olhovchenkov