From: Slawa Olhovchenkov
To: Alexander Motin
Cc: lev@FreeBSD.org, Mark Johnston, freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org
Date: Wed, 22 May 2019 20:38:35 +0300
Subject: Re: Commit r345200 (new ARC reclamation threads) looks suspicious to me - second potential problem
Message-ID: <20190522173835.GB2161@zxy.spb.ru>
In-Reply-To: <5ea6d9bc-4fd3-d8e6-adf4-513b4edc71e3@FreeBSD.org>

On Wed, May 22, 2019 at 12:49:48PM -0400, Alexander Motin wrote:

> On 22.05.2019 12:19, Slawa Olhovchenkov wrote:
> > On Wed, May 22, 2019 at 12:07:29PM -0400, Alexander Motin wrote:
> >
> >> On 22.05.2019 11:50, Lev Serebryakov wrote:
> >>> On 22.05.2019 18:19, Alexander Motin wrote:
> >>>
> >>>>>> But looks like `arc_kmem_reap_soon()` is synchronous on FreeBSD! So,
> >>>>>> this `delay()` looks very wrong. Am I right?
> >>>>
> >>>> Why is it wrong?
> >>> One second pause after synchronous operation to wait for its completion?
> >>
> >> No. To rate-throttle them. This gives UMA a second to get back into
> >> minimally steady state after we ripped all caches from it. As I have
> >> told, we do not want to drain caches constantly in a tight loop, we want
> >> more or less steady state.
> >
> > And also (possibly) additionally delay arc_get_data_impl().
>
> arc_get_data_impl() depends on arc_adjust_zthr, not on arc_reap_zthr, so
> it should not get blocked by this delay. That was the motivation for
> the threads splitting in the last rewrite.

Next case: the system is under memory pressure and there is no memory
in the UMA caches. arc_get_data_impl() sees arc_size >= arc_c + overflow
(arc_is_overflowing()) and waits for arc_adjust_zthr.
arc_adjust_zthr, in arc_adjust_cb(), calls arc_adjust() and evicts a
small amount, but arc_size is still above arc_c (I think this is
possible: arc_c is lowered by arc_lowmem, or arc_size is raised by a
parallel arc_get_data_impl()), so cv_broadcast() is not called.

The next round has to wait for arc_reap_zthr, activated by arc_lowmem or
by timeout, to do its work:

    arc_kmem_reap_soon();
    delay();
    arc_reduce_target_size();

Under memory pressure, (arc_c >> arc_shrink_shift) > arc_available_memory()
is true, so arc_reduce_target_size() is called and re-activates
arc_adjust_zthr (asize > arc_c is true).

The delay() in this path is unnecessary and slows down arc_get_data_impl().

> > This is an incorrect throttling implementation.
>
> I am not particularly defending ZFS doing its own reclamation, I'd more
> trust pagedaemon, but so far I haven't seen any memory pressure issues
> after I committed this.
>
> --
> Alexander Motin
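
The reap-thread path described in the reply above can be sketched as a toy
model. This is not the code from r345200: every toy_* name is hypothetical,
the sizes are made-up values in arbitrary units, and the 1000 ms figure is
only an assumption taken from the "one second pause" mentioned earlier in
the thread. The sketch assumes, as the message does, that the reap itself
frees nothing and that the blocked allocation can only make progress once
the target size is reduced and the adjust thread is kicked again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy state; fields only loosely mirror arc_size, arc_c and friends. */
struct toy_arc {
	int64_t size;          /* current cache size (arc_size) */
	int64_t target;        /* target cache size (arc_c) */
	int64_t overflow;      /* slack allowed before allocations block */
	int64_t free_mem;      /* what arc_available_memory() would report */
	int     shrink_shift;  /* stands in for arc_shrink_shift */
	bool    adjust_kicked; /* adjust thread has been woken again */
	int64_t waited_ms;     /* time the blocked allocation has waited */
};

/* The allocation path blocks while this returns true. */
static bool
toy_is_overflowing(const struct toy_arc *a)
{
	return (a->size >= a->target + a->overflow);
}

/* Lower the target; kick the adjust thread if the cache is now too big. */
static void
toy_reduce_target_size(struct toy_arc *a, int64_t to_free)
{
	a->target = (a->target > to_free) ? a->target - to_free : 0;
	if (a->size > a->target)
		a->adjust_kicked = true;
}

/* One pass of the reap thread with a configurable pause after the reap. */
static void
toy_reap_pass(struct toy_arc *a, int64_t delay_ms)
{
	int64_t to_free;

	/* The cache reap would run here; in this scenario it frees nothing. */
	a->waited_ms += delay_ms;	/* the pause under discussion */

	to_free = (a->target >> a->shrink_shift) - a->free_mem;
	if (to_free > 0)
		toy_reduce_target_size(a, to_free);
}

/*
 * Run the scenario once and return how long (in ms) the blocked
 * allocation waited before the adjust thread was kicked again.
 */
static int64_t
toy_wait_until_kick(int64_t delay_ms)
{
	struct toy_arc a = {
		.size = 1100, .target = 1024, .overflow = 64,
		.free_mem = 0, .shrink_shift = 7,
		.adjust_kicked = false, .waited_ms = 0,
	};

	assert(toy_is_overflowing(&a));	/* allocation starts out blocked */
	toy_reap_pass(&a, delay_ms);
	assert(a.adjust_kicked);	/* target dropped below size */
	return (a.waited_ms);
}
```

In this model toy_wait_until_kick(1000) reports a 1000 ms wait and
toy_wait_until_kick(0) reports none; the only difference between the two
runs is the pause placed between the reap and the target reduction.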