From owner-freebsd-current@FreeBSD.ORG Fri Jan 31 18:45:42 2014
Date: Fri, 31 Jan 2014 20:26:37 +0200
From: Vitalij Satanivskij
To: Vladimir Sharun
Cc: Current FreeBSD
Subject: Re: ARC "pressured out", how to control/stabilize ? (reformatted to text/plain)
Message-ID: <20140131182637.GA82526@hell.ukr.net>
In-Reply-To: <1391083826.948700370.cmzf8475@frv45.ukr.net>
References: <52C93E4D.1050100@FreeBSD.org>
 <1389005433.815055146.2dcjke36@frv45.ukr.net>
 <52CA9963.1050507@FreeBSD.org>
 <1389676958.516993176.oq4lbgg7@frv45.ukr.net>
 <52D59E36.9040405@FreeBSD.org>
 <20140115102837.GA98983@hell.ukr.net>
 <52D66DB6.7030807@FreeBSD.org>
 <1390900795.258244476.v35k1338@frv45.ukr.net>
 <52EA3459.3070300@FreeBSD.org>
 <1391083826.948700370.cmzf8475@frv45.ukr.net>
List-Id: Discussions about the use of FreeBSD-current

Dear Andriy and FreeBSD community,

Buildworld with the patch failed with the following error:

/usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:4642:13:
error: use of undeclared identifier 'l2hdr'
            ASSERT3P(l2hdr->b_tmp_cdata, ==, NULL);
                     ^
/usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/sys/debug.h:125:40: note: expanded from macro 'ASSERT3P'
#define ASSERT3P(x, y, z) VERIFY3_IMPL(x, y, z, uintptr_t)
                                       ^
/usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/sys/debug.h:109:29: note: expanded from macro 'VERIFY3_IMPL'
    const TYPE __left = (TYPE)(LEFT); \
                               ^
1 error generated.
*** Error code 1

Vladimir Sharun wrote:
VS> Dear Andriy and FreeBSD community,
VS>
VS> L2ARC temporarily turned off by setting secondarycache=none everywhere it was enabled,
VS> so no more leak for one particular day.
VS>
VS> Here's the top header:
VS> last pid: 89916; load averages: 2.49, 2.91, 2.89 up 5+19:21:42 14:09:12
VS> 561 processes: 2 running, 559 sleeping
VS> CPU: 5.7% user, 0.0% nice, 14.0% system, 1.0% interrupt, 79.3% idle
VS> Mem: 23G Active, 1017M Inact, 98G Wired, 1294M Cache, 3285M Buf, 1997M Free
VS> ARC: 69G Total, 3498M MFU, 59G MRU, 53M Anon, 1651M Header, 4696M Other
VS> Swap:
VS>
VS> Here's the calculated vmstat -z (meaning only allocations exceeding 100*1024^2 are printed):
VS> UMA Slabs: 199,915M
VS> VM OBJECT: 207,354M
VS> 32: 205,558M
VS> 64: 901,122M
VS> 128: 215,211M
VS> 256: 242,262M
VS> 4096: 2316,01M
VS> range_seg_cache: 205,396M
VS> zio_buf_512: 1103,31M
VS> zio_buf_16384: 15697,9M
VS> zio_data_buf_16384: 348,297M
VS> zio_data_buf_24576: 129,352M
VS> zio_data_buf_32768: 104,375M
VS> zio_data_buf_36864: 163,371M
VS> zio_data_buf_53248: 100,496M
VS> zio_data_buf_57344: 105,93M
VS> zio_data_buf_65536: 101,75M
VS> zio_data_buf_73728: 111,938M
VS> zio_data_buf_90112: 104,414M
VS> zio_data_buf_106496: 100,242M
VS> zio_data_buf_131072: 61652,5M
VS> dnode_t: 3203,98M
VS> dmu_buf_impl_t: 797,695M
VS> arc_buf_hdr_t: 1498,76M
VS> arc_buf_t: 105,802M
VS> zfs_znode_cache: 352,61M
VS>
VS> zio_data_buf_131072 (61652M) + zio_buf_16384 (15698M) = 77350M
VS> easily exceeds ARC total (70G)
VS>
VS>
VS> Here are the same calculations from exactly the same system, where L2 was disabled before reboot:
VS> last pid: 63407; load averages: 2.35, 2.71, 2.73 up 8+19:42:54 14:17:33
VS> 527 processes: 1 running, 526 sleeping
VS> CPU: 4.8% user, 0.0% nice, 6.6% system, 1.1% interrupt, 87.4% idle
VS> Mem: 21G Active, 1460M Inact, 99G Wired, 1748M Cache, 3308M Buf, 952M Free
VS> ARC: 87G Total, 4046M MFU, 76G MRU, 37M Anon, 2026M Header, 4991M Other
VS> Swap:
VS>
VS> and the vmstat -z filtered:
VS> UMA Slabs: 208,004M
VS> VM OBJECT: 207,392M
VS> 32: 172,831M
VS> 64: 752,226M
VS> 128: 210,024M
VS> 256: 244,204M
VS> 4096: 2249,02M
VS> range_seg_cache: 245,711M
VS> zio_buf_512: 1145,25M
VS> zio_buf_16384: 15170,1M
VS> zio_data_buf_16384: 422,766M
VS> zio_data_buf_20480: 120,742M
VS> zio_data_buf_24576: 148,641M
VS> zio_data_buf_28672: 112,848M
VS> zio_data_buf_32768: 117,375M
VS> zio_data_buf_36864: 185,379M
VS> zio_data_buf_45056: 103,168M
VS> zio_data_buf_53248: 105,32M
VS> zio_data_buf_57344: 122,828M
VS> zio_data_buf_65536: 109,25M
VS> zio_data_buf_69632: 100,406M
VS> zio_data_buf_73728: 126,844M
VS> zio_data_buf_77824: 101,086M
VS> zio_data_buf_81920: 100,391M
VS> zio_data_buf_86016: 101,391M
VS> zio_data_buf_90112: 112,836M
VS> zio_data_buf_98304: 100,688M
VS> zio_data_buf_102400: 106,543M
VS> zio_data_buf_106496: 108,875M
VS> zio_data_buf_131072: 63190,5M
VS> dnode_t: 3437,36M
VS> dmu_buf_impl_t: 840,62M
VS> arc_buf_hdr_t: 1870,88M
VS> arc_buf_t: 114,942M
VS> zfs_znode_cache: 353,055M
VS>
VS> Everything seems within ARC total range.
VS>
VS> We will try the attached patch within a few days and will come back with the result.
VS>
VS> Thank you for your help.
VS>
VS> > on 28/01/2014 11:28 Vladimir Sharun said the following:
VS> > > Dear Andriy and FreeBSD community,
VS> > >
VS> > > After applying this patch one of the systems runs fine (disk subsystem load low to moderate
VS> > > - 10-20% busy sustained),
VS> > >
VS> > > Then I saw this patch was merged to HEAD and we applied it to one of the systems
VS> > > with moderate to high disk load: 30-60% busy (11.0-CURRENT #7 r261118: Fri Jan 24 17:25:08 EET 2014)
VS> > >
VS> > > Within 4 days we are experiencing the same leak(?) as without the patch:
VS> > >
VS> > > last pid: 53841; load averages: 4.47, 4.18, 3.78 up 3+16:37:09 11:24:39
VS> > > 543 processes: 6 running, 537 sleeping
VS> > > CPU: 8.7% user, 0.0% nice, 14.6% system, 1.4% interrupt, 75.3% idle
VS> > > Mem: 22G Active, 1045M Inact, 98G Wired, 1288M Cache, 3284M Buf, 2246M Free
VS> > > ARC: 73G Total, 3763M MFU, 62G MRU, 56M Anon, 1887M Header, 4969M Other
VS> > > Swap:
VS> > >
VS> > > The ARC is populated within 30mins under load to the max (90Gb) then starts decreasing.
VS> > >
VS> > > The delta between Wired and ARC total started growing from the typical 10-12Gb without L2 enabled
VS> > > to 25Gb with L2 enabled and counting (4 hours ago the delta was 22Gb).
VS> >
VS> > First, have you checked that vmstat -z output contains the same anomaly as
VS> > in your original report?
VS> >
VS> > If yes, then please try to reproduce the problem with the following debugging patch:
VS> > http://people.freebsd.org/~avg/l2arc-b_tmp_cdata-diag.patch
VS> > Please make sure to compile your kernel (and modules) with INVARIANTS.
VS> >
VS> > --
VS> > Andriy Gapon
VS> > _______________________________________________
VS> > freebsd-current@freebsd.org mailing list
VS> > http://lists.freebsd.org/mailman/listinfo/freebsd-current
VS> > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
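
For anyone wanting to reproduce the filtered vmstat -z totals quoted above, here is a minimal sketch. It is not the script Vladimir used. It assumes the usual FreeBSD layout of "name: size, limit, used, free, ..." per zone, and it assumes that a zone's footprint is size * (used + free). The function names and the sample numbers are made up for illustration only.

```python
# Sketch only: sum per-zone memory from `vmstat -z`-style text and keep the
# zones above 100*1024^2 bytes. Layout and formula are assumptions (see above).

# Invented sample data, NOT taken from the systems discussed in this thread.
SAMPLE = """\
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP

UMA Slabs:              568,      0,  360000,    9000,  900000,   0,   0
zio_buf_512:            512,      0,     100,      10,    5000,   0,   0
zio_data_buf_131072: 131072,      0,  490000,    3000, 9000000,   0,   0
"""

THRESHOLD = 100 * 1024**2  # same cut-off as quoted in the message


def zone_totals(text):
    """Return {zone name: bytes}, computed as size * (used + free)."""
    totals = {}
    for line in text.splitlines():
        # Zone names may contain spaces, so split on the last colon.
        name, sep, rest = line.rpartition(":")
        if not sep:
            continue  # header or blank line
        fields = [f.strip() for f in rest.split(",")]
        if len(fields) < 4:
            continue
        try:
            size, used, free = int(fields[0]), int(fields[2]), int(fields[3])
        except ValueError:
            continue  # not a data row
        totals[name.strip()] = size * (used + free)
    return totals


def large_zones(text, threshold=THRESHOLD):
    """Keep only zones whose total exceeds the threshold."""
    return {n: b for n, b in zone_totals(text).items() if b > threshold}


if __name__ == "__main__":
    for name, nbytes in large_zones(SAMPLE).items():
        print(f"{name}: {nbytes / 1024**2:.3f}M")
```

On a live system the text would come from the real command output instead of SAMPLE, for example by piping vmstat -z into the script and reading sys.stdin. Whether free items should be counted is a judgment call. They are included here because cached free items still occupy wired memory.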