From: Eugene Grosbein
To: Mark Johnston
Cc: FreeBSD stable
Subject: Re: 11.2-STABLE kernel wired memory leak
Date: Wed, 13 Feb 2019 19:04:14 +0700
Message-ID: <9a263536-45e3-c690-e45e-d8ece7d1f388@grosbein.net>
In-Reply-To: <20190212163446.GA29847@raichu>

12.02.2019 23:34, Mark Johnston wrote:

> On Tue, Feb 12, 2019 at 11:14:31PM +0700, Eugene Grosbein wrote:
>> Hi!
>>
>> Long story short: 11.2-STABLE/amd64 r335757 leaked over 4600MB kernel wired memory over 81 days uptime
>> out of 8GB total RAM.
>>
>> Details follow.
>>
>> I have a workstation running Xorg, Firefox, Thunderbird, LibreOffice and occasionally VirtualBox for single VM.
>>
>> It has two identical 320GB HDDs combined into a single graid-based array with "Intel"
>> on-disk format having 3 volumes:
>> - one "RAID1" volume /dev/raid/r0 occupies first 10GB of each HDD;
>> - two "SINGLE" volumes /dev/raid/r1 and /dev/raid/r2 that utilize "tails" of HDDs (310GB each).
>>
>> /dev/raid/r0 (10GB) has MBR partitioning and two slices:
>> - /dev/raid/r0s1 (8GB) is used for swap;
>> - /dev/raid/r0s2 (2GB) is used by non-redundant ZFS pool named "os" that contains only
>> root file system (177M used) and /usr file system (340M used).
>>
>> There is also second pool (ZMIRROR) named "z" built directly on top of /dev/raid/r[12] volumes,
>> this pool contains all other file systems including /var, /home, /usr/ports, /usr/local, /usr/{src|obj} etc.
>>
>> # zpool list
>> NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
>> os    1,98G   520M  1,48G        -         -    55%    25%  1.00x  ONLINE  -
>> z      288G  79,5G   209G        -         -    34%    27%  1.00x  ONLINE  -
>>
>> This way I have swap outside of ZFS, boot blocks and partitioning mirrored by means of GEOM_RAID and
>> can use local console to break to single user mode to unmount all file systems other than root and /usr
>> and can even export bigger ZFS pool "z". And I did that to see that ARC usage
>> (limited with vfs.zfs.arc_max="3G" in /boot/loader.conf) dropped from over 2500MB
>> down to 44MB but "Wired" stays high. Now after I imported "z" back and booted to multiuser mode
>> top(1) shows:
>>
>> last pid: 51242;  load averages:  0.24,  0.16,  0.13    up 81+02:38:38  22:59:18
>> 104 processes: 1 running, 103 sleeping
>> CPU:  0.0% user,  0.0% nice,  0.4% system,  0.2% interrupt, 99.4% idle
>> Mem: 84M Active, 550M Inact, 4K Laundry, 4689M Wired, 2595M Free
>> ARC: 273M Total, 86M MFU, 172M MRU, 64K Anon, 1817K Header, 12M Other
>>      117M Compressed, 333M Uncompressed, 2.83:1 Ratio
>> Swap: 8192M Total, 940K Used, 8191M Free
>>
>> I have KDB and DDB in my custom kernel also. How do I debug the leak further?
>>
>> I use nvidia-driver-340-340.107 driver for GK208 [GeForce GT 710B] video card.
>> Here are outputs of "vmstat -m": http://www.grosbein.net/freebsd/leak/vmstat-m.txt
>> and "vmstat -z": http://www.grosbein.net/freebsd/leak/vmstat-z.txt
>
> I suspect that the "leaked" memory is simply being used to cache UMA
> items. Note that the values in the FREE column of vmstat -z output are
> quite large. The cached items are reclaimed only when the page daemon
> wakes up to reclaim memory; if there are no memory shortages, large
> amounts of memory may accumulate in UMA caches. In this case, the sum
> of the product of columns 2 and 5 gives a total of roughly 4GB cached.

After another day with a mostly idle system, "Wired" increased to more than 6GB out of 8GB total.
I've tried increasing sysctl vm.v_free_min from the default 12838 (50MB) up to 1048576 (4GB),
and "Wired" dropped a bit, but it is still huge at 5060M:

last pid: 61619;  load averages:  1.05,  0.78,  0.40    up 81+22:33:09  18:53:49
119 processes: 1 running, 118 sleeping
CPU:  0.0% user,  0.0% nice, 50.0% system,  0.0% interrupt, 50.0% idle
Mem: 47M Active, 731M Inact, 4K Laundry, 5060M Wired, 2080M Free
ARC: 3049M Total, 216M MFU, 2428M MRU, 64K Anon, 80M Header, 325M Other
     2341M Compressed, 5874M Uncompressed, 2.51:1 Ratio
Swap: 8192M Total, 940K Used, 8191M Free

# sysctl vm.v_free_min
vm.v_free_min: 1048576
# sysctl vm.stats.vm.v_free_count
vm.stats.vm.v_free_count: 533232

ARC probably cached the results of nightly periodic jobs traversing file system trees and hit its limit (3G).
I still cannot understand: where has the other 2G (5G-3G) of wired memory leaked to?
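
As a cross-check on the figures above: the vm.v_free_min and vm.stats.vm.v_free_count
sysctls report page counts, not bytes. The following is only a sketch that converts them
to MiB. It assumes 4096-byte pages (the amd64 base page size; the thread itself does not
state it), and the helper name pages_to_mib is made up for illustration.

```python
# Assumption: 4096-byte pages (amd64 base page size); not stated in the thread.
PAGE_SIZE = 4096


def pages_to_mib(pages: int, page_size: int = PAGE_SIZE) -> float:
    """Convert a page count, as reported by the vm.* sysctls, to MiB."""
    return pages * page_size / (1024 * 1024)


if __name__ == "__main__":
    # Page counts copied from the sysctl output quoted in this message.
    for name, pages in [
        ("vm.v_free_min (default)", 12838),
        ("vm.v_free_min (raised)", 1048576),
        ("vm.stats.vm.v_free_count", 533232),
    ]:
        print(f"{name}: {pages} pages = {pages_to_mib(pages):.0f} MiB")
```

Under that assumption, 12838 pages come to about 50 MiB and 1048576 pages to 4096 MiB,
while the free count of 533232 pages is about 2083 MiB, which is in line with the
"2080M Free" that top(1) reports.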

USED:

# vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n", $2*$4/1024/1024, $1}' | sort -k1,1 -rn | head
    2763,2 abd_chunk
   196,547 zio_buf_16384
   183,711 dnode_t
   128,304 zio_buf_512
   96,3062 VNODE
   79,0076 arc_buf_hdr_t_full
      66,5 zio_data_buf_131072
   63,0772 UMA Slabs
   61,6484 256
   61,2564 dmu_buf_impl_t

FREE:

# vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, $1}' | sort -k1,1 -rn | head
   245,301 dnode_t
   209,086 zio_buf_512
   110,163 dmu_buf_impl_t
   31,2598 64
    21,656 256
   10,9262 swblk
   10,6295 128
    9,0379 RADIX NODE
   8,54521 L VFS Cache
    7,4917 512
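
The two listings above show only the top ten zones each. To compare against "Wired" it may
help to sum SIZE*USED and SIZE*FREE over every zone, which is the "sum of the product of
columns 2 and 5" calculation Mark describes. Below is a rough sketch of that, not a tested
tool: the function names are invented here, the column order (SIZE, LIMIT, USED, FREE) is
taken from the awk commands above, and SAMPLE is a made-up two-zone input rather than data
from this machine.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical sample in "vmstat -z" layout; NOT real data from this system.
SAMPLE = """\
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP

zone_a:                 512,      0,    2048,    1024,   10000,   0,   0
zone_b:                4096,      0,     256,     512,    5000,   0,   0
"""


def parse_vmstat_z(lines: Iterable[str]) -> Dict[str, Tuple[int, int, int]]:
    """Map zone name -> (item size in bytes, USED count, FREE count)."""
    zones: Dict[str, Tuple[int, int, int]] = {}
    for line in lines:
        # Split on the last colon so a colon inside a zone name would not matter.
        name, sep, rest = line.rpartition(":")
        if not sep:
            continue  # header or blank line
        fields = [f.strip() for f in rest.split(",")]
        try:
            size, _limit, n_used, n_free = (int(f) for f in fields[:4])
        except ValueError:
            continue  # not a zone line in the expected layout
        zones[name.strip()] = (size, n_used, n_free)
    return zones


def totals_mib(zones: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float]:
    """Return (MiB held by allocated items, MiB held by cached free items)."""
    used_bytes = sum(size * n_used for size, n_used, _ in zones.values())
    free_bytes = sum(size * n_free for size, _, n_free in zones.values())
    mib = 1024 * 1024
    return used_bytes / mib, free_bytes / mib


if __name__ == "__main__":
    used, free = totals_mib(parse_vmstat_z(SAMPLE.splitlines()))
    print(f"USED: {used:.1f} MiB, FREE (cached): {free:.1f} MiB")
```

On a live system the input lines could instead come from
subprocess.run(["vmstat", "-z"], capture_output=True, text=True).stdout.splitlines(),
and the two totals can then be set against the "Wired" and "ARC" figures from top(1).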