From: Allan Jude
Date: Tue, 04 Nov 2014 12:22:29 -0500
To: freebsd-current@freebsd.org, gibbs@freebsd.org, George Kola
Subject: Re: r273165. ZFS ARC: possible memory leak to Inact
Message-ID: <54590B55.3040206@freebsd.org>
In-Reply-To: <1415107358607-5962421.post@n5.nabble.com>
References: <1415098949.596412362.8vxee7kf@frv41.fwdcdn.com> <5458CCB6.7020602@multiplay.co.uk> <1415107358607-5962421.post@n5.nabble.com>

On 11/04/2014 08:22, Dmitriy Makarov wrote:
> ITEM: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP
>
> UMA Kegs: 384, 0, 210, 10, 216, 0, 0
> UMA Zones: 2176, 0, 210, 0, 216, 0, 0
> UMA Slabs: 80, 0, 2921231, 1024519, 133906002, 0, 0
> UMA RCntSlabs: 88, 0, 8442, 1863, 771451, 0, 0
> UMA Hash: 256, 0, 2, 28, 79, 0, 0
> 4 Bucket: 32, 0, 5698, 16052, 424047094, 0, 0
> 6 Bucket: 48, 0, 220, 8993, 77454827, 0, 0
> 8 Bucket: 64, 0, 260, 6808, 56285069, 15, 0
> 12 Bucket: 96, 0, 302, 2568, 42712743, 192, 0
> 16 Bucket: 128, 0, 1445, 1903, 86971183, 0, 0
> 32 Bucket: 256, 0, 610, 2870, 96758244, 215, 0
> 64 Bucket: 512, 0, 1611, 1117, 55896361, 77166469, 0
> 128 Bucket: 1024, 0, 413, 635, 99338830, 104451029, 0
> 256 Bucket: 2048, 0, 1100, 222, 164776092, 24917372, 0
> vmem btag: 56, 0, 1889493, 502639, 30117503, 16948, 0
> VM OBJECT: 256, 0, 970434, 174126, 1080667061, 0, 0
> RADIX NODE: 144, 0, 2792188, 882809, 1489929489, 0, 0
> MAP: 240, 0, 3, 61, 3, 0, 0
> KMAP ENTRY: 128, 0, 13, 173, 37, 0, 0
> MAP ENTRY: 128, 0, 82182, 11624, 3990141990, 0, 0
> VMSPACE: 496, 0, 615, 761, 41838231, 0, 0
> fakepg: 104, 0, 0, 0, 0, 0, 0
> mt_zone: 16400, 0, 261, 0, 267, 0, 0
> 16: 16, 0, 3650397, 6166213, 6132198534, 0, 0
> 32: 32, 0, 1118176, 259824, 9115561085, 0, 0
> 64: 64, 0, 14496058, 14945820, 11266627738, 0, 0
> 128: 128, 0, 1337428, 319398, 15463968444, 0, 0
> 256: 256, 0, 1103937, 258183, 8392009677, 0, 0
> 512: 512, 0, 1714, 470, 7174436957, 0, 0
> 1024: 1024, 0, 29033, 347, 131133987, 0, 0
> 2048: 2048, 0, 869, 275, 1001770010, 0, 0
> 4096: 4096, 0, 730319, 3013, 332721996, 0, 0
> 8192: 8192, 0, 47, 11, 487154, 0, 0
> 16384: 16384, 0, 65, 5, 1788, 0, 0
> 32768: 32768, 0, 54, 13, 103482, 0, 0
> 65536: 65536, 0, 627, 8, 8172809, 0, 0
> SLEEPQUEUE: 80, 0, 1954, 1053, 2812, 0, 0
> 64 pcpu: 8, 0, 558, 594, 793, 0, 0
> Files: 80, 0, 16221, 2579, 1549799224, 0, 0
> TURNSTILE: 136, 0, 1954, 506, 2812, 0, 0
> rl_entry: 40, 0, 1114, 2186, 1114, 0, 0
> umtx pi: 96, 0, 0, 0, 0, 0, 0
> MAC labels: 40, 0, 0, 0, 0, 0, 0
> PROC: 1208, 0, 635, 514, 41838196, 0, 0
> THREAD: 1168, 0, 1840, 113, 12778, 0, 0
> cpuset: 96, 0, 705, 361, 1490, 0, 0
> audit_record: 1248, 0, 0, 0, 0, 0, 0
> sendfile_sync: 128, 0, 0, 0, 0, 0, 0
> mbuf_packet: 256, 46137345, 8199, 5074, 15123806588, 0, 0
> mbuf: 256, 46137345, 25761, 13076, 21621129305, 0, 0
> mbuf_cluster: 2048, 7208960, 13273, 315, 2905465, 0, 0
> mbuf_jumbo_page: 4096, 3604480, 786, 862, 628074105, 0, 0
> mbuf_jumbo_9k: 9216, 1067994, 0, 0, 0, 0, 0
> mbuf_jumbo_16k: 16384, 600746, 0, 0, 0, 0, 0
> mbuf_ext_refcnt: 4, 0, 0, 0, 0, 0, 0
> g_bio: 248, 0, 36, 2348, 2894002696, 0, 0
> DMAR_MAP_ENTRY: 120, 0, 0, 0, 0, 0, 0
> ttyinq: 160, 0, 180, 195, 4560, 0, 0
> ttyoutq: 256, 0, 95, 190, 2364, 0, 0
> FPU_save_area: 832, 0, 0, 0, 0, 0, 0
> taskq_zone: 48, 0, 0, 4814, 108670448, 0, 0
> VNODE: 472, 0, 1115838, 293402, 379118791, 0, 0
> VNODEPOLL: 112, 0, 0, 0, 12, 0, 0
> BUF TRIE: 144, 0, 96, 105852, 5530345, 0, 0
> S VFS Cache: 108, 0, 995997, 161558, 523325155, 0, 0
> STS VFS Cache: 148, 0, 0, 0, 0, 0, 0
> L VFS Cache: 328, 0, 25, 443, 39533826, 0, 0
> LTS VFS Cache: 368, 0, 0, 0, 0, 0, 0
> NAMEI: 1024, 0, 4, 208, 3917615385, 0, 0
> range_seg_cache: 64, 0, 2036778, 121876, 1194538610, 0, 0
> zio_cache: 920, 0, 65, 15323, 15366038685, 0, 0
> zio_link_cache: 48, 0, 30, 16321, 12086373533, 0, 0
> zio_buf_512: 512, 0, 2713231, 2767361, 807591166, 0, 0
> zio_data_buf_512: 512, 0, 481, 655, 196012401, 0, 0
> zio_buf_1024: 1024, 0, 7131, 1893, 34360002, 0, 0
> zio_data_buf_1024: 1024, 0, 449, 335, 13698525, 0, 0
> zio_buf_1536: 1536, 0, 2478, 560, 21617894, 0, 0
> zio_data_buf_1536: 1536, 0, 821, 433, 17033305, 0, 0
> zio_buf_2048: 2048, 0, 1867, 373, 24528179, 0, 0
> zio_data_buf_2048: 2048, 0, 710, 348, 18500686, 0, 0
> zio_buf_2560: 2560, 0, 1362, 38, 13483571, 0, 0
> zio_data_buf_2560: 2560, 0, 946, 47, 12074257, 0, 0
> zio_buf_3072: 3072, 0, 978, 43, 20528564, 0, 0
> zio_data_buf_3072: 3072, 0, 716, 57, 10665806, 0, 0
> zio_buf_3584: 3584, 0, 768, 23, 15883624, 0, 0
> zio_data_buf_3584: 3584, 0, 867, 7, 9497134, 0, 0
> zio_buf_4096: 4096, 0, 9982, 772, 154583770, 0, 0
> zio_data_buf_4096: 4096, 0, 851, 12, 8770997, 0, 0
> zio_buf_5120: 5120, 0, 904, 24, 15481475, 0, 0
> zio_data_buf_5120: 5120, 0, 1615, 19, 22450665, 0, 0
> zio_buf_6144: 6144, 0, 715, 23, 18561260, 0, 0
> zio_data_buf_6144: 6144, 0, 1536, 1, 12377616, 0, 0
> zio_buf_7168: 7168, 0, 600, 25, 22583123, 0, 0
> zio_data_buf_7168: 7168, 0, 1789, 62, 10888039, 0, 0
> zio_buf_8192: 8192, 0, 527, 28, 21084452, 0, 0
> zio_data_buf_8192: 8192, 0, 1123, 35, 11257788, 0, 0
> zio_buf_10240: 10240, 0, 891, 40, 23445358, 0, 0
> zio_data_buf_10240: 10240, 0, 2757, 10, 31594664, 0, 0
> zio_buf_12288: 12288, 0, 793, 44, 32778601, 0, 0
> zio_data_buf_12288: 12288, 0, 2983, 19, 33810459, 0, 0
> zio_buf_14336: 14336, 0, 680, 22, 22955621, 0, 0
> zio_data_buf_14336: 14336, 0, 2837, 7, 31231322, 0, 0
> zio_buf_16384: 16384, 0, 1174235, 5515, 423668480, 0, 0
> zio_data_buf_16384: 16384, 0, 12197, 2, 23870379, 0, 0
> zio_buf_20480: 20480, 0, 1234, 42, 28438855, 0, 0
> zio_data_buf_20480: 20480, 0, 3349, 10, 39049709, 0, 0
> zio_buf_24576: 24576, 0, 1039, 35, 23663028, 0, 0
> zio_data_buf_24576: 24576, 0, 2515, 12, 32477737, 0, 0
> zio_buf_28672: 28672, 0, 872, 47, 17630224, 0, 0
> zio_data_buf_28672: 28672, 0, 1746, 11, 24870056, 0, 0
> zio_buf_32768: 32768, 0, 847, 29, 18368605, 0, 0
> zio_data_buf_32768: 32768, 0, 1637, 11, 20784299, 0, 0
> zio_buf_36864: 36864, 0, 797, 22, 16120701, 0, 0
> zio_data_buf_36864: 36864, 0, 2136, 65, 19999849, 0, 0
> zio_buf_40960: 40960, 0, 707, 40, 14881217, 0, 0
> zio_data_buf_40960: 40960, 0, 1242, 66, 18085181, 0, 0
> zio_buf_45056: 45056, 0, 718, 43, 13708380, 0, 0
> zio_data_buf_45056: 45056, 0, 993, 41, 13875971, 0, 0
> zio_buf_49152: 49152, 0, 569, 43, 15518175, 0, 0
> zio_data_buf_49152: 49152, 0, 929, 32, 12006369, 0, 0
> zio_buf_53248: 53248, 0, 594, 25, 14752074, 0, 0
> zio_data_buf_53248: 53248, 0, 889, 30, 11159838, 0, 0
> zio_buf_57344: 57344, 0, 536, 46, 16314266, 0, 0
> zio_data_buf_57344: 57344, 0, 1105, 12, 10210025, 0, 0
> zio_buf_61440: 61440, 0, 527, 43, 14355397, 0, 0
> zio_data_buf_61440: 61440, 0, 738, 10, 9080556, 0, 0
> zio_buf_65536: 65536, 0, 447, 44, 13264282, 0, 0
> zio_data_buf_65536: 65536, 0, 723, 16, 8855438, 0, 0
> zio_buf_69632: 69632, 0, 434, 35, 10357799, 0, 0
> zio_data_buf_69632: 69632, 0, 675, 44, 8017072, 0, 0
> zio_buf_73728: 73728, 0, 441, 24, 9784965, 0, 0
> zio_data_buf_73728: 73728, 0, 650, 35, 7370868, 0, 0
> zio_buf_77824: 77824, 0, 448, 26, 9643063, 0, 0
> zio_data_buf_77824: 77824, 0, 802, 34, 7733636, 0, 0
> zio_buf_81920: 81920, 0, 393, 48, 8958739, 0, 0
> zio_data_buf_81920: 81920, 0, 671, 10, 6437432, 0, 0
> zio_buf_86016: 86016, 0, 397, 24, 8406339, 0, 0
> zio_data_buf_86016: 86016, 0, 458, 14, 5752942, 0, 0
> zio_buf_90112: 90112, 0, 337, 19, 9427445, 0, 0
> zio_data_buf_90112: 90112, 0, 629, 14, 6209404, 0, 0
> zio_buf_94208: 94208, 0, 342, 18, 9703869, 0, 0
> zio_data_buf_94208: 94208, 0, 471, 32, 5147136, 0, 0
> zio_buf_98304: 98304, 0, 335, 22, 11366122, 0, 0
> zio_data_buf_98304: 98304, 0, 813, 13, 5071769, 0, 0
> zio_buf_102400: 102400, 0, 318, 35, 10730116, 0, 0
> zio_data_buf_102400: 102400, 0, 494, 15, 5120409, 0, 0
> zio_buf_106496: 106496, 0, 295, 25, 11494927, 0, 0
> zio_data_buf_106496: 106496, 0, 441, 12, 4628043, 0, 0
> zio_buf_110592: 110592, 0, 277, 36, 12261799, 0, 0
> zio_data_buf_110592: 110592, 0, 996, 8, 4655911, 0, 0
> zio_buf_114688: 114688, 0, 248, 28, 13187629, 0, 0
> zio_data_buf_114688: 114688, 0, 367, 26, 4356168, 0, 0
> zio_buf_118784: 118784, 0, 248, 25, 11526765, 0, 0
> zio_data_buf_118784: 118784, 0, 457, 16, 3997133, 0, 0
> zio_buf_122880: 122880, 0, 221, 18, 13138310, 0, 0
> zio_data_buf_122880: 122880, 0, 440, 16, 4127363, 0, 0
> zio_buf_126976: 126976, 0, 225, 22, 21080594, 0, 0
> zio_data_buf_126976: 126976, 0, 332, 23, 3611080, 0, 0
> zio_buf_131072: 131072, 0, 236, 768, 260386880, 0, 0
> zio_data_buf_131072: 131072, 0, 235926, 17, 201706301, 0, 0
> lz4_ctx: 16384, 0, 0, 22, 870339248, 0, 0
> sa_cache: 80, 0, 1114682, 301918, 377799679, 0, 0
> dnode_t: 752, 0, 4591384, 1276221, 343600652, 0, 0
> dmu_buf_impl_t: 232, 0, 4193283, 4522906, 1613603616, 0, 0
> arc_buf_hdr_t: 216, 0, 3636990, 1135188, 1255686550, 0, 0
> arc_buf_t: 72, 0, 1517802, 983818, 1342208723, 0, 0
> zil_lwb_cache: 192, 0, 59, 1301, 28828585, 0, 0
> zfs_znode_cache: 368, 0, 1114682, 297778, 377799679, 0, 0
> procdesc: 128, 0, 0, 0, 3, 0, 0
> pipe: 744, 0, 8, 197, 30953268, 0, 0
> Mountpoints: 816, 0, 13, 82, 13, 0, 0
> ksiginfo: 112, 0, 1138, 2362, 1449794, 0, 0
> itimer: 352, 0, 0, 264, 4107, 0, 0
> pf mtags: 40, 0, 0, 0, 0, 0, 0
> pf states: 296, 500006, 275, 427, 2506195, 0, 0
> pf state keys: 88, 0, 378, 1602, 2878928, 0, 0
> pf source nodes: 136, 500018, 0, 0, 0, 0, 0
> pf table entries: 160, 200000, 17, 33, 34, 0, 0
> pf table counters: 64, 0, 0, 0, 0, 0, 0
> pf frags: 80, 0, 0, 0, 0, 0, 0
> pf frag entries: 32, 40000, 0, 0, 0, 0, 0
> pf state scrubs: 40, 0, 0, 0, 0, 0, 0
> KNOTE: 128, 0, 13343, 1568, 2119230288, 0, 0
> socket: 728, 4192760, 31124, 1581, 260689101, 0, 0
> ipq: 56, 225283, 0, 0, 0, 0, 0
> udp_inpcb: 400, 4192760, 46, 484, 18539506, 0, 0
> udpcb: 24, 4192869, 46, 4296, 18539506, 0, 0
> tcp_inpcb: 400, 4192760, 42550, 1050, 241905139, 0, 0
> tcpcb: 1032, 4192761, 14734, 830, 241905139, 0, 0
> tcptw: 80, 27800, 27800, 0, 100020089, 89206796, 0
> syncache: 168, 15364, 0, 805, 137341445, 0, 0
> hostcache: 136, 15370, 57, 233, 759, 0, 0
> sackhole: 32, 0, 0, 3125, 19180, 0, 0
> sctp_ep: 1400, 4192760, 0, 0, 0, 0, 0
> sctp_asoc: 2408, 40000, 0, 0, 0, 0, 0
> sctp_laddr: 48, 80012, 0, 0, 3, 0, 0
> sctp_raddr: 720, 80000, 0, 0, 0, 0, 0
> sctp_chunk: 136, 400026, 0, 0, 0, 0, 0
> sctp_readq: 104, 400026, 0, 0, 0, 0, 0
> sctp_stream_msg_out: 104, 400026, 0, 0, 0, 0, 0
> sctp_asconf: 40, 400000, 0, 0, 0, 0, 0
> sctp_asconf_ack: 48, 400060, 0, 0, 0, 0, 0
> udplite_inpcb: 400, 4192760, 0, 0, 0, 0, 0
> ripcb: 400, 4192760, 0, 60, 6, 0, 0
> unpcb: 240, 4192768, 1166, 1074, 244448, 0, 0
> rtentry: 200, 0, 8, 92, 8, 0, 0
> selfd: 56, 0, 2339, 3270, 6167642044, 0, 0
> SWAPMETA: 288, 16336788, 0, 0, 0, 0, 0
> FFS inode: 168, 0, 1032, 1084, 1308978, 0, 0
> FFS1 dinode: 128, 0, 0, 0, 0, 0, 0
> FFS2 dinode: 256, 0, 1032, 1098, 1308978, 0, 0
> NCLNODE: 528, 0, 0, 0, 0, 0, 0
>
> These are the statistics after the script helped to reclaim memory.
>
> Here are the top statistics:
>
> Mem: 19G Active, 20G Inact, 81G Wired, 59M Cache, 3308M Buf, 4918M Free
> ARC: 66G Total, 6926M MFU, 54G MRU, 8069K Anon, 899M Header, 5129M Other
>
> Steven Hartland wrote
>> This is likely spikes in UMA zones used by ARC.
>>
>> The VM doesn't ever clean UMA zones unless it hits a low memory
>> condition, which explains why your little script helps.
>>
>> Check the output of vmstat -z to confirm.
>>
>> On 04/11/2014 11:47, Dmitriy Makarov wrote:
>>> Hi Current,
>>>
>>> It seems like there is a constant flow (leak) of memory from ARC to Inact
>>> in FreeBSD 11.0-CURRENT #0 r273165.
>>>
>>> Normally, our system (FreeBSD 11.0-CURRENT #5 r260625) keeps ARC size
>>> very close to vfs.zfs.arc_max:
>>>
>>> Mem: 16G Active, 324M Inact, 105G Wired, 1612M Cache, 3308M Buf, 1094M Free
>>> ARC: 88G Total, 2100M MFU, 78G MRU, 39M Anon, 2283M Header, 6162M Other
>>>
>>> But after an upgrade to FreeBSD 11.0-CURRENT #0 r273165 we observe
>>> an enormous amount of Inact memory in top:
>>>
>>> Mem: 21G Active, 45G Inact, 56G Wired, 357M Cache, 3308M Buf, 1654M Free
>>> ARC: 42G Total, 6025M MFU, 30G MRU, 30M Anon, 819M Header, 5214M Other
>>>
>>> Funny thing is that when we manually allocate and release memory, using
>>> a simple python script:
>>>
>>> #!/usr/local/bin/python2.7
>>>
>>> import sys
>>> import time
>>>
>>> if len(sys.argv) != 2:
>>>     print "usage: fillmem "
>>>     sys.exit()
>>>
>>> count = int(sys.argv[1])
>>>
>>> megabyte = (0,) * (1024 * 1024 / 8)
>>>
>>> data = megabyte * count
>>>
>>> as:
>>>
>>> # ./simple_script 10000
>>>
>>> all those allocated megabytes 'migrate' from Inact to Free, and afterwards
>>> they are 'eaten' by ARC with no problem.
>>> Until Inact slowly grows back to the number it was before we ran the
>>> script.
>>>
>>> Current workaround is to periodically invoke this python script by cron.
>>> This is an ugly workaround and we really don't like it on our production
>>>
>>>
>>> To answer possible questions about ARC efficiency:
>>> Cache efficiency drops dramatically with every GiB pushed off the ARC.
>>>
>>> Before upgrade:
>>> Cache Hit Ratio: 99.38%
>>>
>>> After upgrade:
>>> Cache Hit Ratio: 81.95%
>>>
>>> We believe that ARC misbehaves and we ask your assistance.
>>>
>>>
>>> ----------------------------------
>>>
>>> Some values from configs.
>>>
>>> HW: 128GB RAM, LSI HBA controller with 36 disks (stripe of mirrors).
>>>
>>> top output:
>>>
>>> In /boot/loader.conf :
>>> vm.kmem_size="110G"
>>> vfs.zfs.arc_max="90G"
>>> vfs.zfs.arc_min="42G"
>>> vfs.zfs.txg.timeout="10"
>>>
>>> -----------------------------------
>>>
>>> Thanks.
>>>
>>> Regards,
>>> Dmitriy
>>> _______________________________________________
>>> freebsd-current@ mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>>> To unsubscribe, send any mail to "freebsd-current-unsubscribe@"
>
> --
> View this message in context: http://freebsd.1045724.n5.nabble.com/r273165-ZFS-ARC-possible-memory-leak-to-Inact-tp5962410p5962421.html
> Sent from the freebsd-current mailing list archive at Nabble.com.
>

Justin Gibbs and I have been helping George from Voxer look into the same
issue on their systems. They had ~169GB in Inact and only ~60GB in use by
ARC.

Are there any further debugging steps we can recommend to him to help
investigate this?

-- 
Allan Jude
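
For the vmstat -z check suggested earlier in the thread, a rough way to see which UMA zones are holding the most idle memory is to multiply SIZE by FREE for each zone and sort the result. The following is only a sketch in Python 3: the function names (parse_vmstat_z, top_idle) and the Zone type are made up for illustration, the sample rows are copied from the output quoted above, and SIZE * FREE is an approximation that ignores slab overhead and per-CPU caches.

```python
import sys
from typing import List, NamedTuple


class Zone(NamedTuple):
    """One row of `vmstat -z` output (only the columns used here)."""
    name: str
    size: int   # item size in bytes
    used: int   # items currently allocated
    free: int   # items cached in the zone but not in use

    @property
    def free_bytes(self) -> int:
        # Approximation: ignores slab overhead and per-CPU bucket caches.
        return self.size * self.free


def parse_vmstat_z(text: str) -> List[Zone]:
    """Parse `vmstat -z` text, tolerating leading '>' mail quote markers."""
    zones = []
    for raw in text.splitlines():
        line = raw.lstrip("> \t")
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # blank line or text without a zone name
        fields = [f.strip() for f in rest.split(",")]
        if len(fields) < 4:
            continue
        try:
            size, _limit, used, free = (int(f) for f in fields[:4])
        except ValueError:
            continue  # header row or prose that happens to contain a colon
        zones.append(Zone(name.strip(), size, used, free))
    return zones


def top_idle(zones: List[Zone], n: int = 10) -> List[Zone]:
    """Return the n zones with the most bytes sitting in free items."""
    return sorted(zones, key=lambda z: z.free_bytes, reverse=True)[:n]


# A few rows copied from the quoted output, used when no file is given.
SAMPLE = """\
> ITEM: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP
> 64: 64, 0,14496058,14945820,11266627738, 0, 0
> zio_buf_512: 512, 0, 2713231, 2767361,807591166, 0, 0
> dmu_buf_impl_t: 232, 0, 4193283, 4522906,1613603616, 0, 0
> pf state keys: 88, 0, 378, 1602, 2878928, 0, 0
"""

if __name__ == "__main__":
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as fh:
            text = fh.read()
    else:
        text = SAMPLE
    for zone in top_idle(parse_vmstat_z(text)):
        print(f"{zone.name:24s} {zone.free_bytes / 2**20:10.1f} MiB idle")
```

One possible use is to save `vmstat -z` to a file before and after running the fillmem workaround, run the script on each file, and compare which zones shrink.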