Date: Wed, 15 Feb 2012 14:39:42 +0200
From: "Pavlo"
To: "George Kontostanos"
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and mem management

Unfortunately we can't afford to disable prefetch; the overhead is too high.

I also ran some tests. I have a tool that maps a file with mmap() and then writes to, or reads, the first byte of each page of the mapping. The machine has 8 GB of RAM.

Test 1: The tool maps a 1.5 GB file and writes random data to the first byte of each page. Wired memory fills up (with file cache?); the virtual address size of the process is 1.5 GB while its RES is ~20 MB. Without closing the tool, I ask it to write to each page again: now Active memory fills up while Wired stays the same size. Then I ask my second tool, a 'memory eater' (sketched at the end of this message), to allocate 6 GB of memory. It gets 5.8 GB, hangs in a page fault, sleeps there for about 10 seconds and then gets killed for being out of swap.

After the 'memory eater' is killed I see 900 MB still in Active, which matches the (now reduced) RES size of the first tool. I suppose those 900 MB are memory that the system had no time to flush back to the file and free before it killed the 'memory eater', although it did have time to squeeze 600 MB of RAM out of the first tool. But most of the time I see 1.5 GB of Active RAM afterwards. That means that even though there are 1.5 GB of memory that could easily be flushed back to the file, this doesn't happen (it always happens on Linux, for example); the 'memory eater' just hangs in a page fault and later gets killed. Sometimes this happens even after the first tool has finished its job and unmapped the file: the 'frozen' 1.5 GB of Active memory is still there. I should add that this memory is actually reusable if I run the first tool again, i.e. the pages do get reclaimed.
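For reference, here is a minimal sketch of the kind of page-touching tool described above. It is not the original program; the command-line interface, the random write data and the interactive pause are illustrative assumptions. The "write" mode corresponds to Test 1 below, the "read" mode to Test 2.

/*
 * page_touch.c -- minimal sketch of the mmap() test tool described above
 * (illustrative only, not the original program).
 *
 * Maps an existing file with mmap() and touches the first byte of every
 * page, either writing random data ("write" mode) or just reading it
 * ("read" mode).  Build with: cc -o page_touch page_touch.c
 */
#include <sys/mman.h>
#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
	if (argc != 3)
		errx(1, "usage: %s read|write <file>", argv[0]);

	int writemode = (strcmp(argv[1], "write") == 0);
	int fd = open(argv[2], writemode ? O_RDWR : O_RDONLY);
	if (fd == -1)
		err(1, "open %s", argv[2]);

	struct stat st;
	if (fstat(fd, &st) == -1)
		err(1, "fstat");

	size_t len = (size_t)st.st_size;
	int prot = PROT_READ | (writemode ? PROT_WRITE : 0);
	char *p = mmap(NULL, len, prot, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		err(1, "mmap");

	long pagesize = sysconf(_SC_PAGESIZE);
	volatile char sink = 0;

	/* Touch the first byte of every page of the mapping. */
	for (size_t off = 0; off < len; off += (size_t)pagesize) {
		if (writemode)
			p[off] = (char)arc4random();	/* dirty the page */
		else
			sink ^= p[off];			/* just fault it in */
	}

	printf("touched %zu pages (%s); press Enter to unmap and exit\n",
	    len / (size_t)pagesize, writemode ? "write" : "read");
	getchar();	/* keep the mapping alive so RES/Active can be observed */

	munmap(p, len);
	close(fd);
	return (0);
}

Watching RES in top(1) while this runs, and repeating the pass without exiting, is enough to reproduce the behaviour described.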
Test 2: Allowing for the possibility that the FS was simply busy, I try read-only operations through mmap().

Case 1: the tool makes 2 passes over the mapped memory, reading the first byte of each page. After the second pass the RES size becomes almost equal to the virtual address size, i.e. almost every page is resident in RAM. I run the 'memory eater' and ask for 6 GB again. After a short hang in a page fault it gets what I asked for, while the first tool's RES size is dramatically reduced. That's what I wanted.

Case 2: the tool makes 10+ passes over the mapped memory, reading the first byte of each page. When I then run the 'memory eater', sometimes it gets killed as in Test 1, and sometimes it does get the pages.

I can't figure out where to dig. When RAM contains pages that are only being read it should not be a problem to free them, yet sometimes that doesn't happen. Again, even though Linux differs a lot from FreeBSD, it always does the 'right' thing: it flushes the pages and provides the memory. Well, at least I believe that is the right thing.

Thanks.

> 2012/2/15 Pavlo :
>
> Hey George,
>
> thanks for quick response.
>
> No, no dedup is used.
>
> zfs-stats -a :
>
> ------------------------------------------------------------------------
> ZFS Subsystem Report                            Wed Feb 15 12:26:18 2012
> ------------------------------------------------------------------------
>
> System Information:
>
>         Kernel Version:                         802516 (osreldate)
>         Hardware Platform:                      amd64
>         Processor Architecture:                 amd64
>
>         ZFS Storage pool Version:               28
>         ZFS Filesystem Version:                 5
>
> FreeBSD 8.2-STABLE #12: Thu Feb 9 11:35:23 EET 2012 root
> 12:26PM  up 2:29, 7 users, load averages: 0.02, 0.16, 0.16
>
> ------------------------------------------------------------------------
>
> System Memory:
>
>         19.78%  1.53 GiB Active,     0.95%  75.21 MiB Inact
>         36.64%  2.84 GiB Wired,      0.06%   4.83 MiB Cache
>         42.56%  3.30 GiB Free,       0.01%  696.00 KiB Gap
>
>         Real Installed:                         8.00 GiB
>         Real Available:                 99.84%  7.99 GiB
>         Real Managed:                   96.96%  7.74 GiB
>
>         Logical Total:                          8.00 GiB
>         Logical Used:                   57.82%  4.63 GiB
>         Logical Free:                   42.18%  3.37 GiB
>
> Kernel Memory:                                  2.43 GiB
>         Data:                           99.54%  2.42 GiB
>         Text:                           0.46%   11.50 MiB
>
> Kernel Memory Map:                              3.16 GiB
>         Size:                           69.69%  2.20 GiB
>         Free:                           30.31%  979.48 MiB
>
> ------------------------------------------------------------------------
>
> ARC Summary: (THROTTLED)
>         Memory Throttle Count:                  3.82k
>
> ARC Misc:
>         Deleted:                                874.34k
>         Recycle Misses:                         376.12k
>         Mutex Misses:                           4.74k
>         Evict Skips:                            4.74k
>
> ARC Size:                               68.53%  2.34 GiB
>         Target Size: (Adaptive)         68.54%  2.34 GiB
>         Min Size (Hard Limit):          12.50%  437.50 MiB
>         Max Size (High Water):          8:1     3.42 GiB
>
> ARC Size Breakdown:
>         Recently Used Cache Size:       92.95%  2.18 GiB
>         Frequently Used Cache Size:     7.05%   169.01 MiB
>
> ARC Hash Breakdown:
>         Elements Max:                           229.96k
>         Elements Current:               40.05%  92.10k
>         Collisions:                             705.52k
>         Chain Max:                              11
>         Chains:                                 20.64k
>
> ------------------------------------------------------------------------
>
> ARC Efficiency:                                 7.96m
>         Cache Hit Ratio:                84.92%  6.76m
>         Cache Miss Ratio:               15.08%  1.20m
>         Actual Hit Ratio:               76.29%  6.08m
>
>         Data Demand Efficiency:         91.32%  4.99m
>         Data Prefetch Efficiency:       19.57%  134.19k
>
>         CACHE HITS BY CACHE LIST:
>           Anonymously Used:             7.24%   489.41k
>           Most Recently Used:           25.29%  1.71m
>           Most Frequently Used:         64.54%  4.37m
>           Most Recently Used Ghost:     1.42%   95.77k
>           Most Frequently Used Ghost:   1.51%   102.33k
>
>         CACHE HITS BY DATA TYPE:
>           Demand Data:                  67.42%  4.56m
>           Prefetch Data:                0.39%   26.26k
>           Demand Metadata:              22.41%  1.52m
>           Prefetch Metadata:            9.78%   661.25k
>
>         CACHE MISSES BY DATA TYPE:
>           Demand Data:                  36.11%  433.60k
>           Prefetch Data:                8.99%   107.94k
>           Demand Metadata:              32.00%  384.29k
>           Prefetch Metadata:            22.91%  275.09k
>
> ------------------------------------------------------------------------
>
> L2ARC is disabled
>
> ------------------------------------------------------------------------
>
> File-Level Prefetch: (HEALTHY)
>
> DMU Efficiency:                                 26.49m
>         Hit Ratio:                      71.64%  18.98m
>         Miss Ratio:                     28.36%  7.51m
>
>         Colinear:                               7.51m
>           Hit Ratio:                    0.02%   1.42k
>           Miss Ratio:                   99.98%  7.51m
>
>         Stride:                                 18.85m
>           Hit Ratio:                    99.97%  18.85m
>           Miss Ratio:                   0.03%   5.73k
>
> DMU Misc:
>         Reclaim:                                7.51m
>           Successes:                    0.29%   21.58k
>           Failures:                     99.71%  7.49m
>
>         Streams:                                130.46k
>           +Resets:                      0.35%   461
>           -Resets:                      99.65%  130.00k
>           Bogus:                                0
>
> ------------------------------------------------------------------------
>
> VDEV cache is disabled
>
> ------------------------------------------------------------------------
>
> ZFS Tunables (sysctl):
>         kern.maxusers                           384
>         vm.kmem_size                            4718592000
>         vm.kmem_size_scale                      1
>         vm.kmem_size_min                        0
>         vm.kmem_size_max                        329853485875
>         vfs.zfs.l2c_only_size                   0
>         vfs.zfs.mfu_ghost_data_lsize            2705408
>         vfs.zfs.mfu_ghost_metadata_lsize        332861440
>         vfs.zfs.mfu_ghost_size                  335566848
>         vfs.zfs.mfu_data_lsize                  1641984
>         vfs.zfs.mfu_metadata_lsize              3048448
>         vfs.zfs.mfu_size                        28561920
>         vfs.zfs.mru_ghost_data_lsize            68477440
>         vfs.zfs.mru_ghost_metadata_lsize        62875648
>         vfs.zfs.mru_ghost_size                  131353088
>         vfs.zfs.mru_data_lsize                  1651216384
>         vfs.zfs.mru_metadata_lsize              278577152
>         vfs.zfs.mru_size                        2306510848
>         vfs.zfs.anon_data_lsize                 0
>         vfs.zfs.anon_metadata_lsize             0
>         vfs.zfs.anon_size                       12968960
>         vfs.zfs.l2arc_norw                      1
>         vfs.zfs.l2arc_feed_again                1
>         vfs.zfs.l2arc_noprefetch                1
>         vfs.zfs.l2arc_feed_min_ms               200
>         vfs.zfs.l2arc_feed_secs                 1
>         vfs.zfs.l2arc_headroom                  2
>         vfs.zfs.l2arc_write_boost               8388608
>         vfs.zfs.l2arc_write_max                 8388608
>         vfs.zfs.arc_meta_limit                  917504000
>         vfs.zfs.arc_meta_used                   851157616
>         vfs.zfs.arc_min                         458752000
>         vfs.zfs.arc_max                         3670016000
>         vfs.zfs.dedup.prefetch                  1
>         vfs.zfs.mdcomp_disable                  0
>         vfs.zfs.write_limit_override            1048576000
>         vfs.zfs.write_limit_inflated            25728073728
>         vfs.zfs.write_limit_max                 1072003072
>         vfs.zfs.write_limit_min                 33554432
>         vfs.zfs.write_limit_shift               3
>         vfs.zfs.no_write_throttle               0
>         vfs.zfs.zfetch.array_rd_sz              1048576
>         vfs.zfs.zfetch.block_cap                256
>         vfs.zfs.zfetch.min_sec_reap             2
>         vfs.zfs.zfetch.max_streams              8
>         vfs.zfs.prefetch_disable                0
>         vfs.zfs.mg_alloc_failures               8
>         vfs.zfs.check_hostid                    1
>         vfs.zfs.recover                         0
>         vfs.zfs.txg.synctime_ms                 1000
>         vfs.zfs.txg.timeout                     10
>         vfs.zfs.scrub_limit                     10
>         vfs.zfs.vdev.cache.bshift               16
>         vfs.zfs.vdev.cache.size                 0
>         vfs.zfs.vdev.cache.max                  16384
>         vfs.zfs.vdev.write_gap_limit            4096
>         vfs.zfs.vdev.read_gap_limit             32768
>         vfs.zfs.vdev.aggregation_limit          131072
>         vfs.zfs.vdev.ramp_rate                  2
>         vfs.zfs.vdev.time_shift                 6
>         vfs.zfs.vdev.min_pending                4
>         vfs.zfs.vdev.max_pending                10
>         vfs.zfs.vdev.bio_flush_disable          0
>         vfs.zfs.cache_flush_disable             0
>         vfs.zfs.zil_replay_disable              0
>         vfs.zfs.zio.use_uma                     0
>         vfs.zfs.version.zpl                     5
>         vfs.zfs.version.spa                     28
>         vfs.zfs.version.acl                     1
>         vfs.zfs.debug                           0
>         vfs.zfs.super_owner                     0
>
> ------------------------------------------------------------------------
>
> I see that you are limiting your arc.max to 3G but you have prefetch
> enabled. You can try disabling this:
>
> vfs.zfs.prefetch_disable=1
>
> If things turn out better you can increase your arc.max to 4G
>
> Regards
>
> --
> George Kontostanos
> Aicom telecoms ltd
> http://www.aisecure.net
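For reference, here is a minimal sketch of the kind of 'memory eater' used in the tests above. It is not the original tool; the gigabyte-sized argument and the pause before exit are illustrative assumptions.

/*
 * memeater.c -- minimal sketch of the 'memory eater' referred to above
 * (illustrative only, not the original program).
 *
 * Allocates the requested number of gigabytes with malloc() and touches
 * the first byte of every page so the memory is actually backed by RAM,
 * putting pressure on the VM system to reclaim pages from other processes.
 */
#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
	if (argc != 2)
		errx(1, "usage: %s <gigabytes>", argv[0]);

	size_t gb = (size_t)strtoul(argv[1], NULL, 10);
	size_t len = gb << 30;
	long pagesize = sysconf(_SC_PAGESIZE);

	char *p = malloc(len);
	if (p == NULL)
		err(1, "malloc %zu bytes", len);

	/* Fault every page in; this is where the hang/kill is observed. */
	for (size_t off = 0; off < len; off += (size_t)pagesize)
		p[off] = 1;

	printf("allocated and touched %zu GB; press Enter to exit\n", gb);
	getchar();

	free(p);
	return (0);
}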