Date:      Wed, 15 Feb 2012 14:39:42 +0200
From:      "Pavlo" <devgs@ukr.net>
To:        "George Kontostanos" <gkontos.mail@gmail.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS and mem management
Message-ID:  <96280.1329309582.18313701080496209920@ffe5.ukr.net>
In-Reply-To: <CA+dUSyoRyC5O4D8tQQ9iDH2E87P3cxW_Ay8b6FuvzzbORpSYhA@mail.gmail.com>
References:  <15861.1329298812.1414986334451204096@ffe12.ukr.net> <92617.1329301696.6338962447434776576@ffe5.ukr.net> <CA+dUSyqKoqrfD_cgsfupsuZRE0O6dH-4F1roLp_GFaeBKJkN-w@mail.gmail.com> <CA+dUSyoRyC5O4D8tQQ9iDH2E87P3cxW_Ay8b6FuvzzbORpSYhA@mail.gmail.com>




Unfortunately we can't afford to disable prefetch; running without it is
too much of an overhead.

I have also run some tests. I have a process that maps a file with mmap()
and then writes to, or reads, the first byte of each page of the mapped file.

The machine has 8 GB of RAM.
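
Roughly, the mapping tool looks like the sketch below (simplified, not the
real code; the file path and the read/write switch are just illustrative).
The same tool is used in write mode for Test 1 and in read-only mode for
Test 2:

    /* Simplified sketch of the mapping tool (not the real code).
     * Maps the file given in argv[1] and touches the first byte of
     * every page: with "w" as argv[2] it writes random data (Test 1),
     * otherwise it only reads (Test 2). */
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <err.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    int
    main(int argc, char **argv)
    {
            int write_mode = (argc > 2 && argv[2][0] == 'w');
            long pagesz = sysconf(_SC_PAGESIZE);
            struct stat st;
            int fd;
            char *p;
            volatile char sink = 0;

            if (argc < 2)
                    errx(1, "usage: maptool file [w]");
            if ((fd = open(argv[1], write_mode ? O_RDWR : O_RDONLY)) < 0 ||
                fstat(fd, &st) != 0)
                    err(1, "%s", argv[1]);
            p = mmap(NULL, st.st_size,
                write_mode ? PROT_READ | PROT_WRITE : PROT_READ,
                MAP_SHARED, fd, 0);
            if (p == MAP_FAILED)
                    err(1, "mmap");

            for (off_t off = 0; off < st.st_size; off += pagesz) {
                    if (write_mode)
                            p[off] = (char)arc4random(); /* dirty the page */
                    else
                            sink += p[off];              /* just fault it in */
            }
            (void)sink;
            pause();    /* keep the mapping so RES/Active can be watched */
            return (0);
    }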

Test 1:
The tool maps a 1.5 GB file and writes some random data to the first byte
of each page. Wired memory gets filled (with file cache?); the virtual
address size of the process is 1.5 GB while its RES is ~20 MB. Without
closing this tool, I ask it to write data to each page again: now Active
memory gets filled while Wired stays the same size. Then I ask my next tool
to allocate 6 GB of memory. It gets 5.8 GB, hangs in page fault, sleeps
there for about 10 seconds and gets killed as out of swap. After the
'memory eater' is killed I still see 900 MB of memory in Active, and that
matches the (now reduced) RES size of the first tool.

I suppose those 900 MB are memory that there was no time to flush back to
the file to free up pages before the allocator killed the 'memory eater';
it did, however, have time to squeeze 600 MB of RAM out of the first tool.
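
For completeness, the 'memory eater' is essentially the sketch below (the
6 GB figure is hard-coded only because that is the value used in these
tests):

    /* Sketch of the 'memory eater'. Allocates 6 GB and touches one
     * byte per page so every page must actually be backed by RAM or
     * swap. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int
    main(void)
    {
            size_t want = (size_t)6 << 30;          /* 6 GB */
            long pagesz = sysconf(_SC_PAGESIZE);
            char *p = malloc(want);

            if (p == NULL) {
                    perror("malloc");
                    return (1);
            }
            for (size_t off = 0; off < want; off += (size_t)pagesz)
                    p[off] = 1;             /* one page fault per page */
            printf("touched %zu bytes\n", want);
            pause();                        /* hold on to the memory */
            return (0);
    }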

But most of the time I see 1.5 GB of Active RAM afterwards. That means that
even though we have 1.5 GB of memory that could easily be flushed back to
the file, it doesn't happen (it always happens on Linux, for example); the
'memory eater' just hangs in page fault and later gets killed. Sometimes
this happens even after the first tool has finished its job and unmapped
the file: the 'frozen' 1.5 GB of Active memory is still there. I should add
that it is actually reusable: if I run the first tool again, the pages do
get reclaimed.

Test 2:
To allow for the possibility that the FS is busy, I try only read
operations through mmap().

Case 1: the tool does 2 passes over the mmapped memory, reading the first
byte of each page.

After the second pass the RES size becomes almost equal to the virtual
address size, i.e. almost every page has been mapped into RAM. I use my
'memory eating' tool and ask for 6 GB again. After a short hang in page
fault it gets what I asked for, while the first tool's RES size is
dramatically reduced. That is exactly what I wanted.

Case 2: the tool does 10+ passes over the mmapped memory, reading the first
byte of each page.

When I then run the 'memory eater', sometimes it gets killed as in Test 1
and sometimes some of the pages are given up to it.

I can't figure out where to dig. When RAM contains pages that are only
being read it should not be a problem to free them, yet sometimes that
doesn't happen. Again, even though Linux differs so much from FreeBSD, it
always does the 'right' thing: it flushes the pages and provides the
memory. Well, at least I believe that is the right thing.

Thanks.

2012/2/15 Pavlo <devgs@ukr.net>:
>
> Hey George,
>
> thanks for quick response.
>
> No, no dedup is used.
>
> zfs-stats -a :
>
> ------------------------------------------------------------------------
> ZFS Subsystem Report                Wed Feb 15 12:26:18 2012
>
> ------------------------------------------------------------------------
>
> System Information:
>
>     Kernel Version:                802516 (osreldate)
>     Hardware Platform:            amd64
>     Processor Architecture:            amd64
>
>     ZFS Storage pool Version:        28
>     ZFS Filesystem Version:            5
>
> FreeBSD 8.2-STABLE #12: Thu Feb 9 11:35:23 EET 2012 root
> 12:26PM  up  2:29, 7 users, load averages: 0.02, 0.16, 0.16
>
> ------------------------------------------------------------------------
>
> System Memory:
>
>     19.78%    1.53    GiB Active,    0.95%    75.21    MiB Inact
>     36.64%    2.84    GiB Wired,    0.06%    4.83    MiB Cache
>     42.56%    3.30    GiB Free,    0.01%    696.00    KiB Gap
>
>
>     Real Installed:                8.00    GiB
>     Real Available:            99.84%    7.99    GiB
>     Real Managed:            96.96%    7.74    GiB
>
>     Logical Total:                8.00    GiB
>     Logical Used:            57.82%    4.63    GiB
>     Logical Free:            42.18%    3.37    GiB
>
> Kernel Memory:                    2.43    GiB
>     Data:                99.54%    2.42    GiB
>     Text:                0.46%    11.50    MiB
>
> Kernel Memory Map:                3.16    GiB
>     Size:                69.69%    2.20    GiB
>     Free:                30.31%    979.48    MiB
>
> ------------------------------------------------------------------------
>
> ARC Summary: (THROTTLED)
>     Memory Throttle Count:            3.82k
>
> ARC Misc:
>     Deleted:                874.34k
>     Recycle Misses:                376.12k
>     Mutex Misses:                4.74k
>     Evict Skips:                4.74k
>
> ARC Size:                68.53%    2.34    GiB
>     Target Size: (Adaptive)        68.54%    2.34    GiB
>     Min Size (Hard Limit):        12.50%    437.50    MiB
>     Max Size (High Water):        8:1    3.42    GiB
>
> ARC Size Breakdown:
>     Recently Used Cache Size:    92.95%    2.18    GiB
>     Frequently Used Cache Size:    7.05%    169.01    MiB
>
> ARC Hash Breakdown:
>     Elements Max:                229.96k
>     Elements Current:        40.05%    92.10k
>     Collisions:                705.52k
>     Chain Max:                11
>     Chains:                    20.64k
>
> ------------------------------------------------------------------------
>
> ARC Efficiency:                    7.96m
>     Cache Hit Ratio:        84.92%    6.76m
>     Cache Miss Ratio:        15.08%    1.20m
>     Actual Hit Ratio:        76.29%    6.08m
>
>     Data Demand Efficiency:        91.32%    4.99m
>     Data Prefetch Efficiency:    19.57%    134.19k
>
>     CACHE HITS BY CACHE LIST:
>       Anonymously Used:        7.24%    489.41k
>       Most Recently Used:        25.29%    1.71m
>       Most Frequently Used:        64.54%    4.37m
>       Most Recently Used Ghost:    1.42%    95.77k
>       Most Frequently Used Ghost:    1.51%    102.33k
>
>     CACHE HITS BY DATA TYPE:
>       Demand Data:            67.42%    4.56m
>       Prefetch Data:        0.39%    26.26k
>       Demand Metadata:        22.41%    1.52m
>       Prefetch Metadata:        9.78%    661.25k
>
>     CACHE MISSES BY DATA TYPE:
>       Demand Data:            36.11%    433.60k
>       Prefetch Data:        8.99%    107.94k
>       Demand Metadata:        32.00%    384.29k
>       Prefetch Metadata:        22.91%    275.09k
>
> ------------------------------------------------------------------------
>
> L2ARC is disabled
>
> ------------------------------------------------------------------------
>
> File-Level Prefetch: (HEALTHY)
>
> DMU Efficiency:                    26.49m
>     Hit Ratio:            71.64%    18.98m
>     Miss Ratio:            28.36%    7.51m
>
>     Colinear:                7.51m
>       Hit Ratio:            0.02%    1.42k
>       Miss Ratio:            99.98%    7.51m
>
>     Stride:                    18.85m
>       Hit Ratio:            99.97%    18.85m
>       Miss Ratio:            0.03%    5.73k
>
> DMU Misc:
>     Reclaim:                7.51m
>       Successes:            0.29%    21.58k
>       Failures:            99.71%    7.49m
>
>     Streams:                130.46k
>       +Resets:            0.35%    461
>       -Resets:            99.65%    130.00k
>       Bogus:                0
>
> ------------------------------------------------------------------------
>
> VDEV cache is disabled
>
> ------------------------------------------------------------------------
>
> ZFS Tunables (sysctl):
>     kern.maxusers                           384
>     vm.kmem_size                            4718592000
>     vm.kmem_size_scale                      1
>     vm.kmem_size_min                        0
>     vm.kmem_size_max                        329853485875
>     vfs.zfs.l2c_only_size                   0
>     vfs.zfs.mfu_ghost_data_lsize            2705408
>     vfs.zfs.mfu_ghost_metadata_lsize        332861440
>     vfs.zfs.mfu_ghost_size                  335566848
>     vfs.zfs.mfu_data_lsize                  1641984
>     vfs.zfs.mfu_metadata_lsize              3048448
>     vfs.zfs.mfu_size                        28561920
>     vfs.zfs.mru_ghost_data_lsize            68477440
>     vfs.zfs.mru_ghost_metadata_lsize        62875648
>     vfs.zfs.mru_ghost_size                  131353088
>     vfs.zfs.mru_data_lsize                  1651216384
>     vfs.zfs.mru_metadata_lsize              278577152
>     vfs.zfs.mru_size                        2306510848
>     vfs.zfs.anon_data_lsize                 0
>     vfs.zfs.anon_metadata_lsize             0
>     vfs.zfs.anon_size                       12968960
>     vfs.zfs.l2arc_norw                      1
>     vfs.zfs.l2arc_feed_again                1
>     vfs.zfs.l2arc_noprefetch                1
>     vfs.zfs.l2arc_feed_min_ms               200
>     vfs.zfs.l2arc_feed_secs                 1
>     vfs.zfs.l2arc_headroom                  2
>     vfs.zfs.l2arc_write_boost               8388608
>     vfs.zfs.l2arc_write_max                 8388608
>     vfs.zfs.arc_meta_limit                  917504000
>     vfs.zfs.arc_meta_used                   851157616
>     vfs.zfs.arc_min                         458752000
>     vfs.zfs.arc_max                         3670016000
>     vfs.zfs.dedup.prefetch                  1
>     vfs.zfs.mdcomp_disable                  0
>     vfs.zfs.write_limit_override            1048576000
>     vfs.zfs.write_limit_inflated            25728073728
>     vfs.zfs.write_limit_max                 1072003072
>     vfs.zfs.write_limit_min                 33554432
>     vfs.zfs.write_limit_shift               3
>     vfs.zfs.no_write_throttle               0
>     vfs.zfs.zfetch.array_rd_sz              1048576
>     vfs.zfs.zfetch.block_cap                256
>     vfs.zfs.zfetch.min_sec_reap             2
>     vfs.zfs.zfetch.max_streams              8
>     vfs.zfs.prefetch_disable                0
>     vfs.zfs.mg_alloc_failures               8
>     vfs.zfs.check_hostid                    1
>     vfs.zfs.recover                         0
>     vfs.zfs.txg.synctime_ms                 1000
>     vfs.zfs.txg.timeout                     10
>     vfs.zfs.scrub_limit                     10
>     vfs.zfs.vdev.cache.bshift               16
>     vfs.zfs.vdev.cache.size                 0
>     vfs.zfs.vdev.cache.max                  16384
>     vfs.zfs.vdev.write_gap_limit            4096
>     vfs.zfs.vdev.read_gap_limit             32768
>     vfs.zfs.vdev.aggregation_limit          131072
>     vfs.zfs.vdev.ramp_rate                  2
>     vfs.zfs.vdev.time_shift                 6
>     vfs.zfs.vdev.min_pending                4
>     vfs.zfs.vdev.max_pending                10
>     vfs.zfs.vdev.bio_flush_disable          0
>     vfs.zfs.cache_flush_disable             0
>     vfs.zfs.zil_replay_disable              0
>     vfs.zfs.zio.use_uma                     0
>     vfs.zfs.version.zpl                     5
>     vfs.zfs.version.spa                     28
>     vfs.zfs.version.acl                     1
>     vfs.zfs.debug                           0
>     vfs.zfs.super_owner                     0
>
> ------------------------------------------------------------------------

I see that you are limiting your arc.max to 3G but you have prefetch enabled.

You can try disabling this:

vfs.zfs.prefetch_disable=1

If things turn out better, you can increase your arc.max to 4G.
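
For reference, a sketch of how that tuning would typically be applied; as
far as I know, vfs.zfs.prefetch_disable is a boot-time tunable on 8.x, so
it goes into /boot/loader.conf and takes effect after a reboot. The arc_max
value below is just the 4G suggestion above spelled out:

    # /boot/loader.conf (sketch of the suggested tuning)
    vfs.zfs.prefetch_disable=1     # turn off file-level prefetch
    vfs.zfs.arc_max="4G"           # raise the ARC cap if things improve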

Regards
-- 
George Kontostanos
Aicom telecoms ltd
http://www.aisecure.net


