Date:      Wed, 15 Feb 2012 12:36:41 +0100
From:      Peter Maloney <peter.maloney@brockmann-consult.de>
To:        freebsd-fs@freebsd.org
Subject:   Re: ZFS and mem management
Message-ID:  <4F3B98C9.1090400@brockmann-consult.de>
In-Reply-To: <92617.1329301696.6338962447434776576@ffe5.ukr.net>
References:  <15861.1329298812.1414986334451204096@ffe12.ukr.net> <CA+dUSyqKoqrfD_cgsfupsuZRE0O6dH-4F1roLp_GFaeBKJkN-w@mail.gmail.com> <92617.1329301696.6338962447434776576@ffe5.ukr.net>

Can you also post:

zpool get all <poolname>


And does your indexing scan through the .zfs/snapshot directory? If so,
this is a known issue that eats all of your memory and ends in
out-of-swap errors.
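
In that case, one workaround (assuming your indexer simply walks every
directory it sees) is to keep the snapshot directory out of directory
listings; snapdir defaults to hidden, so check whether it was set to
visible:

zfs get snapdir <poolname>/<dataset>
zfs set snapdir=hidden <poolname>/<dataset>

Failing that, teach the indexer to prune any .zfs path, e.g. with
find(1) (the path here is just an example):

find /disk1 -name .zfs -prune -o -type f -print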



On 02/15/2012 11:28 AM, Pavlo wrote:
>
>
> Hey George,
>
> thanks for quick response.
>
> No, no dedup is used.
>
> zfs-stats -a :
>
> ------------------------------------------------------------------------
> ZFS Subsystem Report    Wed Feb 15 12:26:18 2012
> ------------------------------------------------------------------------
>
> System Information:
>
> Kernel Version:    802516 (osreldate)
> Hardware Platform:    amd64
> Processor Architecture:    amd64
>
> ZFS Storage pool Version:    28
> ZFS Filesystem Version:    5
>
> FreeBSD 8.2-STABLE #12: Thu Feb 9 11:35:23 EET 2012 root
> 12:26PM  up  2:29, 7 users, load averages: 0.02, 0.16, 0.16
>
> ------------------------------------------------------------------------
>
> System Memory:
>
> 19.78%    1.53    GiB Active,    0.95%    75.21    MiB Inact
> 36.64%    2.84    GiB Wired,    0.06%    4.83    MiB Cache
> 42.56%    3.30    GiB Free,    0.01%    696.00    KiB Gap
>
> Real Installed:    8.00    GiB
> Real Available:    99.84%    7.99    GiB
> Real Managed:    96.96%    7.74    GiB
>
> Logical Total:    8.00    GiB
> Logical Used:    57.82%    4.63    GiB
> Logical Free:    42.18%    3.37    GiB
>
> Kernel Memory:    2.43    GiB
> Data:    99.54%    2.42    GiB
> Text:    0.46%    11.50    MiB
>
> Kernel Memory Map:    3.16    GiB
> Size:    69.69%    2.20    GiB
> Free:    30.31%    979.48    MiB
>
> ------------------------------------------------------------------------
>
> ARC Summary: (THROTTLED)
> Memory Throttle Count:    3.82k
>
> ARC Misc:
> Deleted:    874.34k
> Recycle Misses:    376.12k
> Mutex Misses:    4.74k
> Evict Skips:    4.74k
>
> ARC Size:    68.53%    2.34    GiB
> Target Size: (Adaptive)    68.54%    2.34    GiB
> Min Size (Hard Limit):    12.50%    437.50    MiB
> Max Size (High Water):    8:1    3.42    GiB
>
> ARC Size Breakdown:
> Recently Used Cache Size:    92.95%    2.18    GiB
> Frequently Used Cache Size:    7.05%    169.01    MiB
>
> ARC Hash Breakdown:
> Elements Max:    229.96k
> Elements Current:    40.05%    92.10k
> Collisions:    705.52k
> Chain Max:    11
> Chains:    20.64k
>
> ------------------------------------------------------------------------
>
> ARC Efficiency:    7.96m
> Cache Hit Ratio:    84.92%    6.76m
> Cache Miss Ratio:    15.08%    1.20m
> Actual Hit Ratio:    76.29%    6.08m
>
> Data Demand Efficiency:    91.32%    4.99m
> Data Prefetch Efficiency:    19.57%    134.19k
>
> CACHE HITS BY CACHE LIST:
> Anonymously Used:    7.24%    489.41k
> Most Recently Used:    25.29%    1.71m
> Most Frequently Used:    64.54%    4.37m
> Most Recently Used Ghost:    1.42%    95.77k
> Most Frequently Used Ghost:    1.51%    102.33k
>
> CACHE HITS BY DATA TYPE:
> Demand Data:    67.42%    4.56m
> Prefetch Data:    0.39%    26.26k
> Demand Metadata:    22.41%    1.52m
> Prefetch Metadata:    9.78%    661.25k
>
> CACHE MISSES BY DATA TYPE:
> Demand Data:    36.11%    433.60k
> Prefetch Data:    8.99%    107.94k
> Demand Metadata:    32.00%    384.29k
> Prefetch Metadata:    22.91%    275.09k
>
> ------------------------------------------------------------------------
>
> L2ARC is disabled
>
> ------------------------------------------------------------------------
>
> File-Level Prefetch: (HEALTHY)
>
> DMU Efficiency:    26.49m
> Hit Ratio:    71.64%    18.98m
> Miss Ratio:    28.36%    7.51m
>
> Colinear:    7.51m
> Hit Ratio:    0.02%    1.42k
> Miss Ratio:    99.98%    7.51m
>
> Stride:    18.85m
> Hit Ratio:    99.97%    18.85m
> Miss Ratio:    0.03%    5.73k
>
> DMU Misc:
> Reclaim:    7.51m
> Successes:    0.29%    21.58k
> Failures:    99.71%    7.49m
>
> Streams:    130.46k
> +Resets:    0.35%    461
> -Resets:    99.65%    130.00k
> Bogus:    0
>
> ------------------------------------------------------------------------
>
> VDEV cache is disabled
>
> ------------------------------------------------------------------------
>
> ZFS Tunables (sysctl):
> kern.maxusers                           384
> vm.kmem_size                            4718592000
> vm.kmem_size_scale                      1
> vm.kmem_size_min                        0
> vm.kmem_size_max                        329853485875
> vfs.zfs.l2c_only_size                   0
> vfs.zfs.mfu_ghost_data_lsize            2705408
> vfs.zfs.mfu_ghost_metadata_lsize        332861440
> vfs.zfs.mfu_ghost_size                  335566848
> vfs.zfs.mfu_data_lsize                  1641984
> vfs.zfs.mfu_metadata_lsize              3048448
> vfs.zfs.mfu_size                        28561920
> vfs.zfs.mru_ghost_data_lsize            68477440
> vfs.zfs.mru_ghost_metadata_lsize        62875648
> vfs.zfs.mru_ghost_size                  131353088
> vfs.zfs.mru_data_lsize                  1651216384
> vfs.zfs.mru_metadata_lsize              278577152
> vfs.zfs.mru_size                        2306510848
> vfs.zfs.anon_data_lsize                 0
> vfs.zfs.anon_metadata_lsize             0
> vfs.zfs.anon_size                       12968960
> vfs.zfs.l2arc_norw                      1
> vfs.zfs.l2arc_feed_again                1
> vfs.zfs.l2arc_noprefetch                1
> vfs.zfs.l2arc_feed_min_ms               200
> vfs.zfs.l2arc_feed_secs                 1
> vfs.zfs.l2arc_headroom                  2
> vfs.zfs.l2arc_write_boost               8388608
> vfs.zfs.l2arc_write_max                 8388608
> vfs.zfs.arc_meta_limit                  917504000
> vfs.zfs.arc_meta_used                   851157616
> vfs.zfs.arc_min                         458752000
> vfs.zfs.arc_max                         3670016000
> vfs.zfs.dedup.prefetch                  1
> vfs.zfs.mdcomp_disable                  0
> vfs.zfs.write_limit_override            1048576000
> vfs.zfs.write_limit_inflated            25728073728
> vfs.zfs.write_limit_max                 1072003072
> vfs.zfs.write_limit_min                 33554432
> vfs.zfs.write_limit_shift               3
> vfs.zfs.no_write_throttle               0
> vfs.zfs.zfetch.array_rd_sz              1048576
> vfs.zfs.zfetch.block_cap                256
> vfs.zfs.zfetch.min_sec_reap             2
> vfs.zfs.zfetch.max_streams              8
> vfs.zfs.prefetch_disable                0
> vfs.zfs.mg_alloc_failures               8
> vfs.zfs.check_hostid                    1
> vfs.zfs.recover                         0
> vfs.zfs.txg.synctime_ms                 1000
> vfs.zfs.txg.timeout                     10
> vfs.zfs.scrub_limit                     10
> vfs.zfs.vdev.cache.bshift               16
> vfs.zfs.vdev.cache.size                 0
> vfs.zfs.vdev.cache.max                  16384
> vfs.zfs.vdev.write_gap_limit            4096
> vfs.zfs.vdev.read_gap_limit             32768
> vfs.zfs.vdev.aggregation_limit          131072
> vfs.zfs.vdev.ramp_rate                  2
> vfs.zfs.vdev.time_shift                 6
> vfs.zfs.vdev.min_pending                4
> vfs.zfs.vdev.max_pending                10
> vfs.zfs.vdev.bio_flush_disable          0
> vfs.zfs.cache_flush_disable             0
> vfs.zfs.zil_replay_disable              0
> vfs.zfs.zio.use_uma                     0
> vfs.zfs.version.zpl                     5
> vfs.zfs.version.spa                     28
> vfs.zfs.version.acl                     1
> vfs.zfs.debug                           0
> vfs.zfs.super_owner                     0
>
> ------------------------------------------------------------------------
>
>
>
>
>
> 2012/2/15 Pavlo <devgs@ukr.net>:
>>
>>
>> Hello.
>>
>> We have an issue with memory management on FreeBSD, and I suspect it
>> is related to the filesystem. We are using ZFS; here are quick stats:
>>
>>
>> zpool status
>>   pool: disk1
>>  state: ONLINE
>>   scan: resilvered 657G in 8h30m with 0 errors on Tue Feb 14 21:17:37 2012
>> config:
>>
>> NAME            STATE     READ WRITE CKSUM
>> disk1           ONLINE       0     0     0
>>   mirror-0      ONLINE       0     0     0
>>     gpt/disk0   ONLINE       0     0     0
>>     gpt/disk1   ONLINE       0     0     0
>>   gpt/disk2     ONLINE       0     0     0
>>   gpt/disk4     ONLINE       0     0     0
>>   gpt/disk6     ONLINE       0     0     0
>>   gpt/disk8     ONLINE       0     0     0
>>   gpt/disk10    ONLINE       0     0     0
>>   gpt/disk12    ONLINE       0     0     0
>>   mirror-7      ONLINE       0     0     0
>>     gpt/disk14  ONLINE       0     0     0
>>     gpt/disk15  ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: zroot
>>  state: ONLINE
>>   scan: resilvered 34.9G in 0h11m with 0 errors on Tue Feb 14 12:57:52 2012
>> config:
>>
>> NAME          STATE     READ WRITE CKSUM
>> zroot         ONLINE       0     0     0
>>   mirror-0    ONLINE       0     0     0
>>     gpt/sys0  ONLINE       0     0     0
>>     gpt/sys1  ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> ------------------------------------------------------------------------
>>
>> System Memory:
>>
>> 0.95%    75.61    MiB Active,    0.24%    19.02    MiB Inact
>> 18.25%    1.41    GiB Wired,    0.01%    480.00    KiB Cache
>> 80.54%    6.24    GiB Free,    0.01%    604.00    KiB Gap
>>
>> Real Installed:    8.00    GiB
>> Real Available:    99.84%    7.99    GiB
>> Real Managed:    96.96%    7.74    GiB
>>
>> Logical Total:    8.00    GiB
>> Logical Used:    21.79%    1.74    GiB
>> Logical Free:    78.21%    6.26    GiB
>>
>> Kernel Memory:    1.18    GiB
>> Data:    99.05%    1.17    GiB
>> Text:    0.95%    11.50    MiB
>>
>> Kernel Memory Map:    4.39    GiB
>> Size:    23.32%    1.02    GiB
>> Free:    76.68%    3.37    GiB
>>
>> ------------------------------------------------------------------------
>>
>> ------------------------------------------------------------------------
>> ZFS Subsystem Report    Wed Feb 15 10:53:03 2012
>> ------------------------------------------------------------------------
>>
>> System Information:
>>
>> Kernel Version:    802516 (osreldate)
>> Hardware Platform:    amd64
>> Processor Architecture:    amd64
>>
>> ZFS Storage pool Version:    28
>> ZFS Filesystem Version:    5
>>
>> FreeBSD 8.2-STABLE #12: Thu Feb 9 11:35:23 EET 2012 root
>> 10:53AM  up 56 mins, 6 users, load averages: 0.00, 0.00, 0.00
>>
>> ------------------------------------------------------------------------
>>
>>
>>
>>
>> Background:
>> we are using a tool that indexes some data and then pushes it into a
>> database (currently bdb-5.2). Instances of the indexer run
>> continuously, one after another. The indexing time for one instance
>> varies between 2 seconds and 30 minutes, but mostly it is below one
>> minute. There is nothing else running on the machine except system
>> services and daemons. After several hours of indexing I can see a lot
>> of active memory, which is fine. Then I check the number of vnodes,
>> and it is really huge: 300k+, even though nobody has that many open
>> files. Reading the docs and googling, I figured that is because of
>> cached pages that reside in memory (unmounting the disk causes the
>> whole memory to be freed). I also figured this happens only when I
>> access files via mmap().
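>>
>> I check the vnode usage with the stock sysctls, roughly:
>>
>> sysctl vfs.numvnodes kern.maxvnodes
>>
>> and vfs.numvnodes is where the 300k+ figure comes from.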
>>
>> It looks like pretty legitimate behaviour, but here is the issue:
>> this goes on for approximately 12 hours, until indexers start getting
>> killed for running out of swap. As I wrote above, I observe a lot of
>> used vnodes and around 7 GB of active memory. I made a tool that
>> allocates memory with malloc() to check how much memory can still be
>> allocated. It is several megabytes, sometimes more, unless that tool
>> gets killed out of swap as well. So this is how I see the issue: for
>> some reason, after a process has exited normally, its mapped pages do
>> not get freed. I have read about this, and I agree it is reasonable
>> behaviour while there is spare memory. But by that logic those pages
>> could be flushed back to the file at any time the system comes under
>> memory pressure, so when I ask for a piece of RAM, the OS should do
>> exactly that and give me what I asked for. That never happens. Those
>> pages stay frozen until I unmount the disk, even when not a single
>> instance of the indexer is running.
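>>
>> The probe is nothing special; a minimal sketch of what it does, not
>> the exact code:
>>
>> /* allocate in 1 MiB steps and touch every page so the memory is
>>  * actually backed, until malloc() fails or we get killed */
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>>
>> int main(void)
>> {
>>     size_t total = 0;
>>     const size_t step = 1024 * 1024;    /* 1 MiB per allocation */
>>
>>     for (;;) {
>>         char *p = malloc(step);
>>         if (p == NULL)
>>             break;                      /* allocation refused */
>>         memset(p, 1, step);             /* fault the pages in */
>>         total += step;
>>     }
>>     printf("allocated %zu MiB before failure\n", total / step);
>>     return 0;
>> }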
>>
>> I am sure all of this is caused by mmap(): BDB uses mmap() to access
>> its databases, and when we tested indexing without pushing data to
>> the DB, everything worked flawlessly. You might suggest something is
>> wrong with BDB, but we have more tools of our own that use mmap() as
>> well, and the behaviour is exactly the same.
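>>
>> For what it is worth, all of these tools touch files through the
>> plain mmap() idiom; a rough sketch of the pattern, not our exact code:
>>
>> #include <sys/mman.h>
>> #include <sys/stat.h>
>> #include <fcntl.h>
>> #include <stdio.h>
>> #include <unistd.h>
>>
>> int main(int argc, char **argv)
>> {
>>     struct stat st;
>>     char *p;
>>     long sum = 0;
>>     int fd;
>>
>>     if (argc < 2 || (fd = open(argv[1], O_RDONLY)) < 0 ||
>>         fstat(fd, &st) < 0)
>>         return 1;
>>
>>     p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
>>     if (p == MAP_FAILED)
>>         return 1;
>>
>>     /* touch every page; the kernel keeps these pages cached */
>>     for (off_t i = 0; i < st.st_size; i += 4096)
>>         sum += p[i];
>>
>>     munmap(p, st.st_size);  /* unmapped, yet the pages stay resident */
>>     close(fd);
>>     printf("%ld\n", sum);
>>     return 0;
>> }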
>>
>> Thank you. Paul, Ukraine.
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> Hi Paul,
>
> Are you using dedup anywhere on that pool?
>
> Also, could you please post the full output of zfs-stats -a?
>
>


-- 

--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------



