Date:      Tue, 28 Aug 2012 14:07:40 -0500
From:      Alan Cox <alc@rice.edu>
To:        "Gezeala M. Bacuño II" <gezeala@gmail.com>
Cc:        alc@freebsd.org, freebsd-performance@freebsd.org, Andrey Zonov <andrey@zonov.org>, kib@freebsd.org
Subject:   Re: vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)
Message-ID:  <503D16FC.2080903@rice.edu>
In-Reply-To: <CAJKO3mW+J55NFJiJS4sULi9Bq23ZCSj_oBxGN407YhJL=EqvWg@mail.gmail.com>
References:  <CAJKO3mU8bfn=jmWNSpvAXOR1AWyAAM0Sio1D1PnOYg8P59V9cg@mail.gmail.com> <CAGH67wS=jue7+92jSCyaydOLHC=hPwtndV64FVtC7nhDsPvFng@mail.gmail.com> <CAGH67wTNfW45pgJ_+Vn_sX+P9M5B5wzPT9270dRmWjYF6KerrA@mail.gmail.com> <B74BE4AB-AB67-45BD-BFC3-9AE33A85751C@gmail.com> <502DEAD9.6050304@zonov.org> <CAJKO3mVWOFa9Cby_EWsf_OFHux7YBGSV7aGYSP2YANeJkqZtoQ@mail.gmail.com> <CAJKO3mU1NdkQwNSEDk3wWyLN700=dQ0_jSXt_sx-ABpywNjfsg@mail.gmail.com> <502EB081.3030801@rice.edu> <CAJKO3mWEXUvLtdSvmjgNhhyVqw4j0DuTYm9MqLd9=i9==WLAaA@mail.gmail.com> <502FE98E.40807@rice.edu> <CAJKO3mVUMRfkUpSuk0fDdnEMc3hr087iH5u8b5N60CnPs-gP1g@mail.gmail.com> <50325634.7090904@rice.edu> <CAJKO3mXPZVhLo=si+EoFPGD5R_m297xedRFY-0N__WOsZBaiCA@mail.gmail.com> <CAJKO3mXQ2_XrdxWgE6JRVOpMu_cEBa_=nJCxFDJ+J=f5_OUsPQ@mail.gmail.com> <503418C0.5000901@rice.edu> <CAJKO3mUkjEbY=t6K5MGphMQ_myxUHnScP8gy8v3J+ARFMf15=g@mail.gmail.com> <50367E5D.1020702@rice.edu> <CAJKO3mW+J55NFJiJS4sULi9Bq23ZCSj_oBxGN407YhJL=EqvWg@mail.gmail.com>

On 08/27/2012 17:23, Gezeala M. Bacuño II wrote:
> On Thu, Aug 23, 2012 at 12:02 PM, Alan Cox <alc@rice.edu> wrote:
>> On 08/22/2012 12:09, Gezeala M. Bacuño II wrote:
>>> On Tue, Aug 21, 2012 at 4:24 PM, Alan Cox <alc@rice.edu> wrote:
>>>> On 8/20/2012 8:26 PM, Gezeala M. Bacuño II wrote:
>>>>> On Mon, Aug 20, 2012 at 9:07 AM, Gezeala M. Bacuño II <gezeala@gmail.com> wrote:
>>>>>> On Mon, Aug 20, 2012 at 8:22 AM, Alan Cox <alc@rice.edu> wrote:
>>>>>>> On 08/18/2012 19:57, Gezeala M. Bacuño II wrote:
>>>>>>>> On Sat, Aug 18, 2012 at 12:14 PM, Alan Cox <alc@rice.edu> wrote:
>>>>>>>>> On 08/17/2012 17:08, Gezeala M. Bacuño II wrote:
>>>>>>>>>> On Fri, Aug 17, 2012 at 1:58 PM, Alan Cox <alc@rice.edu> wrote:
>>>>>>>>>>> vm.kmem_size controls the maximum size of the kernel's heap, i.e., the
>>>>>>>>>>> region where the kernel's slab and malloc()-like memory allocators
>>>>>>>>>>> obtain their memory.  While this heap may occupy the largest portion
>>>>>>>>>>> of the kernel's virtual address space, it cannot occupy the entirety
>>>>>>>>>>> of the address space.  There are other things that must be given space
>>>>>>>>>>> within the kernel's address space, for example, the file system buffer
>>>>>>>>>>> map.
>>>>>>>>>>>
>>>>>>>>>>> ZFS does not, however, use the regular file system buffer cache.  The
>>>>>>>>>>> ARC takes its place, and the ARC abuses the kernel's heap like nothing
>>>>>>>>>>> else.  So, if you are running a machine that only makes trivial use of
>>>>>>>>>>> a non-ZFS file system, like you boot from UFS, but store all of your
>>>>>>>>>>> data in ZFS, then you can dramatically reduce the size of the buffer
>>>>>>>>>>> map via boot loader tuneables and proportionately increase
>>>>>>>>>>> vm.kmem_size.
>>>>>>>>>>>
>>>>>>>>>>> Any further increases in the kernel virtual address space size will,
>>>>>>>>>>> however, require code changes.  Small changes, but changes
>>>>>>>>>>> nonetheless.
>>>>>>>>>>>
>>>>>>>>>>> Alan
>>>>>>>>>>>
>>>>>> <<snip>>
>>>>>>>>> Your objective should be to reduce the value of "sysctl
>>>>>>>>> vfs.maxbufspace".  You can do this by setting the loader.conf tuneable
>>>>>>>>> "kern.maxbcache" to the desired value.
>>>>>>>>>
>>>>>>>>> What does your machine currently report for "sysctl vfs.maxbufspace"?
>>>>>>>>>
>>>>>>>> Here you go:
>>>>>>>> vfs.maxbufspace: 54967025664
>>>>>>>> kern.maxbcache: 0
>>>>>>>
>>>>>>> Try setting kern.maxbcache to two billion and adding 50 billion to
>>>>>>> the setting of vm.kmem_size{,_max}.
>>>>>>>
>>>>> 2 : 50 ==>>   is this the ratio for further tuning
>>>>> kern.maxbcache:vm.kmem_size? Is kern.maxbcache also in bytes?
>>>>>
>>>> No, this is not a ratio.  Yes, kern.maxbcache is in bytes.  Basically,
>>>> for every byte that you subtract from vfs.maxbufspace, through setting
>>>> kern.maxbcache, you can add a byte to vm.kmem_size{,_max}.
>>>>
>>>> Alan
>>>>
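The byte-for-byte trade described above can be sanity-checked with a little arithmetic. A sketch using the numbers reported earlier in this thread (the reported vfs.maxbufspace of 54967025664 and the proposed kern.maxbcache of two billion); actual auto-tuned values will differ per machine:

```python
# Byte-for-byte trade between the buffer map and the kernel heap,
# using values reported earlier in the thread (illustrative only).

old_maxbufspace = 54_967_025_664   # reported "sysctl vfs.maxbufspace" (~51 GB)
new_maxbcache = 2_000_000_000      # proposed kern.maxbcache ("two billion")

# Shrinking the buffer map frees roughly this much kernel virtual
# address space, which can be handed to the heap instead:
freed = old_maxbufspace - new_maxbcache
print(freed)  # 52967025664, i.e. roughly the "50 billion" suggested above

# The kmem cap from the subject line, grown by the freed amount:
old_kmem_size = 329_853_485_875
new_kmem_size = old_kmem_size + freed
print(new_kmem_size)  # 382820511539
```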
>>> Great! Thanks. Are there other sysctls aside from vfs.bufspace that I
>>> should monitor for vfs.maxbufspace usage? I just want to make sure
>>> that vfs.maxbufspace is sufficient for our needs.
>>
>> You might keep an eye on "sysctl vfs.bufdefragcnt".  If it starts rapidly
>> increasing, you may want to increase vfs.maxbufspace.
>>
>> Alan
>>
> We seem to max out vfs.bufspace in <24 hrs uptime. It has been steady
> at 1999273984 while vfs.bufdefragcnt stays at 0 - which I presume is
> good. Nevertheless, I will increase kern.maxbcache to 6GB and adjust
> vm.kmem_size{,_max}, vfs.zfs.arc_max accordingly. On another machine
> with vfs.maxbufspace auto-tuned to 7738671104 (~7.2GB), vfs.bufspace
> is now at 5278597120 (uptime 129 days).
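For reference, the plan described above might look something like the following in /boot/loader.conf. This is only a sketch: the values are illustrative, loader tuneables accept "G" size suffixes, and vfs.zfs.arc_max should be sized to leave headroom below vm.kmem_size.

```
# /boot/loader.conf -- illustrative values only
kern.maxbcache="6G"       # caps vfs.maxbufspace at roughly 6 GB
vm.kmem_size="384G"       # heap grown by about what the buffer map gave up
vm.kmem_size_max="384G"
vfs.zfs.arc_max="300G"    # keep the ARC comfortably below vm.kmem_size
```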

The buffer map is a kind of cache.  Like any cache, most of the time it 
will be full.  Don't worry.

Moreover, even when the buffer map is full, the UFS file system is 
caching additional file data in physical memory pages that simply aren't 
mapped for instantaneous access.  Essentially, limiting the size of the 
buffer map is only limiting the amount of modified file data that hasn't 
been written back to disk, not the total amount of cached data.

As long as you're making only trivial use of UFS file systems, there really
isn't a reason to increase the buffer map size.

Alan




