From owner-freebsd-performance@FreeBSD.ORG Tue Aug 28 19:47:54 2012
From: Gezeala M. Bacuño II <gezeala@gmail.com>
Date: Tue, 28 Aug 2012 12:47:29 -0700
To: Alan Cox
Cc: alc@freebsd.org, freebsd-performance@freebsd.org, Andrey Zonov, kib@freebsd.org
Subject: Re: vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)

On Tue, Aug 28, 2012 at 12:07 PM, Alan Cox wrote:
> On 08/27/2012 17:23, Gezeala M. Bacuño II wrote:
>>
>> On Thu, Aug 23, 2012 at 12:02 PM, Alan Cox wrote:
>>>
>>> On 08/22/2012 12:09, Gezeala M. Bacuño II wrote:
>>>>
>>>> On Tue, Aug 21, 2012 at 4:24 PM, Alan Cox wrote:
>>>>>
>>>>> On 8/20/2012 8:26 PM, Gezeala M. Bacuño II wrote:
>>>>>>
>>>>>> On Mon, Aug 20, 2012 at 9:07 AM, Gezeala M. Bacuño II wrote:
>>>>>>>
>>>>>>> On Mon, Aug 20, 2012 at 8:22 AM, Alan Cox wrote:
>>>>>>>>
>>>>>>>> On 08/18/2012 19:57, Gezeala M. Bacuño II wrote:
>>>>>>>>>
>>>>>>>>> On Sat, Aug 18, 2012 at 12:14 PM, Alan Cox wrote:
>>>>>>>>>>
>>>>>>>>>> On 08/17/2012 17:08, Gezeala M. Bacuño II wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Aug 17, 2012 at 1:58 PM, Alan Cox wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> vm.kmem_size controls the maximum size of the kernel's heap,
>>>>>>>>>>>> i.e., the region where the kernel's slab and malloc()-like
>>>>>>>>>>>> memory allocators obtain their memory.  While this heap may
>>>>>>>>>>>> occupy the largest portion of the kernel's virtual address
>>>>>>>>>>>> space, it cannot occupy the entirety of the address space.
>>>>>>>>>>>> There are other things that must be given space within the
>>>>>>>>>>>> kernel's address space, for example, the file system buffer
>>>>>>>>>>>> map.
>>>>>>>>>>>>
>>>>>>>>>>>> ZFS does not, however, use the regular file system buffer
>>>>>>>>>>>> cache.  The ARC takes its place, and the ARC abuses the
>>>>>>>>>>>> kernel's heap like nothing else.  So, if you are running a
>>>>>>>>>>>> machine that only makes trivial use of a non-ZFS file
>>>>>>>>>>>> system, like you boot from UFS, but store all of your data
>>>>>>>>>>>> in ZFS, then you can dramatically reduce the size of the
>>>>>>>>>>>> buffer map via boot loader tuneables and proportionately
>>>>>>>>>>>> increase vm.kmem_size.
>>>>>>>>>>>>
>>>>>>>>>>>> Any further increases in the kernel virtual address space
>>>>>>>>>>>> size will, however, require code changes.  Small changes,
>>>>>>>>>>>> but changes nonetheless.
>>>>>>>>>>>>
>>>>>>>>>>>> Alan
>>>>>>>>>>>>
>>>>>>> <>
>>>>>>>>>>
>>>>>>>>>> Your objective should be to reduce the value of "sysctl
>>>>>>>>>> vfs.maxbufspace".  You can do this by setting the loader.conf
>>>>>>>>>> tuneable "kern.maxbcache" to the desired value.
>>>>>>>>>>
>>>>>>>>>> What does your machine currently report for "sysctl
>>>>>>>>>> vfs.maxbufspace"?
>>>>>>>>>>
>>>>>>>>> Here you go:
>>>>>>>>> vfs.maxbufspace: 54967025664
>>>>>>>>> kern.maxbcache: 0
>>>>>>>>
>>>>>>>> Try setting kern.maxbcache to two billion and adding 50 billion
>>>>>>>> to the setting of vm.kmem_size{,_max}.
>>>>>>>>
>>>>>> 2 : 50 ==>> is this the ratio for further tuning
>>>>>> kern.maxbcache:vm.kmem_size?  Is kern.maxbcache also in bytes?
>>>>>>
>>>>> No, this is not a ratio.  Yes, kern.maxbcache is in bytes.
>>>>> Basically, for every byte that you subtract from vfs.maxbufspace,
>>>>> through setting kern.maxbcache, you can add a byte to
>>>>> vm.kmem_size{,_max}.
>>>>>
>>>>> Alan
>>>>>
>>>> Great! Thanks.  Are there other sysctls aside from vfs.bufspace
>>>> that I should monitor for vfs.maxbufspace usage?  I just want to
>>>> make sure that vfs.maxbufspace is sufficient for our needs.
>>>
>>> You might keep an eye on "sysctl vfs.bufdefragcnt".  If it starts
>>> rapidly increasing, you may want to increase vfs.maxbufspace.
>>>
>>> Alan
>>>
>> We seem to max out vfs.bufspace in <24hrs uptime.  It has been
>> steady at 1999273984 while vfs.bufdefragcnt stays at 0 - which I
>> presume is good.  Nevertheless, I will increase kern.maxbcache to
>> 6GB and adjust vm.kmem_size{,_max}, vfs.zfs.arc_max accordingly.
>> On another machine with vfs.maxbufspace auto-tuned to 7738671104
>> (~7.2GB), vfs.bufspace is now at 5278597120 (uptime 129 days).
>
> The buffer map is a kind of cache.  Like any cache, most of the time
> it will be full.  Don't worry.
>
> Moreover, even when the buffer map is full, the UFS file system is
> caching additional file data in physical memory pages that simply
> aren't mapped for instantaneous access.  Essentially, limiting the
> size of the buffer map is only limiting the amount of modified file
> data that hasn't been written back to disk, not the total amount of
> cached data.
>
> As long as you're making trivial use of UFS file systems, there
> really isn't a reason to increase the buffer map size.
>
> Alan
>

I see. Makes sense now. Thanks!
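
For the archives, here is roughly what I plan to put in /boot/loader.conf
based on the advice above. The tunables are the ones discussed in this
thread; the byte values are only a sketch (kern.maxbcache around two
billion as suggested, with the difference added back to
vm.kmem_size{,_max}) and still need to be sized against this machine's
RAM:

    # shrink the buffer map; every byte subtracted here can be added
    # to the kernel heap instead
    kern.maxbcache="2000000000"
    # kernel heap ceiling: previous cap plus roughly what was taken
    # away from the buffer map
    vm.kmem_size="380000000000"
    vm.kmem_size_max="380000000000"
    # keep the ARC comfortably below vm.kmem_size
    vfs.zfs.arc_max="360000000000"

After rebooting I will keep watching the buffer map with:

    sysctl vfs.maxbufspace vfs.bufspace vfs.bufdefragcnt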
I forgot to mention that we also have smbfs mounts from another server.
Are writes/modifications to files on these mounts also cached in the
buffer map? Is that the case for all non-ZFS file systems? Input/output
files are read from and written to these mounts.
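
In case it helps with the answer, this is how I have been listing the
mounts that would go through the buffer cache; just plain mount(8)
output filtered with grep, nothing more exotic assumed:

    # the smbfs mounts from the other server
    mount | grep smbfs
    # everything that is not ZFS, i.e. what could use the buffer map
    mount | grep -v zfs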