Date: Fri, 17 Aug 2012 15:08:03 -0700
From: Gezeala M. Bacuño II <gezeala@gmail.com>
To: Alan Cox <alc@rice.edu>
Cc: alc@freebsd.org, freebsd-performance@freebsd.org, Andrey Zonov <andrey@zonov.org>, kib@freebsd.org
Subject: Re: vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)
Message-ID: <CAJKO3mWEXUvLtdSvmjgNhhyVqw4j0DuTYm9MqLd9=i9==WLAaA@mail.gmail.com>
In-Reply-To: <502EB081.3030801@rice.edu>
References: <CAJKO3mU8bfn=jmWNSpvAXOR1AWyAAM0Sio1D1PnOYg8P59V9cg@mail.gmail.com>
 <CAGH67wS=jue7%2B92jSCyaydOLHC=hPwtndV64FVtC7nhDsPvFng@mail.gmail.com>
 <CAGH67wTNfW45pgJ_%2BVn_sX%2BP9M5B5wzPT9270dRmWjYF6KerrA@mail.gmail.com>
 <B74BE4AB-AB67-45BD-BFC3-9AE33A85751C@gmail.com>
 <502DEAD9.6050304@zonov.org>
 <CAJKO3mVWOFa9Cby_EWsf_OFHux7YBGSV7aGYSP2YANeJkqZtoQ@mail.gmail.com>
 <CAJKO3mU1NdkQwNSEDk3wWyLN700=dQ0_jSXt_sx-ABpywNjfsg@mail.gmail.com>
 <502EB081.3030801@rice.edu>
On Fri, Aug 17, 2012 at 1:58 PM, Alan Cox <alc@rice.edu> wrote:
> vm.kmem_size controls the maximum size of the kernel's heap, i.e., the
> region where the kernel's slab and malloc()-like memory allocators obtain
> their memory. While this heap may occupy the largest portion of the
> kernel's virtual address space, it cannot occupy the entirety of the address
> space. There are other things that must be given space within the kernel's
> address space, for example, the file system buffer map.
>
> ZFS does not, however, use the regular file system buffer cache. The ARC
> takes its place, and the ARC abuses the kernel's heap like nothing else.
> So, if you are running a machine that only makes trivial use of a non-ZFS
> file system, like you boot from UFS, but store all of your data in ZFS, then
> you can dramatically reduce the size of the buffer map via boot loader
> tuneables and proportionately increase vm.kmem_size.
>
> Any further increases in the kernel virtual address space size will,
> however, require code changes. Small changes, but changes nonetheless.
>
> Alan
>
> <<snip>>
>>
>> Additional Info:
>> 1] Installed using PCBSD-9 Release amd64.
>>
>> 2] uname -a
>> FreeBSD fmt-iscsi-stg1.musicreports.com 9.0-RELEASE FreeBSD
>> 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011
>>
>> root@build9x64.pcbsd.org:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC
>> amd64
>>
>> 3] first few lines from /var/run/dmesg.boot:
>> FreeBSD 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011
>>
>> root@build9x64.pcbsd.org:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC
>> amd64
>> CPU: Intel(R) Xeon(R) CPU E7- 8837 @ 2.67GHz (2666.82-MHz K8-class CPU)
>> Origin = "GenuineIntel" Id = 0x206f2 Family = 6 Model = 2f Stepping
>> = 2
>>
>> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>>
>> Features2=0x29ee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,POPCNT,AESNI>
>> AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
>> AMD Features2=0x1<LAHF>
>> TSC: P-state invariant, performance statistics
>> real memory = 549755813888 (524288 MB)
>> avail memory = 530339893248 (505771 MB)
>> Event timer "LAPIC" quality 600
>> ACPI APIC Table: <ALASKA A M I>
>> FreeBSD/SMP: Multiprocessor System Detected: 64 CPUs
>> FreeBSD/SMP: 8 package(s) x 8 core(s)
>>
>> 4] relevant sysctl's with manual tuning:
>> kern.maxusers: 384
>> kern.maxvnodes: 8222162
>> vfs.numvnodes: 675740
>> vfs.freevnodes: 417524
>> kern.ipc.somaxconn: 128
>> kern.openfiles: 5238
>> vfs.zfs.arc_max: 428422987776
>> vfs.zfs.arc_min: 53552873472
>> vfs.zfs.arc_meta_used: 3167391088
>> vfs.zfs.arc_meta_limit: 107105746944
>> vm.kmem_size_max: 429496729600 ==>> manually tuned
>> vm.kmem_size: 429496729600 ==>> manually tuned
>> vm.kmem_map_free: 107374727168
>> vm.kmem_map_size: 144625156096
>> vfs.wantfreevnodes: 2055540
>> kern.minvnodes: 2055540
>> kern.maxfiles: 197248 ==>> manually tuned
>> vm.vmtotal:
>> System wide totals computed every five seconds: (values in kilobytes)
>> ===============================================
>> Processes: (RUNQ: 1 Disk Wait: 1 Page Wait: 0 Sleep: 150)
>> Virtual Memory: (Total: 1086325716K Active: 12377876K)
>> Real Memory: (Total: 144143408K Active: 803432K)
>> Shared Virtual Memory: (Total: 81384K Active: 37560K)
>> Shared Real Memory: (Total: 32224K Active: 27548K)
>> Free Memory Pages: 365565564K
>>
>> hw.availpages: 134170294
>> hw.physmem: 549561524224
>> hw.usermem: 391395241984
>> hw.realmem: 551836188672
>> vm.kmem_size_scale: 1
>> kern.ipc.nmbclusters: 2560000 ==>> manually tuned
>> kern.ipc.maxsockbuf: 2097152
>> net.inet.tcp.sendbuf_max: 2097152
>> net.inet.tcp.recvbuf_max: 2097152
>> kern.maxfilesperproc: 18000
>> net.inet.ip.intr_queue_maxlen: 256
>> kern.maxswzone: 33554432
>> kern.ipc.shmmax: 10737418240 ==>> manually tuned
>> kern.ipc.shmall: 2621440 ==>> manually tuned
>> vfs.zfs.write_limit_override: 0
>> vfs.zfs.prefetch_disable: 0
>> hw.pagesize: 4096
>> hw.availpages: 134170294
>> kern.ipc.maxpipekva: 8586895360
>> kern.ipc.shm_use_phys: 1 ==>> manually tuned
>> vfs.vmiodirenable: 1
>> debug.numcache: 632148
>> vfs.ncsizefactor: 2
>> vm.kvm_size: 549755809792
>> vm.kvm_free: 54456741888
>> kern.ipc.semmni: 256
>> kern.ipc.semmns: 512
>> kern.ipc.semmnu: 256
>>

Thanks. It will be mainly used for PostgreSQL and Java. We have a huge db (3TB and growing) and we need to have as much of it as we can in the ZFS ARC. All data resides on zpools while root is on UFS. On our 8.2 and 9 machines, vm.kmem_size is always auto-tuned to almost the same size as the installed RAM. The only tuning I've done on those machines is to lower vfs.zfs.arc_max to 50% or 75% of vm.kmem_size; that has worked well for us and the machines do not swap. On this machine, though, I think I need to adjust my formula for tuning vfs.zfs.arc_max: reserving 25% for other stuff is probably overkill.
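To make the rule of thumb concrete, here is a rough arithmetic sketch of what 50% and 75% of a 400G vm.kmem_size come to in bytes. The helper name `arc_max_for` and the `GIB` constant are mine, purely for illustration; nothing here is taken from the kernel or from ZFS.

```python
GIB = 1024 ** 3  # bytes per GiB


def arc_max_for(kmem_size_bytes, percent):
    """Return `percent` of vm.kmem_size in whole bytes (hypothetical helper)."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    # Integer arithmetic avoids any floating-point rounding on large sizes.
    return kmem_size_bytes * percent // 100


if __name__ == "__main__":
    kmem_size = 400 * GIB  # the 400G setting that boots on this machine
    for percent in (50, 75):
        value = arc_max_for(kmem_size, percent)
        print(f"vfs.zfs.arc_max={value}  # {percent}% of kmem_size, {value // GIB} GiB")
```

A value computed this way would still have to be set by hand in /boot/loader.conf; the sketch only does the multiplication.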
We were able to successfully bump vm.kmem_size_max and vm.kmem_size to 400GB:

vm.kmem_size_max: 429496729600 ==>> manually tuned
vm.kmem_size: 429496729600 ==>> manually tuned
vfs.zfs.arc_max: 428422987776 ==>> auto-tuned (vm.kmem_size - 1G)
vfs.zfs.arc_min: 53552873472 ==>> auto-tuned

Which other tunables do I need to set in /boot/loader.conf so we can boot the machine with vm.kmem_size > 400G? As I don't know which part of the boot process fails with vm.kmem_size/_max set to 450G or 500G, I have no idea which one to tune next.

Thanks in advance.
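The reported values can be cross-checked with a few lines of Python. The 1G gap between vm.kmem_size and vfs.zfs.arc_max is the one noted in the listing; the ratio between arc_max and arc_min is only what these particular numbers show, and treating it as a general rule would be an assumption. The function name `summarize` is made up for this sketch.

```python
GIB = 1024 ** 3  # bytes per GiB

# Values copied from the sysctl output in this message.
KMEM_SIZE = 429496729600
ARC_MAX = 428422987776
ARC_MIN = 53552873472


def summarize(kmem_size, arc_max, arc_min):
    """Express the reported byte counts as GiB figures and a ratio."""
    return {
        "kmem_size_gib": kmem_size / GIB,
        "kmem_minus_arc_max_gib": (kmem_size - arc_max) / GIB,
        "arc_max_over_arc_min": arc_max / arc_min,
    }


if __name__ == "__main__":
    for name, value in summarize(KMEM_SIZE, ARC_MAX, ARC_MIN).items():
        print(f"{name}: {value}")
```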