From: Chuck Swiger
Date: Wed, 26 Sep 2007 11:21:27 -0700
To: Darren Reed
Cc: Kris Kennaway, freebsd-current@freebsd.org, Larry Rosenman
Subject: Re: panic: kmem_malloc(131072): kmem_map too small (AMD64)

On Sep 26, 2007, at 3:02 AM, Darren Reed wrote:
>> Yes, Solaris does something architecturally different because it
>> is apparently acceptable for zfs to use gigabytes of memory by
>> default.
>
> Well, if you were designing a file system for servers, is there any
> reason that you wouldn't try to use all of the RAM available?
>
> A similar thought process goes into having a unified buffer cache that
> uses all the free RAM that it can (on a 1.5GB NetBSD box, 1.4GB
> is file cache.)

This is a fine example. One of the nice notions of a "unified buffer
cache" is that you should only store data once in physical RAM, and
just use VMOs to provide additional references (perhaps mapped
copy-on-write) rather than double-buffering stuff between a process's
address space and dedicated kernel disk I/O buffers before it can be
read or written out.

A key factor to note is that the buffer cache can be paged out as
needed (by definition, pretty much), but historically, kernel memory
was "wired down" to prevent people from paging out critical things
like the VM subsystem or disk drivers. Wired down memory is (or
perhaps was) a scarce resource, which is why things like the memory
disk [md] implementation recommend using swap-based backing rather
than kernel malloc(9)-based backing.

I'm not certain whether FreeBSD's kernel memory allocator [malloc(9),
zone(9)] even supports the notion of allocating pageable memory rather
than memory taken from the fixed KVA region.
The manpage implies that calling kernel malloc with M_WAITOK will
always return a valid pointer and not NULL, but I'm not convinced this
will be true if you try allocating something larger than the size of
KVA and/or the amount of physical RAM available in the system.

> Even if I'm running a desktop workstation, if I'm not there, there's
> no reason that ZFS shouldn't be able to encourage the OS to swap
> out all of the applications (well as much as can be) if it so desires.
>
> The problem comes in deciding how strong ZFS's hold should be
> and how to apply pressure from other parts of the system that
> want to use the RAM as well.

Obviously, it does no good to page out your buffer cache to swap, so
if the system is under enough memory pressure to want to use that
memory for other tasks, then the right thing to do is to shrink the
buffer cache somewhat to attempt to minimize the global page-fault
frequency rate.

> Now given that we have a ZFS tuning guide, surely the question
> we need to ask ourselves is why can't we take the recommendations
> from that and make up some code to implement the trends discussed?
>
> And how do we measure how much memory ZFS is using?

"vmstat -m", or maybe there are some sysctls being exposed by ZFS with
that info?

Regards,
--
-Chuck
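
P.S. The "additional references (perhaps mapped copy-on-write)" idea
from the unified buffer cache discussion can be seen from userland.
What follows is only a rough sketch, not anything taken from the
FreeBSD kernel: the function name cow_demo and the scratch file are
made up for illustration, and it assumes a Unix-like system where
Python's mmap.ACCESS_COPY corresponds to a private, copy-on-write
mapping. Two mappings of one file start out referring to the same
data; writing through the private one changes only that mapping.

```python
import mmap
import os
import tempfile


def cow_demo():
    """Map one file twice and write through the copy-on-write mapping.

    Returns (read-only view, private view, bytes on disk).
    Hypothetical helper, for illustration only.
    """
    # Scratch file standing in for file data held in the cache.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"original")

        # Read-only mapping: another reference to the file's data.
        shared = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
        # Copy-on-write mapping: writes stay private to this mapping
        # and are never written back to the file.
        private = mmap.mmap(fd, 0, access=mmap.ACCESS_COPY)

        # Same-length overwrite through the private mapping.
        private[0:8] = b"modified"

        shared_view = bytes(shared[:8])
        private_view = bytes(private[:8])
        with open(path, "rb") as f:
            on_disk = f.read()

        private.close()
        shared.close()
        return shared_view, private_view, on_disk
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(cow_demo())
```

The read-only view and the file itself should still hold the original
bytes, while only the private mapping shows the modification. Whether
the untouched pages really are held once in physical RAM depends on
the OS's VM implementation; the script can't show that directly.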