From owner-freebsd-current@FreeBSD.ORG Tue Jan  8 18:59:09 2008
Date: Wed, 09 Jan 2008 00:58:56 +0600
From: "Vadim Goncharov" <vadim_nuclight@mail.ru>
Organization: AVTF TPU Hostel
To: "Robert Watson"
Cc: freebsd-current@freebsd.org
In-Reply-To: <20080107233157.N64281@fledge.watson.org>
Subject: Re: When will ZFS become stable?
08.01.08 @ 05:39 Robert Watson wrote:

> On Tue, 8 Jan 2008, Vadim Goncharov wrote:
>
>>> To make life slightly more complicated, small malloc allocations are
>>> actually implemented using uma -- there are a small number of small
>>> object size zones reserved for this purpose, and malloc just rounds up
>>> to the next such bucket size and allocates from that bucket. For
>>> larger sizes, malloc goes through uma, but pretty much directly to VM,
>>> which makes pages available directly. So when you look at "vmstat -z"
>>> output, be aware that some of the information presented there (zones
>>> named things like "128", "256", etc.) are actually the pools from which
>>> malloc allocations come, so there's double-counting.
>>
>> Yes, I knew that, but didn't know what exactly the column names mean.
>> Requests/Failures, I guess, are pure statistics, and Size is the size
>> of one element, but why is USED + FREE != LIMIT (on those where the
>> limit is non-zero)?
>
> Possibly we should rename the "FREE" column to "CACHE" -- the free count
> is the number of items in the UMA cache. These may be hung in buckets
> off the per-CPU cache, or be spare buckets in the zone. Either way, the
> memory has to be reclaimed before it can be used for other purposes, and
> generally for complex objects, it can be allocated much more quickly
> than going back to VM for more memory. LIMIT is an administrative limit
> that may be configured on the zone, and is configured for some but not
> all zones.

And can every unlimited zone, after growing on demand, cause kmem_map/kmem_size panics, while some others will cause low-memory panics with a message about another map?
> I'll let someone with a bit more VM experience follow up with more
> information about how the various maps and submaps relate to each other.

That would be good, as I still don't have any idea about the exact meaning of those sysctls :-) Thanks for the explanations, though. How is our Mr. VM nowadays?..

>>> (which can be swapped out under heavy memory load), pipe buffers, and
>>> general cached data for the buffer cache / file system, which will be
>>> paged out or discarded when memory pressure goes up.
>>
>> Umm. I think there is no point in swapping out disk cache which can be
>> discarded, so the most significant part of kernel memory which is
>> swappable is anonymous pipe(2) buffers?
>
> Yes, that's what I meant. There are some other types of pageable kernel
> memory, such as memory used for swap-backed md devices.

Hmm, I do remember messages about panics with malloc-backed md devices (with workaround advice to switch to swap-backed md), yes...

-- 
WBR, Vadim Goncharov
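For reference, the swap-backed variant mentioned above is created with mdconfig(8) roughly as follows. This is a FreeBSD-only admin sketch; the device name and mountpoint are examples, and the exact unit printed depends on the system.

```shell
# Sketch: a swap-backed md(4) device keeps its contents in pageable
# memory, so it can be pushed out to swap under memory pressure, unlike
# the malloc-backed variant (-t malloc), which pins kernel memory.
unit=$(mdconfig -a -t swap -s 64m)  # prints the new device name, e.g. "md0"
newfs "/dev/${unit}"                # put a filesystem on it
mount "/dev/${unit}" /mnt           # /mnt is an example mountpoint
# ... use it ...
umount /mnt
mdconfig -d -u "${unit}"            # destroy the device when done
```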