From: "Robert N. M. Watson" <rwatson@freebsd.org>
Date: Sun, 19 Sep 2010 12:41:16 +0100
To: Andriy Gapon
Cc: Andre Oppermann, Jeff Roberson, freebsd-hackers@freebsd.org
Subject: Re: zfs + uma

On 19 Sep 2010, at 09:21, Andriy Gapon wrote:

>> I believe the combination of these approaches would significantly solve the
>> problem and should be relatively little new code. It should also preserve the
>> adaptable nature of the system without penalizing resource-heavy systems. I
>> would be happy to review patches from anyone who wishes to undertake it.
>
> FWIW, the approach of simply limiting the maximum bucket size based on item
> size seems to work rather well too, as my testing with zfs+uma shows.
> I will also try to add code to completely bypass the per-CPU cache for
> "really huge" items.

This is basically what malloc(9) does already: for small items, it allocates from a series of fixed-size buckets (which could probably use tuning), but maintains its own stats with respect to the types it maps into those buckets. This is why there's double-counting between vmstat -z and vmstat -m: the former shows the buckets used to allocate the latter.

For large items, malloc(9) goes through UMA, but it's basically a pass-through to VM, which provides pages directly. This means that for small malloc types you get per-CPU caches, and for large malloc types you don't.

malloc(9) doesn't require fixed-size allocations, but it also can't provide the ctor/dtor partial tear-down caching, nor different effective working sets of memory for different types.

UMA should really only be used directly for memory types where tight packing, per-CPU caching, and possibly partial tear-down have benefits. mbufs are a great example, because we allocate tons and tons of them continuously in operation. Using UMA directly for more stable types allocated in smaller quantities makes very little sense, since we waste lots of memory overhead allocating buckets that won't be used, etc.

Robert
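
To make the small/large split above concrete, here is a minimal C sketch of the routing being described. It is an illustration only: SMALL_MAXSIZE, size2zone(), large_page_alloc() and charge_malloc_type() are hypothetical stand-ins, not the identifiers actually used in sys/kern/kern_malloc.c, which differs in detail.

    #include <sys/param.h>
    #include <sys/malloc.h>
    #include <vm/uma.h>

    /*
     * Hypothetical sketch of malloc(9) routing: small requests are
     * rounded up into fixed-size UMA bucket zones (and so get per-CPU
     * caching); large requests bypass the caches and take pages
     * straight from VM.
     */
    void *
    sketch_malloc(size_t size, struct malloc_type *mtp, int flags)
    {
            void *va;

            if (size <= SMALL_MAXSIZE) {
                    /* Round up to the nearest fixed-size bucket zone. */
                    uma_zone_t zone = size2zone(size);

                    va = uma_zalloc(zone, flags);
            } else {
                    /* Pass-through to VM: whole pages, no per-CPU cache. */
                    va = large_page_alloc(round_page(size), flags);
            }

            /*
             * Accounting is charged to the malloc type rather than the
             * bucket, which is the double-counting mentioned above:
             * vmstat -z shows the bucket zones used to satisfy the
             * types that vmstat -m reports.
             */
            if (va != NULL)
                    charge_malloc_type(mtp, size);
            return (va);
    }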
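
And a sketch of the kind of type for which direct UMA use pays off, per the last paragraph: a small, fixed-size, frequently-allocated object. The "frob" type is invented for illustration, but the uma_zcreate()/uma_zalloc() calls are the real UMA KPI. The init/fini pair is one way to get the partial tear-down caching mentioned above: init and fini run only when items move between the zone and VM, not on every allocation, so the mutex stays initialized while items sit in the per-CPU caches.

    #include <sys/param.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>
    #include <vm/uma.h>

    /* A hypothetical hot-path object, mbuf-like in its allocation pattern. */
    struct frob {
            struct mtx      f_lock;
            int             f_refcount;
    };

    static uma_zone_t frob_zone;

    /* Runs once when an item is imported into the zone from VM. */
    static int
    frob_init(void *mem, int size, int flags)
    {
            struct frob *f = mem;

            mtx_init(&f->f_lock, "frob", NULL, MTX_DEF);
            f->f_refcount = 0;
            return (0);
    }

    /* Runs only when the item is released back to VM. */
    static void
    frob_fini(void *mem, int size)
    {
            struct frob *f = mem;

            mtx_destroy(&f->f_lock);
    }

    static void
    frob_zone_setup(void)
    {
            frob_zone = uma_zcreate("frob", sizeof(struct frob),
                NULL, NULL,             /* no per-allocation ctor/dtor */
                frob_init, frob_fini,   /* once per trip to/from VM */
                UMA_ALIGN_PTR, 0);
    }

Allocation is then uma_zalloc(frob_zone, M_WAITOK) and uma_zfree(frob_zone, f): items are packed tightly into slabs and cached per CPU, which is exactly the behaviour that is wasted overhead for stable, rarely-allocated types.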