From: Paul Saab <paul.m.saab@gmail.com>
To: Jeff Roberson
Cc: arch@freebsd.org
Date: Tue, 16 Dec 2008 23:49:50 -0800
Subject: Re: UMA & mbuf cache utilization

So far, testing has shown that this doesn't hurt performance at all in a pure transmit test.

On Tue, Dec 9, 2008 at 6:22 PM, Jeff Roberson wrote:

> Hello,
>
> Nokia has graciously allowed me to release a patch which I developed to
> improve general mbuf and cluster cache behavior. It is based on others'
> observations that, due to simple alignment at 2k and 256 bytes, we achieve
> a poor cache distribution for the header area of packets and for the most
> heavily used mbuf header fields. In addition, modern machines stripe
> memory accesses across several memories and even memory controllers.
> Accessing heavily aligned locations such as these can also create load
> imbalances among memories.
>
> To solve this problem I have added two new features to UMA. The first is
> the zone flag UMA_ZONE_CACHESPREAD. This flag modifies the meaning of the
> alignment field such that start addresses are staggered by at least
> align + 1 bytes. For clusters and mbufs this means adding
> uma_cache_align + 1 bytes to the amount of storage allocated. This creates
> a constant amount of waste: 3% and 12% respectively. It also means we must
> use contiguous physical and virtual memory spanning several pages in order
> to use the memory efficiently and land on as many cache lines as possible.
>
> Because contiguous physical memory is not always available, the allocator
> needs a fallback mechanism.
> We don't simply want all mbuf allocations to check two zones, because once
> we deplete the available contiguous memory, the check on the first zone
> will always fail via the most expensive code path.
>
> To resolve this issue, I added the ability for secondary zones to stack on
> top of multiple primary zones. Secondary zones are zones which get their
> storage from another zone but handle their own caching, ctors, dtors, etc.
> With this feature, a secondary zone can be created that allocates either
> from the contiguous memory pool or from the non-contiguous single-page
> pool, depending on availability. It is also much faster to fail over
> between them deep in the allocator, because that is only required when we
> exhaust the already-available mbuf memory.
>
> For mbufs and clusters there are now three zones each: a
> contigmalloc-backed zone, a single-page allocator zone, and a secondary
> zone with the original zone_mbuf or zone_clust name. The packet zone also
> takes from both available mbuf zones. The individual backend zones are not
> exposed outside of kern_mbuf.c.
>
> Currently, each backend zone can have its own limit. The secondary zone
> only blocks when both are full. Statistics-wise, the limit should be
> reported as the sum of the backend limits; however, that isn't presently
> done. The secondary zone cannot have its own limit independent of the
> backends at this time. I'm not sure whether that's valuable.
>
> I have test results from Nokia which show a dramatic improvement in
> several workloads, but which I am probably not at liberty to discuss. I'm
> in the process of convincing Kip to help me get some benchmark data on our
> stack.
>
> Also, as part of the patch I renamed a few functions, since many were
> non-obvious, and grew new keg abstractions to tidy things up a bit. I
> suspect those of you with UMA experience (Robert, Bosko) will find the
> renaming a welcome improvement.
>
> The patch is available at:
> http://people.freebsd.org/~jeff/mbuf_contig.diff
>
> I would love to hear any feedback you may have. I have been developing
> and testing various versions of this off and on for months; however, this
> is a fresh port to current, and it is a little green, so it should be
> considered experimental.
>
> In particular, I'm most nervous about how the VM will respond to new
> pressure on contiguous physical pages. I'm also interested in hearing from
> embedded/limited-memory people about how we might want to limit or tune
> this.
>
> Thanks,
> Jeff