From: Alan Cox
Reply-To: alc@freebsd.org
To: Davide Italiano
Cc: freebsd-hackers@freebsd.org
Date: Sat, 23 Jul 2011 12:26:11 -0500
Subject: Re: UMA large allocations issues

On Fri, Jul 22, 2011 at 9:07 PM, Davide Italiano wrote:

> Hi.
> I'm a student, and some time ago I started investigating the
> performance/fragmentation issues of large allocations within the
> UMA allocator.
> Benchmarks showed that these performance problems are mainly due to
> the fact that every call to uma_large_malloc() results in a call to
> kmem_malloc(), and this behaviour is really inefficient.
>
> I started doing some work. Here's what I have so far:
> As a first step, I tried to define larger zones and let UMA do it all.
> UMA can allocate slabs of more than one page, so I tried to define
> zones of 1, 2, 4, and 8 pages, moving ZMEM_KMAX up.
> I tested the solution with raidtest. Here are some numbers.
>
> Here's the workload characterization:
>
> set mediasize=`diskinfo /dev/zvol/tank/vol | awk '{print $3}'`
> set sectorsize=`diskinfo /dev/zvol/tank/vol | awk '{print $2}'`
> raidtest genfile -s $mediasize -S $sectorsize -n 50000
>
> # $mediasize = 10737418240
> # $sectorsize = 512
>
> Number of READ requests: 24924
> Number of WRITE requests: 25076
> Number of bytes to transmit: 3305292800
>
> raidtest test -d /dev/zvol/tank/vol -n 4
> ## tested using 4 cores, 1.5 GB RAM
>
> Results:
> Number of processes: 4
> Bytes per second: 10146896
> Requests per second: 153
>
> Results (4 * PAGE_SIZE):
> Number of processes: 4
> Bytes per second: 14793969
> Requests per second: 223
>
> Results (8 * PAGE_SIZE):
> Number of processes: 4
> Bytes per second: 6855779
> Requests per second: 103
>
> These tests show that defining larger zones helps only as long as the
> zones are not too big: past a certain size, performance decreases
> significantly.
>
> As a second step, alc@ proposed to create a new layer that sits
> between UMA and the VM subsystem. This layer can manage a pool of
> chunks that can be used to satisfy requests from uma_large_malloc(),
> avoiding the overhead due to kmem_malloc() calls.
>
> I've recently started developing a patch (not yet fully working) that
> implements this layer. First of all, I'd like to concentrate my
> attention on the performance problem rather than the fragmentation
> one.
> So the patch that I actually started to write doesn't care about
> fragmentation aspects.
>
> http://davit.altervista.org/uma_large_allocations.patch
>
> There are some questions to which I wasn't able to answer (for
> example, when it's safe to call kmem_malloc() to request memory).

In this context, there is really only one restriction. Your
page_alloc_new() should never call kmem_malloc() with M_WAITOK if your
bitmap_mtx lock is held. It may only call kmem_malloc() with M_NOWAIT
if your bitmap_mtx lock is held. That said, I would try to structure
the code so that you're not doing any kmem_malloc() calls with the
bitmap_mtx lock held.

> So, at the end of the day, I'm asking for your opinion about this
> issue, and I'm looking for a "mentor" (some kind of guidance) to
> continue this project. If someone is interested in helping, it would
> be much appreciated.

I will take a closer look at your patch later today and send you
comments.

Alan