From owner-freebsd-hackers@FreeBSD.ORG Thu Jun 30 03:38:00 2005
Date: Wed, 29 Jun 2005 20:37:17 -0700 (PDT)
From: Jeff Roberson
To: Robert Watson
Cc: jeff@FreeBSD.org, hackers@FreeBSD.org, bmilekic@FreeBSD.org, ant
Subject: Re: hot path optimizations in uma_zalloc() & uma_zfree()

On Wed, 29 Jun 2005, Robert Watson wrote:

> On Thu, 30 Jun 2005, ant wrote:
>
>> I just tried to make the bucket management in the per-CPU cache work the
>> way Solaris does it (see Jeff Bonwick's paper "Magazines and Vmem") and
>> got a performance gain of around 10% in my test program.  Then I made
>> another minor code optimization and got another 10%.  The program just
>> creates and destroys sockets in a loop.
>
> This sounds great -- I'm off to bed now (.uk time and all), but will run
> some benchmarks locally tomorrow.  I've recently started investigating
> using the PMC support in 6.x to look at cache behavior in the
> network-related fast paths, but haven't gotten too far as yet.
>
> Thanks,
>
> Robert N M Watson

Do you keep two buckets still?  If so, this is something that I've always
intended to do, but never got around to.  I'm glad someone has taken the
initiative.  I will review the patch shortly.

>> I suppose the reason for the first gain is an increase in CPU cache hits.
>> In the current FreeBSD code, allocations and frees deal with separate
>> buckets, and the buckets are swapped when one of them becomes full or
>> empty.  In Solaris this work is pure LIFO: alloc() and free() operate on
>> a single bucket, the current one (called a magazine there), which is why
>> the cache hit rate is higher.
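To make the contrast concrete, here is a minimal sketch of the two schemes.
The names (struct bucket, cache_two, cache_lifo, lifo_alloc, lifo_free) are
made up for illustration, so this is not the uma_core.c code; locking,
refill from the zone, and the bucket exchange are left out.

#include <stddef.h>

/* Hypothetical simplified types; not the real uma(9) structures. */
struct bucket {
	int	cnt;		/* number of cached items */
	int	entries;	/* capacity of item[] */
	void	*item[128];
};

/*
 * Current FreeBSD scheme: two buckets per CPU.  uma_zalloc() pops from
 * allocbucket, uma_zfree() pushes onto freebucket, and the two are
 * exchanged when one runs dry or fills up.
 */
struct cache_two {
	struct bucket *allocbucket;
	struct bucket *freebucket;
};

/*
 * Solaris-style scheme: one current bucket (a "magazine") used by both
 * operations, so the per-CPU cache behaves as a pure LIFO.
 */
struct cache_lifo {
	struct bucket *curbucket;
};

static void *
lifo_alloc(struct cache_lifo *c)
{
	struct bucket *b = c->curbucket;

	if (b != NULL && b->cnt > 0)
		return (b->item[--b->cnt]);	/* pop most recently freed item */
	return (NULL);				/* would fall back to the zone */
}

static void
lifo_free(struct cache_lifo *c, void *item)
{
	struct bucket *b = c->curbucket;

	if (b != NULL && b->cnt < b->entries)
		b->item[b->cnt++] = item;	/* push onto the same bucket */
	/* else: would swap in the spare bucket or return this one to the zone */
}

The item handed out by the next lifo_alloc() is the one most recently passed
to lifo_free(), so it is likely still warm in the CPU's data cache; with two
buckets a freed item sits in the free bucket and is usually cold again by the
time it is allocated.
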
>>
>> Another optimization is very trivial, for example:
>> -		bucket->ub_cnt--;
>> -		item = bucket->ub_bucket[bucket->ub_cnt];
>> +		item = bucket->ub_bucket[--bucket->ub_cnt];
>> (see the patch)
>>
>> The test program:
>>
>> #include <sys/types.h>
>> #include <sys/socket.h>
>> #include <stdlib.h>
>> #include <unistd.h>
>>
>> int
>> main(int argc, char *argv[])
>> {
>> 	int *fd, n, i, j, iters = 100000;
>>
>> 	n = atoi(argv[1]);
>> 	fd = (int *)malloc(sizeof(*fd) * n);
>>
>> 	iters /= n;
>> 	for (i = 0; i < iters; i++) {
>> 		for (j = 0; j < n; j++)
>> 			fd[j] = socket(AF_UNIX, SOCK_STREAM, 0);
>> 		for (j = 0; j < n; j++)
>> 			close(fd[j]);
>> 	}
>> 	return (0);
>> }
>>
>> The results with the current uma_core.c:
>>
>>> time ./sockloop 1       # first arg is the number of sockets created per iteration
>> 0.093u 2.650s 0:02.75 99.6%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 1
>> 0.108u 2.298s 0:02.41 99.1%   5+181k 0+0io 0pf+0w
>>> time ./sockloop 1
>> 0.127u 2.278s 0:02.41 99.1%   5+177k 0+0io 0pf+0w
>>> time ./sockloop 10      # the number of iterations is scaled by the arg (see code)
>> 0.054u 2.239s 0:02.30 99.1%   5+181k 0+0io 0pf+0w
>>> time ./sockloop 10
>> 0.069u 2.199s 0:02.27 99.1%   6+184k 0+0io 0pf+0w
>>> time ./sockloop 10
>> 0.086u 2.185s 0:02.28 99.1%   5+178k 0+0io 0pf+0w
>>> time ./sockloop 100
>> 0.101u 2.393s 0:02.51 99.2%   5+179k 0+0io 0pf+0w
>>> time ./sockloop 100
>> 0.085u 2.505s 0:02.60 99.2%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 100
>> 0.054u 2.441s 0:02.50 99.6%   5+178k 0+0io 0pf+0w
>>> time ./sockloop 1000
>> 0.093u 2.739s 0:02.84 99.2%   5+181k 0+0io 0pf+0w
>>> time ./sockloop 1000
>> 0.085u 2.797s 0:02.89 99.3%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 1000
>> 0.117u 2.689s 0:02.82 98.9%   5+179k 0+0io 0pf+0w
>>
>> The results with only the first optimization (bucket management):
>>
>>> time ./sockloop 1
>> 0.125u 1.938s 0:02.06 99.5%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 1
>> 0.070u 1.993s 0:02.06 100.0%  5+180k 0+0io 0pf+0w
>>> time ./sockloop 1
>> 0.110u 1.953s 0:02.06 100.0%  5+177k 0+0io 0pf+0w
>>> time ./sockloop 10
>> 0.093u 1.776s 0:01.87 99.4%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 10
>> 0.116u 1.754s 0:01.87 99.4%   5+181k 0+0io 0pf+0w
>>> time ./sockloop 10
>> 0.093u 1.777s 0:01.87 99.4%   5+181k 0+0io 0pf+0w
>>> time ./sockloop 100
>> 0.100u 2.182s 0:02.29 99.5%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 100
>> 0.093u 2.174s 0:02.27 99.5%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 100
>> 0.078u 2.158s 0:02.24 99.1%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 1000
>> 0.101u 2.403s 0:02.51 99.6%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 1000
>> 0.124u 2.381s 0:02.52 99.2%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 1000
>> 0.125u 2.373s 0:02.51 99.2%   5+178k 0+0io 0pf+0w
>>
>> The results with both optimizations:
>>
>>> time ./sockloop 1
>> 0.062u 1.785s 0:01.85 99.4%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 1
>> 0.124u 1.722s 0:01.85 99.4%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 1
>> 0.087u 1.759s 0:01.85 98.9%   5+177k 0+0io 0pf+0w
>>> time ./sockloop 10
>> 0.069u 1.684s 0:01.75 99.4%   5+181k 0+0io 0pf+0w
>>> time ./sockloop 10
>> 0.070u 1.673s 0:01.74 100.0%  5+180k 0+0io 0pf+0w
>>> time ./sockloop 10
>> 0.070u 1.672s 0:01.74 100.0%  5+177k 0+0io 0pf+0w
>>> time ./sockloop 100
>> 0.077u 2.102s 0:02.18 99.5%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 100
>> 0.116u 2.062s 0:02.18 99.5%   5+180k 0+0io 0pf+0w
>>> time ./sockloop 100
>> 0.055u 2.126s 0:02.19 99.0%   5+178k 0+0io 0pf+0w
>>> time ./sockloop 1000
>> 0.077u 2.298s 0:02.39 98.7%   5+181k 0+0io 0pf+0w
>>> time ./sockloop 1000
>> 0.070u 2.340s 0:02.42 99.5%   5+178k 0+0io 0pf+0w
>>> time ./sockloop 1000
>> 0.054u 2.320s 0:02.39 99.1%   5+179k 0+0io 0pf+0w
>>
>> The patch is against uma_core.c from
>> RELENG_5, but I also checked uma_core.c in CURRENT and it is the same with
>> respect to these changes.  I don't have commit rights, so the patch is
>> posted here just for review.  Here it is:
>>
>> --- sys/vm/uma_core.c.orig	Wed Jun 29 21:46:52 2005
>> +++ sys/vm/uma_core.c	Wed Jun 29 23:09:32 2005
>> @@ -1830,8 +1830,7 @@
>>  
>>  	if (bucket) {
>>  		if (bucket->ub_cnt > 0) {
>> -			bucket->ub_cnt--;
>> -			item = bucket->ub_bucket[bucket->ub_cnt];
>> +			item = bucket->ub_bucket[--bucket->ub_cnt];
>>  #ifdef INVARIANTS
>>  			bucket->ub_bucket[bucket->ub_cnt] = NULL;
>>  #endif
>> @@ -2252,7 +2251,7 @@
>>  	cache = &zone->uz_cpu[cpu];
>>  
>>  zfree_start:
>> -	bucket = cache->uc_freebucket;
>> +	bucket = cache->uc_allocbucket;
>>  
>>  	if (bucket) {
>>  		/*
>> @@ -2263,8 +2262,7 @@
>>  		if (bucket->ub_cnt < bucket->ub_entries) {
>>  			KASSERT(bucket->ub_bucket[bucket->ub_cnt] == NULL,
>>  			    ("uma_zfree: Freeing to non free bucket index."));
>> -			bucket->ub_bucket[bucket->ub_cnt] = item;
>> -			bucket->ub_cnt++;
>> +			bucket->ub_bucket[bucket->ub_cnt++] = item;
>>  #ifdef INVARIANTS
>>  			ZONE_LOCK(zone);
>>  			if (keg->uk_flags & UMA_ZONE_MALLOC)
>> @@ -2275,7 +2273,7 @@
>>  #endif
>>  			CPU_UNLOCK(cpu);
>>  			return;
>> -		} else if (cache->uc_allocbucket) {
>> +		} else if (cache->uc_freebucket) {
>>  #ifdef UMA_DEBUG_ALLOC
>>  			printf("uma_zfree: Swapping buckets.\n");
>>  #endif
>> @@ -2283,8 +2281,7 @@
>>  			 * We have run out of space in our freebucket.
>>  			 * See if we can switch with our alloc bucket.
>>  			 */
>> -			if (cache->uc_allocbucket->ub_cnt <
>> -			    cache->uc_freebucket->ub_cnt) {
>> +			if (cache->uc_freebucket->ub_cnt == 0) {
>>  				bucket = cache->uc_freebucket;
>>  				cache->uc_freebucket = cache->uc_allocbucket;
>>  				cache->uc_allocbucket = bucket;
>>
>> If the first optimization (the bucket management change) is committed, some
>> adjustments will also be needed to keep the statistics gathering correct.
>>
>> Regards,
>> Andriy Tkachuk.
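
As a reading aid for the patch above, here is a rough stand-alone paraphrase
of what the patched free fast path does once uc_allocbucket serves as the
current LIFO bucket.  The simplified types and the name cache_free are
hypothetical, and locking, statistics and the real slow path are omitted, so
this is not a drop-in for uma_core.c:

#include <stddef.h>

struct bucket {
	int	cnt;		/* number of cached items */
	int	entries;	/* capacity of item[] */
	void	*item[128];
};

struct cache {
	struct bucket *uc_allocbucket;	/* the current (LIFO) bucket */
	struct bucket *uc_freebucket;	/* the spare bucket */
};

/* Returns 1 if the item was cached, 0 if the caller must take the slow path. */
static int
cache_free(struct cache *cache, void *item)
{
	struct bucket *bucket, *spare;

	bucket = cache->uc_allocbucket;		/* was uc_freebucket before the patch */
	if (bucket == NULL)
		return (0);
	if (bucket->cnt < bucket->entries) {
		bucket->item[bucket->cnt++] = item;	/* plain LIFO push */
		return (1);
	}
	/* The current bucket is full: switch to the spare only if it is empty. */
	spare = cache->uc_freebucket;
	if (spare != NULL && spare->cnt == 0) {
		cache->uc_freebucket = bucket;		/* stash the full bucket */
		cache->uc_allocbucket = spare;		/* empty spare becomes current */
		spare->item[spare->cnt++] = item;	/* retry, as the goto zfree_start does */
		return (1);
	}
	return (0);				/* both buckets full: take the slow path */
}

The effect is that frees keep landing in the same bucket allocations are
served from, and the spare is brought in only once it is completely empty,
which matches the magazine behavior described earlier in the thread.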