From owner-freebsd-performance@FreeBSD.ORG  Sat Jan 29 12:54:17 2011
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 38200106566B;
	Sat, 29 Jan 2011 12:54:17 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail09.syd.optusnet.com.au (mail09.syd.optusnet.com.au
	[211.29.132.190])
	by mx1.freebsd.org (Postfix) with ESMTP id C5BE28FC13;
	Sat, 29 Jan 2011 12:54:16 +0000 (UTC)
Received: from c122-106-165-206.carlnfd1.nsw.optusnet.com.au
	(c122-106-165-206.carlnfd1.nsw.optusnet.com.au [122.106.165.206])
	by mail09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	p0TCsBP5014051
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 29 Jan 2011 23:54:13 +1100
Date: Sat, 29 Jan 2011 23:54:11 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Slawa Olhovchenkov <slw@zxy.spb.ru>
In-Reply-To: <20110129102420.GK18170@zxy.spb.ru>
Message-ID: <20110129233542.O20731@besplex.bde.org>
References: <20110128143355.GD18170@zxy.spb.ru>
	<22E77EED-6455-4164-9115-BBD359EC8CA6@moneybookers.com>
	<20110128161035.GF18170@zxy.spb.ru>
	<CDBFAB7F-1EBC-4B3A-B2F5-6162DD58A93D@moneybookers.com>
	<4D42F87C.7020909@freebsd.org> <20110128172516.GG18170@zxy.spb.ru>
	<20110129070205.Q7034@besplex.bde.org>
	<20110128215215.GJ18170@zxy.spb.ru>
	<20110129133859.O967@besplex.bde.org>
	<20110129102420.GK18170@zxy.spb.ru>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-performance@FreeBSD.org, Julian Elischer <julian@FreeBSD.org>,
	Bruce Evans <brde@optusnet.com.au>,
	Stefan Lambrev <stefan.lambrev@moneybookers.com>
Subject: Re: Interrupt performance

On Sat, 29 Jan 2011, Slawa Olhovchenkov wrote:

> On Sat, Jan 29, 2011 at 02:43:07PM +1100, Bruce Evans wrote:
>
>> On Sat, 29 Jan 2011, Slawa Olhovchenkov wrote:
>>
>>> On Sat, Jan 29, 2011 at 07:52:11AM +1100, Bruce Evans wrote:
>>>>
>>>> To see how much CPU is actually available, run something else and see how
>>>> fast it runs.  A simple counting loop works well on UP systems.
>>>
>>> ===
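>>> #include <stdio.h>
>>> #include <stdlib.h>	/* atol() */
>>> #include <sys/time.h>
>>>
>>> /* Global counter; build without optimization so the loop is kept. */
>>> int Dummy;
>>>
>>> int
>>> main(int argc, char *argv[])
>>> {
>>> long int count,i,dt;
>>> struct timeval st,et;
>>>
>>> if (argc < 2) return 1;	/* iteration count is required */
>>> count = atol(argv[1]);
>>>
>>> /* Time a pure userland counting loop (no syscalls inside it). */
>>> gettimeofday(&st, NULL);
>>> for(i=count;i;i--) Dummy++;
>>> gettimeofday(&et, NULL);
>>> dt = (et.tv_sec-st.tv_sec)*1000000 + et.tv_usec-st.tv_usec;
>>> printf("Elapsed %ld us\n",dt);
>>> return 0;
>>> }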
>>> ===
>>>
>>> Is this OK?
>>
>> It's better not to compete with the interrupt handler in the kernel by
>> spinning on syscalls, but that will do for a start.
>
> In this program the inner loop doesn't contain any syscalls.
> Would a better variant be a loop with syscalls?

Oops.  It is already what I meant.  You could try it with syscalls and/or
heavy memory accesses to see if there is a problem with memory resource
contention (maybe more cache misses).
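
A rough sketch of what such variants might look like follows.  The
function names, the buffer/stride scheme and the choice of getppid()
as the syscall are all assumptions for illustration, not anything
that was actually run in this thread.  The idea is only to replace
the body of the counting loop with either strided memory accesses or
a cheap syscall and to time it in the same way as before.

```c
#include <stddef.h>
#include <sys/time.h>
#include <unistd.h>

/* Microseconds between two gettimeofday() samples. */
long
elapsed_us(const struct timeval *st, const struct timeval *et)
{
	return ((et->tv_sec - st->tv_sec) * 1000000L +
	    (et->tv_usec - st->tv_usec));
}

/*
 * Memory-heavy variant (hypothetical): step through buf with the given
 * stride.  With len much larger than the cache and a stride of at least
 * one cache line, most accesses should miss.  len must be nonzero.
 * Returns a checksum so that the work cannot be discarded.
 */
unsigned long
touch_memory(volatile unsigned char *buf, size_t len, size_t stride,
    long count)
{
	unsigned long sum = 0;
	size_t pos = 0;
	long i;

	for (i = 0; i < count; i++) {
		buf[pos]++;
		sum += buf[pos];
		pos = (pos + stride) % len;
	}
	return (sum);
}

/*
 * Syscall-heavy variant (hypothetical): one cheap syscall per iteration.
 * Returns the number of calls that returned a valid pid.
 */
long
spin_syscalls(long count)
{
	long i, n = 0;

	for (i = 0; i < count; i++)
		if (getppid() >= 0)
			n++;
	return (n);
}
```

To use these, take gettimeofday() samples around a call such as
touch_memory(buf, len, 64, count) or spin_syscalls(count) and print
elapsed_us(&st, &et), then compare the netperf and no-netperf cases
as for the plain loop.  A large difference between the variants would
hint at memory contention rather than pure CPU contention.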

>>> ./loop 2000000000
>>>
>>> FreeBSD
>>> 1 process: Elapsed 7554193 us
>>> 2 process: Elapsed 14493692 us
>>> netperf + 1 process: Elapsed 21403644 us
>>
>> This shows about 35% user 65% network.
>>
>>> Linux
>>> 1 process: Elapsed 7524843 us
>>> 2 process: Elapsed 14995866 us
>>> netperf + 1 process: Elapsed 14107670 us
>>
>> This shows about 53% user 47% network.
>>
>> So FreeBSD has about 18% more network overhead (absolute: 65-47), or
>> about 38% more network overhead (relative: (65-47)/47).  Not too
>> surprising -- the context switches alone might cost that.
>
> For only 14K vs 56K interrupts.  152% more network overhead per interrupt.

No, FreeBSD does 4 times as much work per interrupt.  4 times as much
(300% more) "overhead" per interrupt is to be expected, since most (hopefully
more than half :-) of the "overhead" is actual work.
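
For reference, the user/network percentages quoted earlier in this
message come straight from the loop timings: the loop's share of the
CPU is its run time alone divided by its run time under netperf, and
the rest is charged to the network.  A small sketch of that
arithmetic follows; the function names are made up for illustration.

```c
#include <stdio.h>

/*
 * Fraction of the CPU that the counting loop received while competing
 * with network load: its run time alone over its run time under load.
 */
double
user_share(long alone_us, long loaded_us)
{
	return ((double)alone_us / (double)loaded_us);
}

/* Relative increase of network share net_a over network share net_b. */
double
relative_overhead(double net_a, double net_b)
{
	return ((net_a - net_b) / net_b);
}

/* Print the user/network split for one system. */
void
print_split(const char *name, long alone_us, long loaded_us)
{
	double user = user_share(alone_us, loaded_us);

	printf("%s: %.0f%% user, %.0f%% network\n", name,
	    100.0 * user, 100.0 * (1.0 - user));
}
```

With the timings above, print_split("FreeBSD", 7554193, 21403644)
gives 35% user, 65% network, and print_split("Linux", 7524843,
14107670) gives 53% user, 47% network.  relative_overhead() on the
two network shares gives about 0.39 unrounded, or 38% when computed
from the rounded 65 and 47.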

> And I see dramatically fewer context switches in the Linux stats
> (by dstat).

FreeBSD uses ithreads for most interrupts, so of course it does many
more context switches (at least 2 per interrupt).  This doesn't make
much difference provided there are not too many.  I think the version
of re that you are using actually uses "fast" interrupts and a task
queue.  This also seems to be making little difference.  You get a
relatively lightweight "fast" interrupt followed by a
context switch to and from the task.  IIRC, your statistics showed
about twice as many context switches as interrupts, so the task queue
isn't doing much to reduce the "interrupt overhead" -- it just gives
context switches to the task instead of to an ithread.

>>> I think the next server will support PMC.
>>> Are reports from PMC still poor?
>>
>> It should be adequate, but I prefer my version of perfmon which can
>> count cache misses precisely for every function.  But without patches,
>> perfmon is even more broken than high resolution kernel profiling.
>
> Can I use your version of perfmon? How? I don't have experience with
> any kernel profiling.

Not the place to start.

>> [FILTER] means "fast".  re used to unconditionally use "fast" interrupts
>> and a task queue, which IMO is a bad way to program an interrupt
>> handler, but yongari@ recently overhauled re (again :-) so that it now
>> doesn't use fast interrupts by default for the MSI/MSIX case.  (BTW,
>
> Interesting, but I don't think this helps in this case -- old PCI
> chip, old CPU, old RAM, old motherboard -- all old. I'm not trying to get
> wirespeed gigabit performance from this old box; I'm trying to compare
> the relative performance of FreeBSD vs Linux (lately I have received a
> lot of feedback about poor network performance of FreeBSD vs Linux).

Old hardware will certainly amplify any overheads.  50% overhead becomes
100% if the system is 2 times slower...

Bruce