FreeBSD Mail Archives

Date:      Sat, 29 Jan 2011 00:52:15 +0300
From:      Slawa Olhovchenkov <slw@zxy.spb.ru>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        freebsd-performance@freebsd.org, Julian Elischer <julian@freebsd.org>, Stefan Lambrev <stefan.lambrev@moneybookers.com>
Subject:   Re: Interrupt performance
Message-ID:  <20110128215215.GJ18170@zxy.spb.ru>
In-Reply-To: <20110129070205.Q7034@besplex.bde.org>
References:  <20110128143355.GD18170@zxy.spb.ru> <22E77EED-6455-4164-9115-BBD359EC8CA6@moneybookers.com> <20110128161035.GF18170@zxy.spb.ru> <CDBFAB7F-1EBC-4B3A-B2F5-6162DD58A93D@moneybookers.com> <4D42F87C.7020909@freebsd.org> <20110128172516.GG18170@zxy.spb.ru> <20110129070205.Q7034@besplex.bde.org>



On Sat, Jan 29, 2011 at 07:52:11AM +1100, Bruce Evans wrote:

> >> there are of course several possible answers, including:
> >>
> >> 1/ Sometimes BSD and Linux report things differently. Linux may or may not
> >> account for the lowest level interrupt time the same as BSD
> >
> > But I see only 20% idle on FreeBSD and 80% idle on Linux.
> 
> The time must be counted somewhere, so when it is not properly accounted
> to packet handling, and nothing much else is running, it is accounted to
> idle.
> 
> To see how much CPU is actually available, run something else and see how
> fast it runs.  A simple counting loop works well on UP systems.

===
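#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

/* volatile so the compiler cannot optimize the counting loop away */
volatile int Dummy;

int
main(int argc, char *argv[])
{
	long count, i, dt;
	struct timeval st, et;

	if (argc < 2) {
		fprintf(stderr, "usage: %s count\n", argv[0]);
		return 1;
	}
	count = atol(argv[1]);

	gettimeofday(&st, NULL);
	for (i = count; i; i--)
		Dummy++;
	gettimeofday(&et, NULL);
	/* elapsed wall-clock time in microseconds */
	dt = (et.tv_sec - st.tv_sec) * 1000000L + (et.tv_usec - st.tv_usec);
	printf("Elapsed %ld us\n", dt);
	return 0;
}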
===

Is this OK?

./loop 2000000000

FreeBSD
1 process: Elapsed 7554193 us
2 process: Elapsed 14493692 us
netperf + 1 process: Elapsed 21403644 us

Linux
1 process: Elapsed 7524843 us
2 process: Elapsed 14995866 us
netperf + 1 process: Elapsed 14107670 us

> >> 2/ the BSD driver for that chip may be badly written, or may
> >> be doing more or different work for some reason
> >> 3/ the FreeBSD interrupt code may be misconfigured for that driver.
> >>
> >> or maybe combinations...
> 
> Possibly, but it's a low-end NIC and those normally take a lot of CPU.
> 128 kpps might take 20% of 1 3GHz CPU for even a high-end NIC on FreeBSD.
> Linux has generally lower overheads and should be expected to reduce this
> a bit, to perhaps as low as 15%, depending on how much of the overhead is
> due to the NIC.
> 
> >> there are profiling tools that you may decide to run.
> >
> > What tools can I use on amd64?
> >
> > I boot kernel configured with 'config -p'.
> > Most time in spinlock_exit and acpi_cpu_c1.
> 
> Normal profiling works poorly (I see you found my old mail about high
> resolution profiling).  Linux might be misreporting the overhead for

I think the next server will support PMC.
Will reports from PMC still be poor?

> exactly the same reasons that normal profiling works poorly:
> - the profiling clock frequency of ~1 KHz was adequate for 5 MHz machines
>    in 1998, but is now too slow.  Statistics clocks are even slower (128
>    Hz in FreeBSD, and possibly 100 Hz (?) jiffies in Linux).
> - the statistics clock might be too synchronized with other interrupts.
>    The above spinlock_exit and acpi_cpu_c1 times indicate that the
>    statistics clock almost always fires on exit from another spinlock
>    and/or inside ACPI, for waking up from idle for the latter.  Seeing
>    lots of exits from spinlocks may indicate that spinlocks are being
>    used too much.
> But FreeBSD will report interrupt times and system for non-fast-interrupts
> to an accuracy of about 1 microsecond, since it doesn't use the
> statistics clock much for this.  OTOH, for fast interrupts it is typical
> behaviour in FreeBSD and Linux to not see them at all from the statistics
> clock interrupt, since they mask all interrupts so they mask the
> statistics clock interrupt in particular.  In FreeBSD, lots of time
> apparently spent in spinlock_exit is a typical result of this, or at
> least similar things, since spinlock_enter masks all interrupts (except
> in my version of course).  Linux doesn't have fast interrupts in the
> same way that FreeBSD does, but at least in old versions almost all of
> its interrupts masked other interrupts a lot.

Linux kernel 2.6.26.

> >>> On Jan 28, 2011, at 6:10 PM, Slawa Olhovchenkov wrote:
> >>>
> >>>> On Fri, Jan 28, 2011 at 06:03:15PM +0200, Stefan Lambrev wrote:
> >>>>
> >>>>> Do the test with netblast ;)
> >>>>> Most perf tools are written badly and for Linux.
> >>>>> In our internal tests, netblast running on FreeBSD outperforms everything else.
> >>>> I'm not talking about bad performance.
> >>>> I'm talking about overhead.
> >>>>
> >>>> Linux: overhead 7% for 56K int/s
> >>>> FreeBSD: overhead 59% for 14K int/s
> >>>>
> >>>> To process 1/4 of the interrupts, FreeBSD needs 8x the CPU.
> 
> You showed context switches in another reply.  56k interrupts on FreeBSD
> would give at least 112k context switches taking several uSec each to do
> nothing except switch.  This would give an overhead in the 59% range.
> 14K is not so bad, but still too high unless you have a spare CPU or 32
> to handle it.  Part of the lowness of low-end NICs is that they tend to
> generate too many interrupts and don't have much or any way to control
> this.  Linux will certainly be able to handle 56K int/S better than
> FreeBSD since it doesn't have heavyweight interrupt threads AFAIK.
> FreeBSD also has "fast" interrupts, which are much like normal interrupts
> used to be in FreeBSD.  I don't know if your NIC driver uses these.  I

re0: [FILTER]

I think this is the answer ([FILTER]), but I don't understand it :).

> guess not, since if it did then it should move the "interrupt" processing
> to a task queue, where it would show up under another label and be reduced
> insignificantly.
> 
> >>>>> P.S. - /usr/src/tools/tools/netrate/netblast - we have tested slightly more expensive cards - em/igb and bce.
> 
> netblast should be able to saturate a low-end NIC, but may take 100% of 1
> CPU to do so (it has to busy-wait, since there is no way to select() on
> the NIC ring being unfull, and timeouts don't work either since their
> granularity is too large).  If the NIC activity alone saturates 1 CPU,
> then you might see the 100% CPU being shown for Linux too.

> >>>>>> re0:<RealTek 8169SC/8110SC Single-chip Gigabit Ethernet>  port 0x4000-0x40ff mem 0xf0100000-0xf01000ff irq 19 at device 4.0 on pci11
> >>>>>> re0: Chip rev. 0x18000000
> >>>>>> re0: MAC rev. 0x00000000
> >>>>>> miibus0:<MII bus>  on re0
> >>>>>> rgephy0:<RTL8169S/8110S/8211B media interface>  PHY 1 on miibus0
> 
> I don't really know if this is low-end, but guess all RealTeks are :-).

FreeBSD supports interrupt moderation on this chip, and the chip supports
TOE :)

> >>>>>> CPU: Intel(R) Celeron(R) CPU          420  @ 1.60GHz (1596.05-MHz K8-class CPU)
> 
> This is low end :-).
> 
> I mostly use old AthlonXP and Athlon64 2GHz systems for network testing.
> These are a bit faster than the above.  A single medium end bge (5701)
> on a PCI33 bus takes 100% CPU at about 512 kpps.  A single low end bge
> (5705+) on a PCI33 takes 120% CPU at about 240 kpps on a 2-core system.
> Linux-2.6.10 saturates well below 512 kpps on the same hardware.
> 
> Bruce


Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110128215215.GJ18170>
