From: lukem.freebsd@cse.unsw.edu.au
To: Robert Watson
cc: freebsd-performance@freebsd.org
Date: Mon, 25 Oct 2004 23:55:59 +1000 (EST)
Subject: Re: CPU utilisation cap?

On Mon, 25 Oct 2004, Robert Watson wrote:

> A couple of thoughts, none of which points at any particular red flag,
> but worth thinking about:
>
> - You indicate there are multiple if_em cards in the host -- can you
>   describe the network topology?  Are you using multiple cards, or just
>   one of the nicely equipped ones?  Is there a switch involved, or
>   direct back-to-back wires?

I have 4 Linux boxes (don't blame me!) generating UDP traffic through an
8-port HP ProCurve gigabit switch.  The software I am using is ipbench
(see ipbench.sourceforge.net for details).  At present I am only using one
NIC at a time, though I intend to measure routing performance soon, which
will eliminate the userspace component (and hopefully some possible
scheduler effects) and might shed more light on things.

> - Are the packet sources generating the packets synchronously or
>   asynchronously: i.e., when a packet source sends a UDP packet, does it
>   wait for the response before continuing, or keep on sending?  If
>   synchronously, are you sure that the wires are being kept busy?

It is not a ping-pong benchmark.  The traffic is generated continuously;
each packet is timestamped as it is sent, and the timestamp is compared
with the receipt time to get a round-trip time (I didn't bother including
the latency information in my post to freebsd-net).

> - Make sure your math on PCI bus bandwidth accounts for packets going in
>   both directions if you're actually echoing the packets.  Also make
>   sure to include the size of the Ethernet frame and any other headers.

The values I have quoted (550 Mbit/s etc.) are the throughput of what is
received back at the Linux boxes after echoing, so we can expect double
that on the PCI bus, plus overheads.

> - If you're using SCHED_ULE, be aware that its notion of "nice" is a
>   little different from the traditional UNIX notion, and attempts to
>   provide more proportional CPU allocation.  You might try switching to
>   SCHED_4BSD.  Note that there have been pretty large scheduler changes
>   in 5.3, with a number of the features that were previously specific to
>   SCHED_ULE being made available with SCHED_4BSD, and that a lot of
>   scheduling bugs have been fixed.  If you move to 5.3, make sure you
>   run with 4BSD, and it would be worth trying it with 5.2 to "see what
>   happens".

I very strongly agree that it sounds like a scheduling effect.  The 5.2.1
kernel (which is what I am using) is already built with SCHED_4BSD.  It
will be interesting to see whether the new scheduler makes a difference.

> - It would be worth trying the test without the soaker process but
>   instead a sampling process that polls the kernel's notion of CPU%
>   measurement every second.  That way, if it does turn out that ULE is
>   unnecessarily giving CPU cycles to the soaker, you can still measure
>   w/o "soaking".
>
> - What does your soaker do -- in particular, does it make system calls
>   to determine the time frequently?  If so, the synchronization
>   operations and scheduling cost associated with that may impact your
>   measurements.  If it just spins reading the TSC and outputting once in
>   a while, you should be OK WRT this point.

I will look into this.  I didn't write the code, so I'm not sure exactly
what it does.  From what I understand it uses a calibrated tight loop, so
it shouldn't need to make any syscalls while it is running, but I will
check anyway.  I have been considering implementing it with a cycle-count
register, but have avoided that so far for portability reasons.
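For what it's worth, the scheme I have in mind is roughly the following
(just an illustrative sketch -- not the actual soaker source, and the
chunk size and timings are made up):

/*
 * Sketch of a calibrated tight-loop soaker.  Calibrate chunks/second
 * with the machine otherwise idle, then count how many chunks fit into
 * each measurement window while the benchmark runs; the ratio
 * approximates the CPU left over for the soaker.
 */
#include <stdio.h>
#include <sys/time.h>

#define CHUNK	1000000UL	/* busy-loop iterations between time checks */

/* Pure busywork: no syscalls, just a volatile counter. */
static void
burn(void)
{
	volatile unsigned long i;

	for (i = 0; i < CHUNK; i++)
		continue;
}

/* How many CHUNK-sized slices fit into roughly 'secs' of wall clock?
 * Only one gettimeofday() per chunk, so syscall overhead stays small. */
static double
chunks_per_second(int secs)
{
	struct timeval t0, t1;
	unsigned long n = 0;
	double elapsed;

	gettimeofday(&t0, NULL);
	do {
		burn();
		n++;
		gettimeofday(&t1, NULL);
		elapsed = (t1.tv_sec - t0.tv_sec) +
		    (t1.tv_usec - t0.tv_usec) / 1e6;
	} while (elapsed < secs);
	return (n / elapsed);
}

int
main(void)
{
	double idle_rate, loaded_rate;

	/* Calibrate before the load starts. */
	idle_rate = chunks_per_second(5);
	printf("calibrated %.0f chunks/sec; start the benchmark now\n",
	    idle_rate);

	for (;;) {
		loaded_rate = chunks_per_second(1);
		printf("spare CPU ~ %.1f%%\n",
		    100.0 * loaded_rate / idle_rate);
	}
}

Design-wise, the attraction of the cycle counter is that it would remove
even the per-chunk gettimeofday(), at the cost of portability.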
> - Could you confirm using netstat -s statistics that a lot of your
>   packets aren't getting dropped due to full buffers on either send or
>   receive.  Also, do you have any tests in place to measure packet loss?
>   Can you confirm that all the packets you send from the Linux boxes are
>   really sent, and that given they are sent, that they arrive, and vice
>   versa on the echo?  Adding sequence numbers and measuring the mean
>   sequence number difference might be an easy way to start if you aren't
>   already.

I get numbers for both the packets transmitted and the packets received
(albeit as throughputs).  What I see is little to no packet loss below the
MLFRR (maximum loss-free receive rate); above that, packets obviously
start to get lost.

Thanks for the help!

--
Luke
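P.S. On the sequence-number suggestion: if I do add explicit sequence
numbers, the receiver-side bookkeeping would be roughly this (a
hypothetical sketch, not code that exists in ipbench -- 'loss_stats' and
'account_packet' are made-up names):

#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/*
 * Hypothetical receiver-side loss accounting.  The sender puts a 32-bit
 * network-order sequence number in the first four bytes of each UDP
 * payload; the receiver calls account_packet() once per packet received
 * and counts the gaps.  (Reordering shows up as loss here, which is
 * good enough for a first pass.)
 */
struct loss_stats {
	uint32_t	expected;	/* next sequence number expected */
	uint64_t	received;	/* packets seen */
	uint64_t	lost;		/* gaps observed */
};

static void
account_packet(struct loss_stats *st, const unsigned char *payload)
{
	uint32_t seq;

	memcpy(&seq, payload, sizeof(seq));
	seq = ntohl(seq);
	if (seq > st->expected)
		st->lost += seq - st->expected;
	st->expected = seq + 1;
	st->received++;
}

The loss rate then falls out as lost / (lost + received), independent of
the throughput numbers.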