From owner-freebsd-performance@FreeBSD.ORG Sun Dec 26 02:19:42 2004
From: João Carlos Mendes Luís <jonny@jonny.eng.br>
Date: Sun, 26 Dec 2004 00:19:33 -0200
To: Robert Watson
Cc: Jeff Behl, freebsd-performance@freebsd.org, freebsd-net@freebsd.org
Subject: Re: %cpu in system - squid performance in FreeBSD 5.3

Robert Watson wrote:
> On Thu, 23 Dec 2004, Jeff Behl wrote:
>
>> As a follow up to the below (original message at the very bottom), I
>> installed a load balancer in front of the machines which terminates the
>> TCP connections from clients and opens up a few persistent connections
>> to each server, over which requests are pipelined. In this scenario
>> everything is copacetic:
>
> I'm not very familiar with Squid's architecture, but I would anticipate
> that what you're seeing is that the cost of additional connections served
> in parallel is pretty high due to the use of processes. Specifically: if
> each TCP connection being served gets its own process, and there are a
> lot of TCP connections, you'll be doing a lot of process forking, context
> switching, exceeding cache sizes, etc. With just a couple of connections,
> even if they're doing the same "work", the overhead is much lower.
> Depending on how much time you're willing to invest in this, we can
> probably do quite a bit to diagnose where the cost is coming from and
> look for any specific problems or areas we could optimize.

It must not be this. Squid is mostly a single-process system, with
scheduling based on descriptors and select/poll. Recent versions added
some parallelism in other processes, but only for file reading/writing
(diskd) and regular expression processing for ACLs. Even DNS, which
previously ran on blocking I/O in secondary processes, now runs
internally in the select/poll scheduler.

I also have some experience with older versions of squid, where the same
machine, running the same version of squid, supported a higher maximum
number of simultaneous connections after changing from Linux to FreeBSD.
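For concreteness, the descriptor-driven model looks roughly like the
sketch below. This is a minimal illustration, not squid's actual code;
the point is that the fd_set must be rebuilt and linearly rescanned on
every pass, so the cost per pass grows with the number of open
descriptors.

/*
 * Minimal sketch of a single-process select() event loop of the kind
 * squid is built around.  Not squid's code.
 */
#include <sys/select.h>
#include <unistd.h>

void
event_loop(int *fds, int nfds)
{
	for (;;) {
		fd_set rset;
		int i, maxfd = -1;

		FD_ZERO(&rset);
		for (i = 0; i < nfds; i++) {	/* O(n) rebuild per pass */
			FD_SET(fds[i], &rset);
			if (fds[i] > maxfd)
				maxfd = fds[i];
		}
		if (select(maxfd + 1, &rset, NULL, NULL, NULL) <= 0)
			continue;
		for (i = 0; i < nfds; i++) {	/* O(n) rescan per pass */
			if (FD_ISSET(fds[i], &rset)) {
				char buf[4096];

				(void)read(fds[i], buf, sizeof(buf));
				/* ... hand data to the protocol
				 * state machine ... */
			}
		}
	}
}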
> I might start by turning on kernel profiling and doing a profile dump
> under load. Be aware that turning on profiling uses up a lot of CPU
> itself, so it will reduce the capacity of the system. There's probably
> documentation elsewhere, but the process I use to set up profiling is
> here:

I have not run any tests on this, but I would expect profiling to fail,
since every step of the scheduler is very small and deals with the
smallest I/O available at that time.

Indeed, based on the original report I would look for some optimization
of descriptor searching in poll or select, whichever squid has chosen to
use on FreeBSD (probably select, judging by the top output). This is one
of the crucial points for squid performance. The other one is disk
access, for sure, but the experiment described would not change disk
access patterns, would it?

> http://www.watson.org/~robert/freebsd/netperf/profile/
>
> Note that it warns that some results may be incorrect on SMP. I think
> it would be useful to give it a try anyway just to see if we get
> something useful.

As I said before, being a single-process scheduler, squid does not gain
much from SMP. The secondary processes would benefit from the extra CPU,
though. Maybe interrupt processing would as well, if the Giant lock does
not interfere with any part of the processing path.

> As a final question: other than CPU consumption, do you have a reliable
> way to measure how efficiently the system is operating -- in
> particular, how fast it is able to serve data? Having some sort of
> metric for performance can be quite useful in optimizing, as it can
> tell us whether

One thing I fail to measure in FreeBSD is the reason for delays in disk
access times. How can I prove that the delay is on disk, and determine
how to optimize it? systat -v is very useful, but does not give me all
the answers.

>> last pid: 3377;  load averages: 0.12, 0.09, 0.08   up 0+17:24:53  10:02:13
>> 31 processes:  1 running, 30 sleeping
>> CPU states:  5.1% user,  0.0% nice,  1.8% system,  1.2% interrupt, 92.0% idle
>> Mem: 75M Active, 187M Inact, 168M Wired, 40K Cache, 214M Buf, 1482M Free
>> Swap: 4069M Total, 4069M Free
>>
>>  PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME  WCPU   CPU COMMAND
>>  474 squid     96    0 68276K 62480K select 0  53:38 16.80% 16.80% squid
>>  311 bind      20    0 10628K  6016K kserel 0  12:28  0.00%  0.00% named
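Note squid sitting in the "select" state in the top output above. On
FreeBSD, kqueue(2) avoids the per-call descriptor scan entirely: the
kernel returns only the descriptors that became ready. The sketch below
shows the same loop using kqueue; it illustrates the alternative and is
not anything squid actually does here.

/*
 * Minimal kqueue(2) sketch of the same event loop.  Descriptors are
 * registered once; each kevent() call then returns only the ready
 * ones, so there is no O(n) rebuild or rescan per pass.
 */
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <unistd.h>

void
event_loop_kq(int *fds, int nfds)
{
	struct kevent ev[64];
	int kq, i, n;

	kq = kqueue();
	for (i = 0; i < nfds; i++) {	/* register once, up front */
		struct kevent reg;

		EV_SET(&reg, fds[i], EVFILT_READ, EV_ADD, 0, 0, NULL);
		(void)kevent(kq, &reg, 1, NULL, 0, NULL);
	}
	for (;;) {
		n = kevent(kq, NULL, 0, ev, 64, NULL);	/* O(ready) */
		for (i = 0; i < n; i++) {
			char buf[4096];

			(void)read((int)ev[i].ident, buf, sizeof(buf));
			/* ... hand data to the protocol state machine ... */
		}
	}
}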
    Jonny

--
João Carlos Mendes Luís - Networking Engineer - jonny@jonny.eng.br

From owner-freebsd-performance@FreeBSD.ORG Sun Dec 26 07:14:16 2004
From: Robert Watson <robert@fledge.watson.org>
Date: Sun, 26 Dec 2004 07:10:59 +0000 (GMT)
To: João Carlos Mendes Luís <jonny@jonny.eng.br>
Cc: Jeff Behl, freebsd-performance@freebsd.org, freebsd-net@freebsd.org
Subject: Re: %cpu in system - squid performance in FreeBSD 5.3

On Sun, 26 Dec 2004, João Carlos Mendes Luís wrote:

> It must not be this. Squid is mostly a single-process system, with
> scheduling based on descriptors and select/poll. Recent versions added
> some parallelism in other processes, but only for file reading/writing
> (diskd) and regular expression processing for ACLs. Even DNS, which
> previously ran on blocking I/O in secondary processes, now runs
> internally in the select/poll scheduler.

Thanks for this information.

> > I might start by turning on kernel profiling and doing a profile dump
> > under load. Be aware that turning on profiling uses up a lot of CPU
> > itself, so it will reduce the capacity of the system. There's probably
> > documentation elsewhere, but the process I use to set up profiling is
> > here:
>
> I have not run any tests on this, but I would expect profiling to fail,
> since every step of the scheduler is very small and deals with the
> smallest I/O available at that time.

This is kernel profiling, not application profiling, and would hopefully
give us information on where the kernel was spending most of its time,
since in the environment in question system time appears to be dominant.
If SMP in theory makes little difference to Squid performance, then
switching to a UP kernel may well make kernel profiling more reliable
and hence more useful in tracking system time.

> Indeed, based on the original report I would look for some optimization
> of descriptor searching in poll or select, whichever squid has chosen
> to use on FreeBSD (probably select, judging by the top output). This is
> one of the crucial points for squid performance. The other one is disk
> access, for sure, but the experiment described would not change disk
> access patterns, would it?

The reporter described a very high percentage of system time -- time
spent blocked on disk I/O isn't billed to system time; if a single
process were spending lots of time waiting on disk I/O, you'd see idle
time rather than system time predominating, I believe.
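That accounting is visible from userland with getrusage(2): CPU charged
to a process is split into user and system time, and time spent blocked
-- on disk or anything else -- is charged to neither bucket, so it
surfaces as idle time in top. A small illustrative program:

/*
 * getrusage(2) shows the user/system CPU split.  Time spent blocked,
 * like the sleep() below or a wait for disk I/O, is billed to neither
 * bucket, which is why a disk-bound workload shows up as idle time
 * rather than system time.
 */
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	struct rusage ru;
	volatile unsigned long spin = 0;
	unsigned long i;

	for (i = 0; i < 100000000UL; i++)	/* accumulates user time */
		spin++;
	sleep(2);			/* blocked: billed to neither */

	getrusage(RUSAGE_SELF, &ru);
	printf("user   %ld.%06ld s\n",
	    (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec);
	printf("system %ld.%06ld s\n",
	    (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec);
	return (0);
}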
> > As a final question: other than CPU consumption, do you have a
> > reliable way to measure how efficiently the system is operating -- in
> > particular, how fast it is able to serve data? Having some sort of
> > metric for performance can be quite useful in optimizing, as it can
> > tell us whether
>
> One thing I fail to measure in FreeBSD is the reason for delays in disk
> access times. How can I prove that the delay is on disk, and determine
> how to optimize it? systat -v is very useful, but does not give me all
> the answers.

I'm not sure there are useful summary tools at a system-wide level for
this, but it is possible to use KTR(9) to trace the associated scheduler
and disk events. In particular, I recently added high-level tracing of
g_down and g_up GEOM events to KTR. Jeff Roberson is about to commit a
scheduler visualization tool that interprets KTR events relating to the
scheduler, which may also be useful.

It would certainly be extremely useful to have a tool for normal system
operation that could be pointed at a process to say "show me the percent
of time spent on various wait channels for pid 50". ktrace(1) has the
ability to track context switches, but currently appears not to provide
enough information to figure out why a context switch took place. I'll
investigate this in the next couple of days -- the trick is to gather
this sort of statistic without too much additional overhead. If that's
not easily possible, then simply post-processing KTR may be the right
approach.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research
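Until such tooling exists, a crude userland check for Jonny's question
is to bracket individual reads with a monotonic clock and look at the
latency distribution: reads served from the buffer cache come back in
microseconds, while multi-millisecond samples are waiting on the disk.
A minimal sketch, with a placeholder path:

/*
 * Bracket each read(2) with a monotonic clock to see per-I/O latency.
 * Cache hits return in microseconds; multi-millisecond samples are
 * waiting on the disk.  The path below is a placeholder.
 */
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int
main(void)
{
	char buf[65536];
	struct timespec t0, t1;
	ssize_t n;
	int fd;

	fd = open("/some/large/file", O_RDONLY);	/* placeholder */
	if (fd < 0)
		return (1);
	for (;;) {
		clock_gettime(CLOCK_MONOTONIC, &t0);
		n = read(fd, buf, sizeof(buf));
		clock_gettime(CLOCK_MONOTONIC, &t1);
		if (n <= 0)
			break;
		printf("%.3f ms\n",
		    (t1.tv_sec - t0.tv_sec) * 1e3 +
		    (t1.tv_nsec - t0.tv_nsec) / 1e6);
	}
	close(fd);
	return (0);
}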