Date:      Wed, 11 Aug 2010 15:43:43 -0600
From:      markham breitbach <markham_breitbach@ssimicro.com>
To:        Julian Elischer <julian@elischer.org>
Cc:        freebsd-performance@freebsd.org
Subject:   Re: massive load average spikes
Message-ID:  <4C63198F.4040003@ssimicro.com>
In-Reply-To: <4C630156.6060203@elischer.org>
References:  <4C62D827.2030409@ssimicro.com>	<949C0FF2-04AA-4440-82B0-F44A13B8F0C2@mac.com> <4C62F272.4030703@ssimicro.com> <4C630156.6060203@elischer.org>


> load average is a time averaged thing and in the case of a
> 'thundering herd' problem you will see the LA spike up and
> come down again over time.
>
> Do you see any problem as a result of this? Or is it just curiosity?
>
> you might want to use KTR or ktrace with scheduling events if you
> really want to see the reason for this. It could just be a sampling
> error when some 'tick' coincides with the sampling..
>
>
I have not seen any noticeable performance degradation when the LA spikes like this; the
main nuisance was Sendmail's behaviour. I have since set the options "RefuseLA=0" and
"QueueLA=0" to avoid long stretches of SMTP being unavailable while the load average
settled back down.
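
To illustrate why a short burst can keep the load average high for a long stretch, here
is a minimal sketch of an exponentially damped 1-minute average. The 5-second sampling
interval, the function names, and the burst of 200 runnable threads are all assumptions
made for the example, not measurements from this host.

```python
import math

SAMPLE_INTERVAL = 5.0   # seconds between samples (assumed for illustration)
WINDOW = 60.0           # time constant of the 1-minute average


def step(load, runnable, interval=SAMPLE_INTERVAL, window=WINDOW):
    """Fold one sample of the run-queue length into the damped average."""
    decay = math.exp(-interval / window)
    return load * decay + runnable * (1.0 - decay)


def simulate(samples):
    """Return the load average after each sample, starting from an idle system."""
    load = 0.0
    history = []
    for runnable in samples:
        load = step(load, runnable)
        history.append(load)
    return history


if __name__ == "__main__":
    # Hypothetical burst: one sample catches 200 runnable threads,
    # then the system is idle for the next two minutes.
    samples = [200] + [0] * 24
    for i, la in enumerate(simulate(samples)):
        print(f"t={i * 5:3d}s  load={la:6.2f}")
```

In this model a single unlucky sample is enough to push the average well above a typical
RefuseLA threshold, and it then takes a full minute to decay to about 37% of its peak.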

At this point it is really just a nagging feeling that something is misbehaving and that
it's going to bite me when I least expect it (it always does!), so I would like to track
down the source of the problem, but I'm not even sure where to begin looking.

I have run ktrace on sendmail and dovecot, but did not see anything that stood out,
although I'm not sure I would recognize the problem in a kdump anyway (too much
information!). I'm not at all familiar with KTR, however. Is it something that can be
run on a production host, or should it be isolated to a dev box? I have cloned the jail
into a dev environment on identical hardware, but I only see the issue in production.
I'm not sure whether that is down to insufficient load or just not enough random
strangeness outside of production.

Any suggestions for how KTR might help pin this down or what to look for?

