Date: Wed, 11 Aug 2010 15:43:43 -0600
From: markham breitbach <markham_breitbach@ssimicro.com>
To: Julian Elischer <julian@elischer.org>
Cc: freebsd-performance@freebsd.org
Subject: Re: massive load average spikes
Message-ID: <4C63198F.4040003@ssimicro.com>
In-Reply-To: <4C630156.6060203@elischer.org>
References: <4C62D827.2030409@ssimicro.com> <949C0FF2-04AA-4440-82B0-F44A13B8F0C2@mac.com> <4C62F272.4030703@ssimicro.com> <4C630156.6060203@elischer.org>

> load average is a time averaged thing and in the case of a
> 'thundering herd' problem you will see the LA spike up and
> come down again over time.
>
> Do you see any problem as a result of this? Or is it just curiosity?
>
> you might want to use KTR or ktrace with scheduling events if you
> really want to see the reason for this. It could just be a sampling
> error when some 'tick' coincides with the sampling..
>

I have not seen any noticeable performance degradation when the LA
spikes like this, and the main nuisance of this was Sendmail's
behaviour. I have since set the options "RefuseLA=0" and "QueueLA=0"
to avoid long stretches of SMTP being unavailable while the load
averaged itself out.

At this point it is really just a nagging feeling that something is
misbehaving and it's going to bite me when I least expect it (it always
does!), so I would like to try and track down the source of the
problems, but I'm not even sure where to begin looking. I have run some
ktrace on sendmail and dovecot, but did not see anything that stood
out, although I don't really know if I would recognize the problem in a
kdump anyway (too much information!).

I'm not at all familiar with KTR, however. Is this something that can
be run on a production host or should it be isolated to a dev box? I
have cloned the jail into a dev environment on identical hardware, but
only see the issue under production. I'm not sure if this is a factor
of insufficient load or just not enough random strangeness outside of
production.

Any suggestions for how KTR might help pin this down or what to look
for?
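
As a rough illustration of the "time averaged" behaviour described in
the quoted text, here is a minimal Python sketch of an exponentially
damped load average. It is not the kernel's code: the 5-second sampling
interval, the 60-second time constant, the burst of 200 runnable
processes and the names update_load and simulate are all assumptions
made for the example. Under those assumptions, a single sample that
happens to catch the herd lifts the figure to about 16, and it then
takes close to three minutes to drift back below 1, even though the
machine is idle the whole time.

```python
import math

# Assumed parameters (not from the thread): a 5-second sampling interval
# and a 60-second time constant, i.e. a classic 1-minute load average.
SAMPLE_INTERVAL = 5.0
TIME_CONSTANT = 60.0
DECAY = math.exp(-SAMPLE_INTERVAL / TIME_CONSTANT)


def update_load(load, runnable):
    """One sampling step of an exponentially damped moving average."""
    return load * DECAY + runnable * (1.0 - DECAY)


def simulate(samples):
    """Return the load average after each sample of the run-queue length."""
    load = 0.0
    history = []
    for runnable in samples:
        load = update_load(load, runnable)
        history.append(load)
    return history


if __name__ == "__main__":
    # Hypothetical workload: idle, one sample that catches a herd of 200
    # runnable processes, then idle again for two minutes.
    samples = [0] * 3 + [200] + [0] * 24
    for step, load in enumerate(simulate(samples)):
        print(f"t={step * SAMPLE_INTERVAL:5.0f}s  load={load:6.2f}")
```

The point of the sketch is only that one unlucky sample is enough to
produce a long tail in the reported figure, which would be consistent
with Sendmail refusing connections for a while after a spike when its
RefuseLA and QueueLA thresholds are enabled.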