Date:      Fri, 4 Jun 2004 19:14:12 +0400 (MSD)
From:      Varshavchick Alexander <alex@metrocom.ru>
To:        Ali Niknam <ali@transip.nl>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: FreeBSD 5.2.1: Mutex/Spinlock starvation?
Message-ID:  <20040604190859.D98354@apache.metrocom.ru>
In-Reply-To: <00dd01c449b3$ca5a0f90$0400a8c0@redguy>
References:  <00dd01c449b3$ca5a0f90$0400a8c0@redguy>

Hi Ali,

I can't say how the issue might be connected with the mutexes and so on,
but to solve your problem with apache, I'd look into the
'hold_off_on_exponential_spawning' and 'MAX_SPAWN_RATE' parameters in
src/main/http_main.c of the apache source tree (presuming you're using
apache 1.3.*), and I'm sure some similar options can be found for apache
2.0. What you need is to make apache's forking rate slower, so that the
server does not suffer from a sudden load peak.
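
To show why the cap matters, here is a simplified sketch of a spawner that
doubles its rate on each maintenance cycle up to a limit. It is not the
actual code from http_main.c: the struct, the function names and the
'missing' bookkeeping are made up for illustration, and only the doubling,
the cap and the hold-off idea are modelled.

```c
#include <assert.h>

/* Hypothetical state for this sketch; not the variables Apache uses. */
struct spawner {
    int spawn_rate; /* children to fork in the next maintenance cycle */
    int hold_off;   /* remaining cycles in which doubling is suppressed */
};

/*
 * One maintenance cycle. Returns how many children to fork now and
 * updates the rate for the next cycle. 'missing' is how many children
 * are still needed; 'max_rate' plays the role of a spawn-rate cap.
 */
int spawn_cycle(struct spawner *s, int missing, int max_rate)
{
    int n;

    assert(max_rate >= 1);

    if (missing <= 0) {
        s->spawn_rate = 1; /* demand satisfied: fall back to the slowest rate */
        return 0;
    }

    n = (s->spawn_rate < missing) ? s->spawn_rate : missing;

    if (s->hold_off > 0)
        --s->hold_off;      /* keep the current rate for this cycle */
    else if (s->spawn_rate < max_rate)
        s->spawn_rate *= 2; /* not enough yet: spawn twice as many next time */

    return n;
}

/* Count the cycles needed to go from zero to 'target' children. */
int cycles_to_reach(int target, int max_rate, int hold_off)
{
    struct spawner s = { 1, hold_off };
    int running = 0;
    int cycles = 0;

    while (running < target) {
        running += spawn_cycle(&s, target - running, max_rate);
        ++cycles;
    }
    return cycles;
}
```

In this model cycles_to_reach(1200, 32, 0) returns 42, while
cycles_to_reach(1200, 4, 0) returns 302, so a lower cap spreads the same
1200 forks over many more cycles instead of bursting them out.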

Just my $0.02 :)

----
Alexander Varshavchick, Metrocom Joint Stock Company
Phone: (812)118-3322, 118-3115(fax)

On Thu, 3 Jun 2004, Ali Niknam wrote:

> Hi Guys,
>
> First of all: this is my first posting in this group so please be gentle :)
>
> The other day I was upgrading a system from FreeBSD 4.5 single CPU to
> FreeBSD 5.2.1 dual CPU and I came across a terrible problem.
>
> The system is used as a rather busy webserver, continuously running about
> 1200 apache processes and about 200 mysql pthreads.
>
> The problem I ran into is that when apache starts it needs to create a lot
> of children quickly. When it does so, after about a minute or so a couple
> of children go into the "Giant" state. After a few more seconds, more and
> more processes go into the Giant state, up to the point that the system
> becomes totally unresponsive (even to keyboard input). The only remedy is
> to disconnect the UTP cable, wait a few seconds, and then kill everything.
>
> Now the nice part is: this happens only if I set apache's MaxClients > 1250.
> Under 1250 the same scenario happens but after a minute or so the system
> recovers!
>
> Now I unfortunately do not know enough about the internals of BSD to make
> an educated guess, but I'll give it a shot nevertheless: my estimate is that
> due to the tremendous number of 'locked' processes the system is simply
> starved of CPU to do anything. My guess is that the locking mechanism
> probably uses some kind of 'spin' to wait until the resource is unlocked
> (whichever resource it is, probably something network related, though).
>
> This is based upon the fact that this does not happen if you slightly
> decrease the number of apache processes; in that case the same scenario
> plays out, but after a minute or so the system recovers!
> (probably because it has just enough CPU to handle everything as apache
> hits its limit?)
>
> Now if this is indeed the case I was thinking of something like a sysctl
> MUTEX_BLOCK_THRESHOLD set to something like 50. If the system detects that
> the number of locked processes is higher than this number, then it stops
> 'spinning' for resources and instead uses a 'blocking' mechanism (simply
> putting the processes in a 'waiting' queue).
>
> I would be very interested to hear what this problem could be; perhaps I can
> test a little if someone has solutions (I can't test much, unfortunately,
> as it's a production system).
>
> Best Regards,
> Ali Niknam


