From owner-freebsd-hackers@FreeBSD.ORG  Tue Jun  8 17:21:08 2004
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 7C18A16A4CE; Tue,  8 Jun 2004 17:21:08 +0000 (GMT)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id D292443D48; Tue,  8 Jun 2004 17:21:07 +0000 (GMT)
	(envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (localhost [127.0.0.1])
	by fledge.watson.org (8.12.11/8.12.11) with ESMTP id i58HK86O077539;
	Tue, 8 Jun 2004 13:20:08 -0400 (EDT)
	(envelope-from robert@fledge.watson.org)
Received: from localhost (robert@localhost)i58HK8WM077536;
	Tue, 8 Jun 2004 13:20:08 -0400 (EDT)
	(envelope-from robert@fledge.watson.org)
Date: Tue, 8 Jun 2004 13:20:08 -0400 (EDT)
From: Robert Watson <rwatson@FreeBSD.org>
X-Sender: robert@fledge.watson.org
To: Ali Niknam <ali@transip.nl>
In-Reply-To: <00bd01c44cb5$ccf5f840$0400a8c0@redguy>
Message-ID: <Pine.NEB.3.96L.1040608131347.75106A-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: freebsd-hackers@FreeBSD.org
cc: John Baldwin <jhb@FreeBSD.org>
Subject: Re: FreeBSD 5.2.1: Mutex/Spinlock starvation?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
X-List-Received-Date: Tue, 08 Jun 2004 17:21:08 -0000


On Mon, 7 Jun 2004, Ali Niknam wrote:

> > There isn't a timeout.  Rather, the lock spins so long as the current
> > owning thread is executing on another CPU.
> 
> Interesting. Is there a way to 'lock' CPUs so that they always run on
> 'another' CPU?
> 
> Unfortunately as we speak the server is down again :( This all makes me
> wonder whether I should simply go back to 4.10.

No one would blame you for backing off -CURRENT to -STABLE.  On the other
hand, having high workloads against -CURRENT is going to be critical to
identifying weaknesses in -CURRENT so we can improve them.  Unfortunately,
it's something of a chicken-and-egg problem...

> I decreased the maximum number of apache children to 1400 and the server
> seems to be barely holding on:
> last pid:  2483;  load averages: 75.77, 28.63, 11.40    up 0+00:04:32  19:35:07
> 1438 processes: 2 running, 294 sleeping, 1142 lock
> CPU states:  6.2% user,  0.0% nice, 62.6% system,  7.5% interrupt, 23.8% idle
> Mem: 698M Active, 27M Inact, 209M Wired, 440K Cache, 96M Buf, 1068M Free
> Swap: 512M Total, 512M Free
> 
> Are there any other reasonably stable things to try? That is, apart from
> upgrading to -CURRENT, which I frankly feel is too dangerous...

There are a number of known weaknesses in 5.2.1 that are resolved in
-CURRENT, but the update would also involve substantial risk as there's
some heavy moving going on in -CURRENT to improve network performance,
etc.  I haven't followed some of your system description in detail, but
it seems like the primary thing to do right now, assuming you are still
able to keep 5.2.1 running on the box and are able to futz with the
configuration some, is to identify the specific source of the problem
you're experiencing.  Clearly, too much work is going on in the kernel;
the question is what work.  It's likely you're running into an expensive
edge case.  It's possible it's resolved in HEAD, and it could be that a
low-risk backport would resolve it.  It's also possible you're running
into an unresolved problem in HEAD.

The best case scenario from my perspective would be that you could provide
an equivalent workload against a test box where we could experiment with a
number of debugging settings, as well as simply trying -CURRENT...  It
sounds like we've tried some of the easy plugs, such as switching
schedulers, enabling adaptive mutexes, etc.
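
For anyone following along, the adaptive behaviour quoted at the top of
this message (spin only while the lock's owner is running on another CPU,
otherwise give up the CPU) can be sketched roughly as below.  This is a
userspace illustration in C11 and not the kernel's code: struct thread,
struct adaptive_mtx, mtx_try_step() and mtx_release() are hypothetical
names made up for the sketch, and the on_cpu flag is assumed to be kept
up to date by whatever scheduler is in use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread; NOT FreeBSD's struct thread. */
struct thread {
	atomic_bool on_cpu;	/* true while running on some CPU */
};

/* Hypothetical mutex; owner is NULL while the lock is free. */
struct adaptive_mtx {
	_Atomic(struct thread *) owner;
};

enum wait_action { ACQUIRED, SPIN, BLOCK };

/*
 * Make one acquisition attempt and report what the caller should do
 * next.  A real implementation would loop: retry on SPIN, and put the
 * thread to sleep on BLOCK until the owner releases the lock.
 */
static enum wait_action
mtx_try_step(struct adaptive_mtx *m, struct thread *self)
{
	struct thread *seen = NULL;

	/* Uncontested case: the lock was free and is now ours. */
	if (atomic_compare_exchange_strong(&m->owner, &seen, self))
		return (ACQUIRED);

	/*
	 * The exchange failed, so 'seen' holds the current owner.  If
	 * that owner is running, it will probably release the lock
	 * soon, so spinning is cheaper than a context switch.  Note
	 * there is no timeout anywhere in this decision.
	 */
	if (atomic_load(&seen->on_cpu))
		return (SPIN);

	/* The owner is off-CPU; spinning would only burn cycles. */
	return (BLOCK);
}

/* Release the lock; only the current owner may call this. */
static void
mtx_release(struct adaptive_mtx *m, struct thread *self)
{
	assert(atomic_load(&m->owner) == self);
	atomic_store(&m->owner, NULL);
}
```

The sketch ignores the things that make the real problem hard, such as
the owner exiting while a waiter inspects it, priority propagation, and
fairness among many waiters.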

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research