From owner-freebsd-questions@FreeBSD.ORG  Sat Aug  7 05:30:51 2004
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id B540916A4CE
	for <freebsd-questions@freebsd.org>;
	Sat,  7 Aug 2004 05:30:51 +0000 (GMT)
Received: from gen129.n001.c02.escapebox.net (gen129.n001.c02.escapebox.net
	[213.73.91.129])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 3A4D043D4C
	for <freebsd-questions@freebsd.org>;
	Sat,  7 Aug 2004 05:30:51 +0000 (GMT)
	(envelope-from gemini@geminix.org)
Message-ID: <41146907.90809@geminix.org>
Date: Sat, 07 Aug 2004 07:30:47 +0200
From: Uwe Doering <gemini@geminix.org>
Organization: Private UNIX Site
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7) Gecko/20040629
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: freebsd-questions@freebsd.org
References: <p06002017bd38da0c39a9@[192.168.0.5]>
In-Reply-To: <p06002017bd38da0c39a9@[192.168.0.5]>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Received: from gemini by geminix.org with asmtp (TLSv1:AES256-SHA:256)
	(Exim 3.36 #1)
	id 1BtJn7-0009Te-00; Sat, 07 Aug 2004 07:30:49 +0200
Subject: Re: identifying and fixing server I/O slowdowns
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 07 Aug 2004 05:30:51 -0000

Jeff Kramer wrote:
> Oh great and wise FreeBSD gurus,
> 
> I've been running FreeBSD boxes for about five years with great results 
> (up to 6 at the moment), but recently one of my machines has started to 
> seriously act up.  Every time a heavy disk operation (say, tar'ing a 1 
> gig directory) occurs the system slows to a crawl, and requests to 
> apache/php/mysql sites hosted on it just hang.
> 
> The system is a dual p3 1.13ghz box with a gig of ram and mirrored 80 
> gig WD800BB drives on a Promise TX2 controller.  The raid isn't 
> degraded.  There's a dedicated 1.5 gig swap partition and a swap file on 
> the /usr partition.  We had some apache processes go nuts one time, 
> which is why I added the swap file.
> [...]

This problem could be due to a disk drive that is about to fail.  If 
there are (still recoverable) disk errors, retrying the affected I/O 
operations can keep a disk controller occupied for several seconds.  Of 
course, all processes trying to do disk I/O during this time span will 
block.

Since the errors are (eventually) recoverable, the RAID array is likely 
_not_ to drop into degraded mode by itself.  Once you've found out 
which of the disks it is, you would have to force that disk into failed 
mode and then replace it.  The exact details depend on your RAID 
controller.
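
One rough way to find out which disk it is would be to read from each 
mirror member separately and look for reads that stall.  The following 
is only a sketch under a few assumptions: that the member disks are 
reachable as individual device nodes behind your controller (which may 
not be the case), that you run it as root, and that a one-second 
threshold is a sensible cut-off.  The function name, chunk size and 
threshold are made up for illustration.

```python
import os
import sys
import time


def find_slow_reads(path, chunk_size=1 << 20, threshold_s=1.0, max_chunks=None):
    """Read `path` sequentially; return (offset, seconds) for each slow chunk.

    A chunk counts as slow when reading it took at least `threshold_s`
    seconds.  The threshold is an arbitrary assumption, not a known limit.
    """
    slow = []
    offset = 0
    count = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while max_chunks is None or count < max_chunks:
            start = time.monotonic()
            data = os.read(fd, chunk_size)
            elapsed = time.monotonic() - start
            if not data:  # end of file or device
                break
            if elapsed >= threshold_s:
                slow.append((offset, elapsed))
            offset += len(data)
            count += 1
    finally:
        os.close(fd)
    return slow


if __name__ == "__main__":
    # Device paths are given on the command line; none are assumed here.
    for dev in sys.argv[1:]:
        for off, secs in find_slow_reads(dev, max_chunks=4096):
            print("%s: offset %d took %.2f s" % (dev, off, secs))
```

A disk that shows stalls of several seconds while the other one reads 
smoothly would be the likely candidate.  Run this while the machine is 
otherwise idle, since normal load also slows down reads.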

Of course, your mileage may vary, but I've experienced disk failures 
like these several times in the past, with the effect you've described.

    Uwe
-- 
Uwe Doering         |  EscapeBox - Managed On-Demand UNIX Servers
gemini@geminix.org  |  http://www.escapebox.net