From owner-freebsd-hackers@FreeBSD.ORG Mon Nov 19 20:14:18 2007
Date: Mon, 19 Nov 2007 21:14:22 +0100
From: Kris Kennaway <kris@FreeBSD.org>
To: Alexey Popov
Cc: freebsd-hackers@freebsd.org, Panagiotis Christias, freebsd-stable@freebsd.org
Subject: Re: amrd disk performance drop after running under high load
Message-ID: <4741EE9E.9050406@FreeBSD.org>
In-Reply-To: <4739557A.6090209@chistydom.ru>

Alexey Popov wrote:
> Hi.
>
> Panagiotis Christias wrote:
>>>>>> In the "good" case you are getting a much higher interrupt rate but
>>>>>> with the data you provided I can't tell where from.  You need to run
>>>>>> vmstat -i at regular intervals (e.g. every 10 seconds for a minute)
>>>>>> during the "good" and "bad" times, since it only provides counters
>>>>>> and an average rate over the uptime of the system.
>>>>> Now I'm running a 10-process lighttpd and the problem became not so
>>>>> severe.
>>>>>
>>>>> I collected interrupt statistics and they show no relation between
>>>>> interrupts and slowdowns.  Here they are:
>>>>> http://83.167.98.162/gprof/intr-graph/
>>>>>
>>>>> I also have similar statistics from mutex profiling, and they show
>>>>> there is no problem with the mutexes:
>>>>> http://83.167.98.162/gprof/mtx-graph/mtxgifnew/
>>>>>
>>>>> I have no idea what else to check.
>>>> I don't know what this graph is showing me :)  When precisely is the
>>>> system behaving poorly?
>> What is your RAID controller configuration (read ahead/cache/write
>> policy)?  I have seen weird/bogus numbers (~100% busy) reported by
>> systat -v when read ahead was enabled on LSI/amr controllers.
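A minimal sh sketch of the interval sampling suggested above; the
10-second interval and one-minute window are simply the values mentioned
in the quote, so adjust them as needed:

  #!/bin/sh
  # Capture vmstat -i every 10 seconds for one minute so the per-device
  # interrupt counters can be compared between "good" and "bad" periods.
  for i in 1 2 3 4 5 6; do
          date
          vmstat -i
          sleep 10
  done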
>
> **********************************************************************
>                 Existing Logical Drive Information
>                      By LSI Logic Corp.,USA
>
> **********************************************************************
>   [Note: For SATA-2, 4 and 6 channel controllers, please specify
>    Ch=0 Id=0..15 for specifying physical drive (Ch=channel, Id=Target)]
>
>
>  Logical Drive : 0( Adapter: 0 ):  Status: OPTIMAL
> ---------------------------------------------------
> SpanDepth :01     RaidLevel: 5  RdAhead : Adaptive  Cache: DirectIo
> StripSz   :064KB  Stripes  : 6  WrPolicy: WriteBack
>
> Logical Drive 0 : SpanLevel_0 Disks
> Chnl  Target  StartBlock   Blocks       Physical Target Status
> ----  ------  ----------   ------       ----------------------
> 0     00      0x00000000   0x22ec0000   ONLINE
> 0     01      0x00000000   0x22ec0000   ONLINE
> 0     02      0x00000000   0x22ec0000   ONLINE
> 0     03      0x00000000   0x22ec0000   ONLINE
> 0     04      0x00000000   0x22ec0000   ONLINE
> 0     05      0x00000000   0x22ec0000   ONLINE
>
> I tried running with read-ahead disabled, but it didn't help.

I just ran into this myself, and apparently it can be caused by "Patrol
Reads", where the adapter periodically scans the disks looking for media
errors.  You can turn this off using -stopPR with the megarc port.

Kris
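A minimal sketch of the megarc invocation described above for disabling
Patrol Read.  -stopPR is the option named in the mail; the -a0 adapter
selector is an assumption based on the single adapter shown in the logical
drive listing ("Adapter: 0"), so check megarc's usage output for your
version:

  #!/bin/sh
  # Stop Patrol Read on adapter 0 (assumed flag syntax; -stopPR is the
  # option referred to above, -a0 selects the adapter shown in the
  # logical drive listing).
  megarc -stopPR -a0

If the periodic slowdowns stop lining up once Patrol Read is disabled,
that points at the adapter itself rather than the driver or filesystem.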