From: "Panagiotis Christias"
To: "Alexey Popov"
Cc: freebsd-hackers@freebsd.org, Kris Kennaway, freebsd-stable@freebsd.org
Date: Mon, 12 Nov 2007 01:31:43 +0200
Subject: Re: amrd disk performance drop after running under high load
In-Reply-To: <47373B43.9060406@chistydom.ru>
References: <47137D36.1020305@chistydom.ru>
 <47149E6E.9000500@chistydom.ru> <4715035D.2090802@FreeBSD.org>
 <4715C297.1020905@chistydom.ru> <4715C5D7.7060806@FreeBSD.org>
 <471EE4D9.5080307@chistydom.ru> <4723BF87.20302@FreeBSD.org>
 <47344E47.9050908@chistydom.ru> <47349A17.3080806@FreeBSD.org>
 <47373B43.9060406@chistydom.ru>
List-Id: Production branch of FreeBSD source code

On Nov 11, 2007 7:26 PM, Alexey Popov wrote:
> Hi.
>
> Kris Kennaway wrote:
> >>> In the "good" case you are getting a much higher interrupt rate but
> >>> with the data you provided I can't tell where from. You need to run
> >>> vmstat -i at regular intervals (e.g. every 10 seconds for a minute)
> >>> during the "good" and "bad" times, since it only provides counters
> >>> and an average rate over the uptime of the system.
> >>
> >> Now I'm running 10-process lighttpd and the problem has become less
> >> severe.
> >>
> >> I collected interrupt stats and they show no relation between
> >> interrupts and slowdowns. Here they are:
> >> http://83.167.98.162/gprof/intr-graph/
> >>
> >> I also have similar statistics from mutex profiling, and they show
> >> there's no problem in mutexes.
> >> http://83.167.98.162/gprof/mtx-graph/mtxgifnew/
> >>
> >> I have no idea what else to check.
>
> > I don't know what this graph is showing me :) When precisely is the
> > system behaving poorly?
> Take a look at the "Disk Load %" picture at
> http://83.167.98.162/gprof/intr-graph/
>
> At ~ 17:00, 03:00-04:00, 13:00-14:00, 00:30-01:30, 11:00-13:00 it shows
> peaks of disk activity which really never happen.
> As I said at the beginning of the thread, in these "peak" moments the
> disk becomes slow and vmstat shows 100% disk load while performing
> < 10 tps. The other graphs on this page show that there's no relation
> to the interrupt rate of the amr or em device. You advised me to
> check it.
>
> When I was using single-process lighttpd the problem was much worse, as
> you can see at http://83.167.98.162/gprof/graph/ . In the first picture
> on this page you can see disk load peaks at 18:00 and 15:00 which led
> to decreased network output because the disk was too slow.
>
> Earlier in this thread we suspected UMA mutexes. In order to check it I
> collected mutex profiling stats and drew graphs over time, and they
> also didn't show anything interesting. All mutex graphs were smooth
> during the disk load peaks.
> http://83.167.98.162/gprof/mtx-graph/mtxgifnew/
>
> With best regards,
> Alexey Popov

Hello,

what is your RAID controller configuration (read ahead/cache/write
policy)? I have seen weird/bogus numbers (~100% busy) reported by
systat -v when read ahead was enabled on LSI/amr controllers.

Regards,
Panagiotis Christias