From owner-freebsd-stable@FreeBSD.ORG  Tue Oct 16 11:23:48 2007
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6C90216A418;
	Tue, 16 Oct 2007 11:23:48 +0000 (UTC)
	(envelope-from lol@chistydom.ru)
Received: from hermes.hw.ru (hermes.hw.ru [80.68.240.91])
	by mx1.freebsd.org (Postfix) with ESMTP id 2B4D313C467;
	Tue, 16 Oct 2007 11:23:45 +0000 (UTC)
	(envelope-from lol@chistydom.ru)
Received: from [80.68.244.40] (account a_popov@rbc.ru [80.68.244.40] verified)
	by hermes.hw.ru (CommuniGate Pro SMTP 5.0.13)
	with ESMTPA id 194370688; Tue, 16 Oct 2007 15:21:06 +0400
Message-ID: <47149E6E.9000500@chistydom.ru>
Date: Tue, 16 Oct 2007 15:20:14 +0400
From: Alexey Popov <lol@chistydom.ru>
User-Agent: Thunderbird 2.0.0.6 (X11/20070924)
MIME-Version: 1.0
To: Kris Kennaway <kris@FreeBSD.org>
References: <47137D36.1020305@chistydom.ru> <47140906.2020107@FreeBSD.org>
	<47146FB4.6040306@chistydom.ru> <47147E49.9020301@FreeBSD.org>
In-Reply-To: <47147E49.9020301@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: amrd disk performance drop after running under high load
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2007 11:23:48 -0000

Hi.

Kris Kennaway wrote:

>>>> After some time of running under high load, disk performance becomes 
>>>> extremely poor. During those periods 'systat -vm 1' shows something like
>>>> this:
>>> What does "high load" mean?  You need to explain the system workload 
>>> more.
>> This web service is similar to YouTube. This server is a video store. I
>> have around 200G of *.flv (flash video) files on the server.
>>
>> I run lighttpd as a web server. Disk load is usually around 50%, network
>> output 100Mbit/s, 100 simultaneous connections. CPU is mostly idle.
>>
>> As you can see it is a trivial service - sending files to network via 
>> HTTP.
> Does lighttpd actually use HTTP accept filters?
I don't know how to verify this, but it seems to issue the appropriate 
setsockopt call (truss output):

setsockopt(0x4,0xffff,0x1000,0x7fffffffe620,0x100) = 0 (0x0)
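
The numeric arguments in that truss line can be checked against 
FreeBSD's <sys/socket.h>. Below is a minimal sketch, assuming that 
SOL_SOCKET is 0xffff and SO_ACCEPTFILTER is 0x1000 as defined there, and 
that 0x100 is the size of struct accept_filter_arg (a 16-byte name plus 
a 240-byte argument). The helper name decode_setsockopt is made up for 
illustration.

```python
import re

# Assumed values from FreeBSD's <sys/socket.h>; verify against your headers.
SOL_SOCKET = 0xFFFF
SO_ACCEPTFILTER = 0x1000
ACCEPT_FILTER_ARG_SIZE = 16 + 240  # af_name[16] + af_arg[240]


def decode_setsockopt(line):
    """Parse a truss-style setsockopt line and classify its arguments."""
    m = re.match(r"setsockopt\(([^)]*)\)", line.strip())
    if m is None:
        raise ValueError("not a setsockopt line")
    # All five arguments are printed by truss as hexadecimal numbers.
    fd, level, optname, _optval, optlen = (
        int(arg, 16) for arg in m.group(1).split(",")
    )
    return {
        "fd": fd,
        "is_sol_socket": level == SOL_SOCKET,
        "is_accept_filter": optname == SO_ACCEPTFILTER,
        "optlen_matches": optlen == ACCEPT_FILTER_ARG_SIZE,
    }


info = decode_setsockopt(
    "setsockopt(0x4,0xffff,0x1000,0x7fffffffe620,0x100) = 0 (0x0)"
)
print(info)
```

Under those assumptions this only confirms that an accept filter was 
set on descriptor 4. The filter name itself lives in the buffer behind 
the pointer argument, which this truss line does not show.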

> Are you using ipfilter and ipfw?  You are paying a performance penalty 
> for having them.
I'm using ipfw, and one of the first rules passes all established TCP 
connections. ipfilter is not used on this server, but it is compiled 
into the kernel because it may be used on other servers. The CPU is 95% 
idle, so I don't think the packet filters produce significant load on 
this server.

> You might try increasing BUCKET_MAX in sys/vm/uma_core.c.  I don't 
> really understand the code here, but you seem to be hitting a threshold 
> behaviour where you are constantly running out of space in the per CPU 
> caches.
Thanks, I'll try this.

> This can happen if your workload is unbalanced between the CPUs and you 
> are always allocating on one but freeing on another, but I wouldn't 
> expect it should happen on your workload.  Maybe it can also happen if 
> your turnover is high enough.  
This is very unlikely, because I have 5 other video storage servers with 
the same hardware and software configuration and they work fine.

On the other hand, the other servers were put into production before or 
after the problematic servers and were filled with content in different 
ways, so they could have a slightly different load pattern.

In total, I have faced this bug three times:

1. The first time it was, AFAIR, 5.4-RELEASE on a Dell 2850 with the same 
configuration as now. It was an mp3 store and I used thttpd as the HTTP 
server to serve the mp3s. At that time the problems were less frequent, 
but it took too long to get back to normal operation, so we had to 
reboot the servers once a week or so.

The problems began when we moved to new hardware, the Dell 2850. At that 
time we suspected the amrd driver and had no time to dig in, because all 
the servers of the project were problematic. Installing Linux helped.

2. The second time it was a server for the static files of a very popular 
blog. The HTTP server was nginx and the disk contained pictures, mp3s and 
videos. It was a Dell 1850 with a 2x146 SCSI mirror. Linux also solved 
the problem.

3. The problem we see now.

At first glance one could say that the problem is in Dell's x850 series 
or amr(4), but we run this hardware on many other projects and it works 
well there. Linux also works on these machines.

A few hours ago I also received feedback from Andrzej Tobola. He has the 
same problem on FreeBSD 7 with a Promise ATA software RAID0 array:

===
Subject: Re: amrd disk performance drop after running under high load
Date: Tue, 16 Oct 2007 10:59:34 +0200
From: Andrzej Tobola <ato@???>
To: Alexey Popov <lol@???>

<skip>

Exactly the same here, but on a big ATA RAID0 with heavy traffic (~10GB/24h):

amper% df -h /ftp/priv
Filesystem    Size    Used   Avail Capacity  Mounted on
/dev/ar0a     744G    679G    4.7G    99%    /ftp/priv

amper% grep ^ar /var/run/dmesg.boot
ar0: 763108MB <Promise Fasttrak RAID0 (stripe 64 KB)> status: READY
ar0: disk0 READY using ad6 at ata3-master
ar0: disk1 READY using ad4 at ata2-master

amper% uname -a
FreeBSD xxx 7.0-CURRENT-200709 FreeBSD
7.0-CURRENT-200709 #0: Tue Sep 11 04:44:48 UTC 2007 
root@almeida.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  i386

I reboot whenever I reach this state (approx. once a week).
It is an old bug - a few months old ;)

cheers,
-a

===

So I conclude that FreeBSD has a long-standing bug in the VM that can be 
triggered when serving a large amount of static data (much bigger than 
memory size) at high rates. Possibly this only applies to large files 
like mp3 or video.

> What does vmstat -z show during the good and bad times?
I'll send this data the next time the bad times happen.
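
To make the good/bad comparison easier to read, two saved 'vmstat -z' 
outputs could be compared with something like the sketch below. It 
assumes each zone row has the form "NAME: size, limit, used, free, 
requests, failures" (the exact columns may differ between releases). 
The function names are hypothetical and the sample rows use made-up 
numbers purely to show the mechanics.

```python
def parse_vmstat_z(text):
    """Map zone name -> list of integer columns from `vmstat -z` output."""
    zones = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header or blank line, no "NAME:" prefix
        try:
            zones[name.strip()] = [int(f) for f in rest.split(",") if f.strip()]
        except ValueError:
            continue  # skip rows that are not purely numeric
    return zones


def diff_zones(good, bad):
    """Return only the zones whose columns differ between two snapshots."""
    return {
        zone: (good[zone], bad[zone])
        for zone in good
        if zone in bad and good[zone] != bad[zone]
    }


# Made-up sample rows, NOT real measurements from this server.
sample_good = """ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
zone_a: 128, 0, 10, 20, 1000, 0
zone_b: 256, 0, 5, 5, 50, 0
"""
sample_bad = """ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
zone_a: 128, 0, 10, 0, 900000, 0
zone_b: 256, 0, 5, 5, 50, 0
"""

changed = diff_zones(parse_vmstat_z(sample_good), parse_vmstat_z(sample_bad))
for zone, (before, after) in changed.items():
    print(zone, before, "->", after)
```

In practice the two inputs would be files captured with 'vmstat -z' 
during normal operation and during a slowdown.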

With best regards,
Alexey Popov