From: Alexey Popov <lol@chistydom.ru>
Date: Mon, 15 Oct 2007 18:46:14 +0400
To: freebsd-stable@freebsd.org, freebsd-hackers@freebsd.org
Subject: amrd disk performance drop after running under high load

Hi.

I have 3 Dell 2850 servers with DELL PERC4 SCSI RAID5 (6x300GB), running lighttpd and serving flash video at around 200 Mbit/s.

%grep amr /var/run/dmesg.boot
amr0: mem 0xf80f0000-0xf80fffff,0xfe9c0000-0xfe9fffff irq 46 at device 14.0 on pci2
amr0: Using 64-bit DMA
amr0: delete logical drives supported by controller
amr0: Firmware 521X, BIOS H430, 256MB RAM
amr0: delete logical drives supported by controller
amrd0: on amr0
amrd0: 1430400MB (2929459200 sectors) RAID 5 (optimal)
Trying to mount root from ufs:/dev/amrd0s1a

%uname -a
FreeBSD ???.ru 6.2-STABLE FreeBSD 6.2-STABLE #2: Mon Oct 8 16:25:20 MSD 2007 llp@???.ru:/usr/obj/usr/src/sys/SMP-amd64-HWPMC amd64
%

After running under high load for some time, disk performance becomes extremely poor. During those periods 'systat -vm 1' shows something like this:

Disks   amrd0
KB/t    85.39
tps         5
MB/s     0.38
% busy     99

That is, the disk is 100% busy at just 2-10 tps. There is nothing suspicious in /var/log/messages, 'netstat -m', 'vmstat -z' or anywhere else. This lasts for 15-30 minutes or so, then everything becomes fine again; after another 10-12 hours it repeats.
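To catch the next bad period without sitting at the console, something along these lines can be left running (a sketch only, plain base-system tools; the iostat options are quoted from memory and the log file name is just an example):

# log a timestamp plus one-second disk stats for amrd0 every 10 seconds;
# the second iostat sample covers just the last second, the first one is
# the since-boot average and can be ignored
while :; do
    date
    iostat -w 1 -c 2 amrd0
    sleep 10
done >> /var/log/amrd0-io.log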
On top of that, I tried mutex profiling; here are the results, sorted by the total number of acquisitions (the count column):

Bad case:

 max    total   count  avg  cnt_hold  cnt_lock  name
 102   223514  273977    0     14689   1651568  /usr/src/sys/vm/uma_core.c:2349 (512)
 950   263099  273968    0     15004     14427  /usr/src/sys/vm/uma_core.c:2450 (512)
 108   150422  175840    0     10978  22988519  /usr/src/sys/vm/uma_core.c:1888 (mbuf)
 352   160635  173663    0     10896      9678  /usr/src/sys/vm/uma_core.c:2209 (mbuf)
 110   134910  173575    0     10838      9464  /usr/src/sys/vm/uma_core.c:2104 (mbuf)
1104  1335319  106888   12        27      1259  /usr/src/sys/netinet/tcp_output.c:253 (so_snd)
 171    77754   97685    0       176       207  /usr/src/sys/net/pfil.c:71 (pfil_head_mtx)
 140    77104   97685    0       151       128  /usr/src/sys/netinet/ip_fw2.c:164 (IPFW static rules)
 100    76543   97685    0       146     45450  /usr/src/sys/netinet/ip_fw2.c:156 (IPFW static rules)
  82    77149   97685    0       243    141221  /usr/src/sys/net/pfil.c:63 (pfil_head_mtx)
1644   914481   97679    9       739    949977  /usr/src/sys/contrib/ipfilter/netinet/fil.c:2320 (ipf filter load/unload mutex)
1642   556643   97679    5         0         0  /usr/src/sys/contrib/ipfilter/netinet/fil.c:2455 (ipf filter rwlock)
 107    89413   97679    0         0         0  /usr/src/sys/contrib/ipfilter/netinet/fil.c:2142 (ipf cache rwlock)
 907   148940   81439    1         3      7447  /usr/src/sys/kern/kern_lock.c:168 (lockbuilder mtxpool)
1764   152282   63435    2       438    336763  /usr/src/sys/net/route.c:197 (rtentry)

And in the good case:

 max    total   count  avg  cnt_hold  cnt_lock  name
1738   821795  553033    1        41       284  /usr/src/sys/netinet/tcp_output.c:253 (so_snd)
2770   983643  490815    2         6        54  /usr/src/sys/kern/kern_lock.c:168 (lockbuilder mtxpool)
 106   430941  477500    0      5555      4507  /usr/src/sys/netinet/ip_fw2.c:164 (IPFW static rules)
  95   423926  477500    0      4412      5645  /usr/src/sys/netinet/ip_fw2.c:156 (IPFW static rules)
  94   427239  477500    0      6323      7453  /usr/src/sys/net/pfil.c:63 (pfil_head_mtx)
  82   432359  477500    0      5244      5768  /usr/src/sys/net/pfil.c:71 (pfil_head_mtx)
 296  4751550  477498    9     20837     23019  /usr/src/sys/contrib/ipfilter/netinet/fil.c:2320 (ipf filter load/unload mutex)
  85  2913118  477498    6         0         0  /usr/src/sys/contrib/ipfilter/netinet/fil.c:2455 (ipf filter rwlock)
  55   473891  477498    0         0         0  /usr/src/sys/contrib/ipfilter/netinet/fil.c:2142 (ipf cache rwlock)
  59   291035  309222    0         0         0  /usr/src/sys/contrib/ipfilter/netinet/fil.c:2169 (ipf cache rwlock)
1627   697811  305094    2      2161      2535  /usr/src/sys/net/route.c:147 (radix node head)
 232   804172  305094    2     12193      6500  /usr/src/sys/net/route.c:197 (rtentry)
 148   892580  303518    2       594       649  /usr/src/sys/net/route.c:1281 (rtentry)
 145   584970  303518    1     13479     11148  /usr/src/sys/net/route.c:1265 (rtentry)
 121   282669  303518    0      3529       886  /usr/src/sys/net/if_ethersubr.c:409 (em0)

Here you can see that heavy UMA activity coincides with the periods of low disk performance. But I am not sure whether this is the root of the problem or just a consequence.

I have similar servers around doing the same thing, and they work fine. I also had the same problem a year ago with another project; at that time nothing helped and I had to install Linux.

I can provide additional information about this server if needed. What else can I try to solve the problem?

With best regards,
Alexey Popov
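P.S. In case someone wants to compare numbers on their own box: the stats above were gathered roughly as sketched below, with MUTEX_PROFILING(9) compiled into the kernel. The sysctl names are quoted from memory, so please check them against the man page on your system.

# collect mutex profiling over a "good" or a "bad" period and dump the
# per-lock counters; count (3rd column) is the number of acquisitions
sysctl debug.mutex.prof.reset=1     # clear counters left from the previous run
sysctl debug.mutex.prof.enable=1    # start collecting
sleep 600                           # cover ~10 minutes of the period of interest
sysctl debug.mutex.prof.enable=0    # stop collecting
sysctl -n debug.mutex.prof.stats | sort -rn -k 3 | head -20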