From owner-freebsd-performance@FreeBSD.ORG  Wed Jan 30 23:54:19 2008
Message-Id: <200801302307.XAA22476@sopwith.solgatos.com>
To: "Steven Hartland" <killing@multiplay.co.uk>
In-reply-to: Your message of "Wed, 30 Jan 2008 21:42:16 GMT."
	<008201c86388$fd159010$b6db87d4@multiplay.co.uk> 
Date: Wed, 30 Jan 2008 15:07:23 +0000
From: Dieter <freebsd@sopwith.solgatos.com>
Cc: freebsd-performance@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: newfs locks entire machine for 20seconds 

In message <008201c86388$fd159010$b6db87d4@multiplay.co.uk>, "Steven Hartland" writes:

> From: "Ivan Voras" <ivoras@freebsd.org>
> >> The machine is running with ULE on 7.0, as mentioned, using an Areca 1220
> >> controller over 8 disks in RAID 6 + Hotspare.
> > 
> > I'd suggest you first try to reproduce the stall without ULE, while
> > keeping all other parameters exactly the same.
> 
> Ok, tried with an updated 7 world / kernel as of this afternoon and with 4BSD
> instead of ULE, and no difference: the machine still locks up with no activity
> for anywhere from 20 to 30 seconds.
> 
> Here's a snapshot from top under cpu and io modes when the stall has occurred:
> [top]
> last pid:  1102;  load averages:  0.02,  0.08,  0.07                                           up 0+00:09:37  21:39:13
> 162 processes: 4 running, 145 sleeping, 13 waiting
> CPU states:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
> Mem: 60M Active, 19M Inact, 54M Wired, 56K Cache, 27M Buf, 3809M Free
> Swap: 4096M Total, 4096M Free
> 
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
>    12 root        1 171 ki31     0K    16K RUN    0   8:59 97.90% idle: cpu0
>    11 root        1 171 ki31     0K    16K RUN    1   8:57 95.80% idle: cpu1
>  1102 root        1  -8    0  4752K  1256K physrd 1   0:01 19.64% newfs
>     4 root        1  -8    -     0K    16K -      0   0:00  0.10% g_down
>  1048 root        1  96    0  7656K  2544K CPU0   0   0:01  0.00% top
>  1054 root        1  96    0  7656K  2348K CPU1   1   0:01  0.00% top
>   863 root        1  96    0   131M 15768K select 0   0:00  0.00% httpd
>  1055 root        1  96    0 32928K  4656K select 0   0:00  0.00% sshd
> 
> 
> last pid:  1102;  load averages:  0.02,  0.08,  0.07                                           up 0+00:09:37  21:39:13
> 162 processes: 4 running, 145 sleeping, 13 waiting
> CPU states:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
> Mem: 60M Active, 19M Inact, 54M Wired, 56K Cache, 27M Buf, 3809M Free
> Swap: 4096M Total, 4096M Free
> 
>   PID USERNAME   VCSW  IVCSW   READ  WRITE  FAULT  TOTAL PERCENT COMMAND
>    12 root          9    154      0      0      0      0   0.00% idle: cpu0
>    11 root         28      5      0      0      0      0   0.00% idle: cpu1
>  1102 root          5      0      0      0      0      0   0.00% newfs
>     4 root         14      0      0      0      0      0   0.00% g_down
>  1048 root          1      0      0      0      0      0   0.00% top
>  1054 root          1      0      0      0      0      0   0.00% top
>   863 root          1      0      0      0      0      0   0.00% httpd
> [/top]

What *exactly* do you mean by

> machine still locks up with no activity for anywhere from 20 to 30 seconds.

Is there disk activity? (e.g. activity light(s) flashing if you have them)

Does top continue to update the screen during the 20-30 seconds?

I'm thinking that newfs has queued up a bunch of disk i/o, and other
disk i/o gets locked out, but activities that don't require any disk i/o
(like top, once it is up and running) could continue.  Is that what is
happening?
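
One rough way to check this is a small probe run alongside newfs. What
follows is only a sketch, not a tested diagnostic: the names probe,
timed_sync_write and io_probe.tmp are made up for illustration, and it
assumes you point it at a directory on a filesystem behind the same
controller. Each sample records how late the process woke from a sleep
(no disk i/o needed) and how long a small fsync'd write took (disk i/o
needed). If the wake-up delay stays near zero while the write latency
jumps to many seconds, that would fit the "other disk i/o gets locked
out" theory.

```python
import os
import time


def timed_sync_write(path, payload=b"x" * 4096):
    """Write payload to path, fsync it, and return the elapsed seconds."""
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # push the data to the device, not just the buffer cache
    finally:
        os.close(fd)
    return time.monotonic() - start


def probe(directory, samples=30, interval=1.0):
    """Return a list of (wake_delay, write_latency) pairs in seconds.

    wake_delay    : how much later than `interval` the sleep returned;
                    this part needs no disk i/o.
    write_latency : time for one small synchronous write in `directory`;
                    this part does need disk i/o.
    """
    path = os.path.join(directory, "io_probe.tmp")  # hypothetical scratch file
    results = []
    try:
        for _ in range(samples):
            before = time.monotonic()
            time.sleep(interval)
            wake_delay = time.monotonic() - before - interval
            write_latency = timed_sync_write(path)
            results.append((wake_delay, write_latency))
    finally:
        # Remove the scratch file even if a write failed part way through.
        if os.path.exists(path):
            os.remove(path)
    return results
```

Calling probe() on such a directory while newfs is running, and printing
the pairs, should show whether the stall is confined to disk i/o.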