From owner-freebsd-current@FreeBSD.ORG  Thu Jul  7 12:21:16 2011
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0D9AC1065672
	for <freebsd-current@freebsd.org>; Thu,  7 Jul 2011 12:21:16 +0000 (UTC)
	(envelope-from freebsd-current@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 912E38FC12
	for <freebsd-current@freebsd.org>; Thu,  7 Jul 2011 12:21:15 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-current@m.gmane.org>) id 1QenZx-0005eO-52
	for freebsd-current@freebsd.org; Thu, 07 Jul 2011 14:21:13 +0200
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-current@freebsd.org>; Thu, 07 Jul 2011 14:21:13 +0200
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-current@freebsd.org>; Thu, 07 Jul 2011 14:21:13 +0200
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-current@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Thu, 07 Jul 2011 14:20:59 +0200
Lines: 34
Message-ID: <iv48bc$nab$1@dough.gmane.org>
References: <20110706170132.GA68775@troutmask.apl.washington.edu>
	<5080.1309971941@critter.freebsd.dk>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
In-Reply-To: <5080.1309971941@critter.freebsd.dk>
X-Enigmail-Version: 1.1.2
Cc: freebsd-questions@freebsd.org
Subject: Re: Heavy I/O blocks FreeBSD box for several seconds
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Jul 2011 12:21:16 -0000

On 06/07/2011 19:05, Poul-Henning Kamp wrote:
> In message <20110706170132.GA68775@troutmask.apl.washington.edu>, Steve Kargl writes:
>
>> I periodically ran the same type test in the 2008 post over the
>> last three years.  Nothing has changed.  I even set up an account
>> on one node in my cluster for jeffr to use.  He was too busy to
>> investigate at that time.
>
> Isn't this just the lemming-syncer hurling every dirty block over
> the cliff at the same time ?

There have been occasional reports of "something" (tm) that causes 
CPU-bound processes to stall / starve when heavy file system IO is 
present. I think I have noticed this too, but it was never serious 
enough to pursue: the only symptom was X11 lagging.

The problem is - all this is sporadic and thus anecdotal.

AFAIK, the "lemming-syncer" behaviour shouldn't stall anything if it's 
the only thing which is "wrong", right? I know of one issue which might 
appear to stall all IO: since there is only one IO queue, if it fills 
with requests which take a long time, all other IO is blocked behind 
them. For example, doing simultaneous writes to a slow USB flash stick 
and to a hard drive will soon result in the queue being filled with slow 
USB requests, which by the nature of the queue "push out" the fast disk 
requests and make the drive look very slow (this is most noticeable with 
a large vfs.hirunningspace). But this doesn't seem to directly correlate 
with the OP's problem.
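
To make the queue argument concrete, here is a toy simulation of it. It 
is only a sketch and not a model of the actual kernel code: the slot 
count, the service times and the name simulate() are all made up for 
illustration, and each "device" simply works on one request at a time 
while the writers refill any free slot in turn.

```python
def simulate(slots, steps, writers=("slow", "fast"),
             slow_service=10, fast_service=1):
    """Toy model of writers sharing a fixed number of in-flight slots.

    Hypothetical parameters: `slots` is the shared queue size and
    `*_service` is how many ticks a device needs per request.
    Returns the number of completed requests per device.
    """
    service = {"slow": slow_service, "fast": fast_service}
    in_flight = {dev: 0 for dev in writers}
    remaining = {dev: 0 for dev in writers}  # ticks left on current request
    done = {dev: 0 for dev in writers}
    turn = 0

    for _ in range(steps):
        # Each device makes one tick of progress on its current request.
        for dev in writers:
            if in_flight[dev] > 0:
                remaining[dev] -= 1
                if remaining[dev] == 0:
                    in_flight[dev] -= 1
                    done[dev] += 1
                    if in_flight[dev] > 0:
                        remaining[dev] = service[dev]

        # Writers always have more data, so free slots are refilled
        # immediately, alternating between the writers.
        while sum(in_flight.values()) < slots:
            dev = writers[turn % len(writers)]
            turn += 1
            if in_flight[dev] == 0:
                remaining[dev] = service[dev]
            in_flight[dev] += 1

    return done


if __name__ == "__main__":
    print("fast alone: ", simulate(slots=8, steps=1000, writers=("fast",)))
    print("shared queue:", simulate(slots=8, steps=1000))
```

In this model the slots fill up with slow requests almost immediately, 
and from then on the fast device only gets a slot when a slow request 
completes, so it finishes far fewer requests than when it has the queue 
to itself.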

Maybe this particular problem can be tested by having two drives - one 
to provoke this kind of stalling, and one to test if any IO can be done 
on it while the stall happens on the first one.
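
A minimal sketch of such a probe, assuming the second drive is mounted 
somewhere writable (the path is whatever you pass in; the function name 
probe_write_latency() and its defaults are my own invention). It writes 
small blocks, fsyncs each one, and records how long every write+fsync 
took, so a stall shows up as an outlier in the returned list.

```python
import os
import time


def probe_write_latency(path, count=100, size=4096, interval=0.0):
    """Write `count` blocks of `size` bytes to `path`, fsync after each,
    and return the per-write latencies in seconds."""
    block = b"\0" * size
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(count):
            start = time.monotonic()
            os.write(fd, block)
            os.fsync(fd)  # force the block out to the device
            latencies.append(time.monotonic() - start)
            if interval:
                time.sleep(interval)
    finally:
        os.close(fd)
    return latencies
```

The idea would be to start the heavy IO on the first drive, run the 
probe against a file on the second drive at the same time, and compare 
max(latencies) with a run on an otherwise idle machine.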