From owner-freebsd-current@FreeBSD.ORG  Mon Jan  2 20:47:17 2012
Message-Id: <201201022047.q02Kl3IM005792@gw.catspoiler.org>
Date: Mon, 2 Jan 2012 12:47:03 -0800 (PST)
From: Don Lewis <truckman@FreeBSD.org>
To: flo@FreeBSD.org
In-Reply-To: <4F01F8FD.4020901@FreeBSD.org>
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Cc: attilio@FreeBSD.org, current@FreeBSD.org, mckusick@mckusick.com,
	phk@phk.freebsd.dk, kib@FreeBSD.org, seanbru@yahoo-inc.com
Subject: Re: dogfooding over in clusteradm land

On  2 Jan, Florian Smeets wrote:
> On 29.12.11 01:04, Kirk McKusick wrote:
>> Rather than changing BKVASIZE, I would try running the cvs2svn
>> conversion on a 16K/2K filesystem and see if that sorts out the
>> problem. If it does, it tells us that doubling the main block
>> size and reducing the number of buffers by half is the problem.
>> If that is the problem, then we will have to increase the KVM
>> allocated to the buffer cache.
>> 
> 
> This does not make a difference. I tried on 32K/4K with/without journal
> and on 16K/2K; all exhibit the same problem. At some point during the
> cvs2svn conversion the syncer starts to use 100% CPU. The whole process
> hangs at that point, sometimes for hours. From time to time it does
> continue doing some work, but really slowly. It's usually between
> revision 210000 and 220000, when the resulting svn file gets bigger than
> about 11-12GB. At that point an ls in the target dir hangs in state ufs.
> 
> I broke into ddb and ran all commands which I thought could be useful.
> The output is at http://tb.smeets.im/~flo/giant-ape_syncer.txt

Tracing command syncer pid 9 tid 100183 td 0xfffffe00120e9000
cpustop_handler() at cpustop_handler+0x2b
ipi_nmi_handler() at ipi_nmi_handler+0x50
trap() at trap+0x1a8
nmi_calltrap() at nmi_calltrap+0x8
--- trap 0x13, rip = 0xffffffff8082ba43, rsp = 0xffffff8000270fe0, rbp = 0xffffff88c97829a0 ---
_mtx_assert() at _mtx_assert+0x13
pmap_remove_write() at pmap_remove_write+0x38
vm_object_page_remove_write() at vm_object_page_remove_write+0x1f
vm_object_page_clean() at vm_object_page_clean+0x14d
vfs_msync() at vfs_msync+0xf1
sync_fsync() at sync_fsync+0x12a
sync_vnode() at sync_vnode+0x157
sched_sync() at sched_sync+0x1d1
fork_exit() at fork_exit+0x135
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff88c9782d00, rbp = 0 ---

I think this explains why the r228838 patch seems to help the problem.
Instead of an application call to msync(), you're getting bitten by the
syncer doing the equivalent.  I don't know why the syncer is CPU bound,
though.  From my understanding of the patch it only optimizes the I/O.
Without the patch, I would expect that the syncer would just spend a lot
of time waiting on I/O.  My guess is that this is actually a vm problem.
There are nested loops in vm_object_page_clean() and
vm_object_page_remove_write(), so you could be doing something that's
causing lots of looping in that code.
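
To look at the msync() path from userland, separately from cvs2svn,
something like the following might be a starting point. It is only a
rough sketch and I have not shown that it reproduces the hang. The
function name, the page counts, and the number of rounds are made up
for illustration. It is written in Python for brevity; on Unix,
mmap.flush() is implemented with msync(). It dirties one byte in every
page of a shared file mapping and times each flush.

```python
import mmap
import os
import tempfile
import time

PAGE = mmap.PAGESIZE


def dirty_and_sync(path, npages, rounds=3):
    """Dirty every page of a file-backed mapping, flush it, return flush times.

    Hypothetical helper for illustration only; not taken from the thread.
    """
    size = npages * PAGE
    with open(path, "wb") as f:
        f.truncate(size)  # extend the file to the requested size
    timings = []
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), size) as m:
        for r in range(rounds):
            for off in range(0, size, PAGE):
                m[off] = (r + 1) & 0xFF  # one byte per page is enough to dirty it
            t0 = time.perf_counter()
            m.flush()  # msync() over the whole mapping on Unix
            timings.append(time.perf_counter() - t0)
    return timings


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Assumed sizes; a real test would need a file in the multi-GB range.
        for n in (256, 1024, 4096):
            print(n, dirty_and_sync(path, n))
    finally:
        os.unlink(path)
```

If the flush time grows much faster than the number of dirty pages,
that would point at the looping in the vm code rather than at the I/O.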

I think that ls is hanging because it's stumbling across the vnode that
the syncer has locked.