From owner-freebsd-fs@FreeBSD.ORG  Fri Jul 13 13:47:22 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id EB984106566B
	for <freebsd-fs@freebsd.org>; Fri, 13 Jul 2012 13:47:22 +0000 (UTC)
	(envelope-from freebsd@penx.com)
Received: from btw.pki2.com (btw.pki2.com [IPv6:2001:470:a:6fd::2])
	by mx1.freebsd.org (Postfix) with ESMTP id B87A38FC15
	for <freebsd-fs@freebsd.org>; Fri, 13 Jul 2012 13:47:22 +0000 (UTC)
Received: from [127.0.0.1] (localhost [127.0.0.1])
	by btw.pki2.com (8.14.5/8.14.5) with ESMTP id q6DDlIuG082016;
	Fri, 13 Jul 2012 06:47:18 -0700 (PDT)
	(envelope-from freebsd@penx.com)
From: Dennis Glatting <freebsd@penx.com>
To: Volodymyr Kostyrko <c.kworr@gmail.com>
In-Reply-To: <4FFFE82B.6010109@gmail.com>
References: <1341864787.32803.43.camel@btw.pki2.com>
	<4FFFE82B.6010109@gmail.com>
Content-Type: text/plain; charset="us-ascii"
Date: Fri, 13 Jul 2012 06:47:18 -0700
Message-ID: <1342187238.60733.27.camel@btw.pki2.com>
Mime-Version: 1.0
X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port 
Content-Transfer-Encoding: 7bit
X-yoursite-MailScanner-Information: Dennis Glatting
X-yoursite-MailScanner-ID: q6DDlIuG082016
X-yoursite-MailScanner: Found to be clean
X-MailScanner-From: freebsd@penx.com
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS hanging
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 13 Jul 2012 13:47:23 -0000

On Fri, 2012-07-13 at 12:19 +0300, Volodymyr Kostyrko wrote:
> Dennis Glatting wrote:
> > I have a ZFS array of disks where the system simply stops as if forever
> > blocked by some IO mutex. This happens often and the following is the
> > output of top:
> 
> Try switching to clang. Some time ago I was hit by different error - 
> some process hangs indefinitely and can't be killed. After building 
> system with clang I obtained a core dump at first reboot and research 
> turned out that there was some broken directory entry in file system. 
> Recreating damaged zfs filesystem (leaving all other pool intact) solved 
> my problem completely.
> 

I am using clang except on my CVS mirrors.

I found that a mirror cannot update from itself, although other hosts
can update from it. Somewhere in that M3/assembly muck something
crashes in the process. The only way around the problem is to compile
the OS using GCC.

On the system in question (iirc) I rebuilt the pool yesterday -- I'm in
the process of updating parts across my systems. I also wanted to fool
around with different ZFS architectures. This morning, after a load
average of 42 throughout the night on a 32-core system writing 4TB of
data, it is still alive and kicking, but it's early in the run.