From owner-freebsd-fs@FreeBSD.ORG Mon Aug 29 19:38:53 2011
From: Luke Marsden
Reply-To: luke@hybrid-logic.co.uk
To: freebsd-fs@freebsd.org
Cc: tech@hybrid-logic.co.uk
Date: Mon, 29 Aug 2011 20:38:48 +0100
Subject: ZFS hang in production on 8.2-RELEASE
Message-ID: <1314646728.7898.44.camel@pow>

Hi all,

I've just noticed a "partial" ZFS deadlock in production on 8.2-RELEASE:

FreeBSD XXX 8.2-RELEASE FreeBSD 8.2-RELEASE #0 r219081M: Wed Mar 2 08:29:52 CET 2011 root@www4:/usr/obj/usr/src/sys/GENERIC amd64

There are 9 'zfs rename' processes and 1 'zfs umount -f' process hung.

Here is the procstat for the 'zfs umount -f':

13451 104337 zfs - mi_switch+0x176 sleepq_wait+0x42 _sleep+0x317 zfsvfs_teardown+0x269 zfs_umount+0x1c4 dounmount+0x32a unmount+0x38b syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2

And the 'zfs rename's all look the same:

20361 101049 zfs - mi_switch+0x176 sleepq_wait+0x42 __lockmgr_args+0x743 vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 lookup+0x6e1 namei+0x53a kern_rmdirat+0xa4 syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2

An 'ls' on a directory which contains most of the system's ZFS mount-points (/hcfs) also hangs:

30073 101466 gnuls - mi_switch+0x176 sleepq_wait+0x42 __lockmgr_args+0x743 vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_root+0x85 lookup+0x9b8 namei+0x53a vn_open_cred+0x3ac kern_openat+0x181 syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2

If I truss the 'ls' it hangs on the stat syscall:

stat("/hcfs",{ mode=drwxr-xr-x ,inode=3,size=2012,blksize=16384 }) = 0 (0x0)
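(Each of the kernel stacks in this mail is the output of 'procstat -kk <pid>' against the stuck process. If it helps, a minimal sketch of collecting them all in one pass -- the pgrep names below are simply the process names shown above, so adjust as needed:

  # dump the kernel stack of every hung zfs / gnuls / find process
  for pid in $(pgrep -x zfs; pgrep -x gnuls; pgrep -x find); do
      echo "### pid $pid"
      procstat -kk "$pid"
  done
)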
There is also a 'find -s / ! ( -fstype zfs ) -prune -or -path /tmp -prune -or -path /usr/tmp -prune -or -path /var/tmp -prune -or -path /var/db/portsnap -prune -or -print' running which is also hung:

2650 101674 find - mi_switch+0x176 sleepq_wait+0x42 __lockmgr_args+0x743 vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_root+0x85 lookup+0x9b8 namei+0x53a vn_open_cred+0x3ac kern_openat+0x181 syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2

However, I/O to the presently mounted filesystems continues to work (even on parts of those filesystems which are unlikely to be cached), and 'zfs list', which reports all the filesystems (3,500 filesystems with ~100 snapshots per filesystem), also works.

Any activity on the structure of the ZFS hierarchy *under the hcfs filesystem* hangs, such as a 'zfs create hpool/hcfs/test':

70868 101874 zfs - mi_switch+0x176 sleepq_wait+0x42 __lockmgr_args+0x743 vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 lookup+0x6e1 namei+0x53a kern_mkdirat+0xce syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2

BUT "zfs create hpool/system/opt/hello" (a ZFS filesystem in the same pool, but not rooted on hpool/hcfs) does not hang, and succeeds normally.

procstat -kk on the zfskern process gives:

  PID    TID COMM             TDNAME           KSTACK
    5 100045 zfskern          arc_reclaim_thre mi_switch+0x176 sleepq_timedwait+0x42 _cv_timedwait+0x134 arc_reclaim_thread+0x2a9 fork_exit+0x118 fork_trampoline+0xe
    5 100046 zfskern          l2arc_feed_threa mi_switch+0x176 sleepq_timedwait+0x42 _cv_timedwait+0x134 l2arc_feed_thread+0x1ce fork_exit+0x118 fork_trampoline+0xe
    5 100098 zfskern          txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_thread_wait+0x79 txg_quiesce_thread+0xb5 fork_exit+0x118 fork_trampoline+0xe
    5 100099 zfskern          txg_thread_enter mi_switch+0x176 sleepq_timedwait+0x42 _cv_timedwait+0x134 txg_thread_wait+0x3c txg_sync_thread+0x365 fork_exit+0x118 fork_trampoline+0xe

Any ideas on what might be causing this?  Thank you for supporting ZFS on FreeBSD!

--
Best Regards,
Luke Marsden
CTO, Hybrid Logic Ltd.

Web: http://www.hybrid-cluster.com/
Hybrid Web Cluster - cloud web hosting