From: Martin Matuska <mm@FreeBSD.org>
Date: Mon, 29 Aug 2011 22:54:07 +0200
To: Artem Belevich
Cc: freebsd-fs@freebsd.org, tech@hybrid-logic.co.uk, luke@hybrid-logic.co.uk
Subject: Re: ZFS hang in production on 8.2-RELEASE
Message-ID: <4E5BFC6F.5080507@FreeBSD.org>

On 29. 8. 2011 21:55, Artem Belevich wrote:
> On Mon, Aug 29, 2011 at 12:38 PM, Luke Marsden wrote:
>> Hi all,
>>
>> I've just noticed a "partial" ZFS deadlock in production on 8.2-RELEASE.
>>
>> FreeBSD XXX 8.2-RELEASE FreeBSD 8.2-RELEASE #0 r219081M: Wed Mar 2
>> 08:29:52 CET 2011 root@www4:/usr/obj/usr/src/sys/GENERIC amd64
>>
>> There are 9 'zfs rename' processes and 1 'zfs umount -f' process hung.
>> Here is the procstat for the 'zfs umount -f':
>>
>> 13451 104337 zfs - mi_switch+0x176 sleepq_wait+0x42 _sleep+0x317
>> zfsvfs_teardown+0x269 zfs_umount+0x1c4 dounmount+0x32a unmount+0x38b
>> syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2
>>
>> And the 'zfs rename's all look the same:
>>
>> 20361 101049 zfs - mi_switch+0x176 sleepq_wait+0x42 __lockmgr_args+0x743
>> vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 lookup+0x6e1
>> namei+0x53a kern_rmdirat+0xa4 syscallenter+0x1e5 syscall+0x4b
>> Xfast_syscall+0xe2
>>
>> An 'ls' on a directory which contains most of the system's ZFS
>> mount-points (/hcfs) also hangs:
>>
>> 30073 101466 gnuls - mi_switch+0x176 sleepq_wait+0x42 __lockmgr_args+0x743
>> vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_root+0x85
>> lookup+0x9b8 namei+0x53a vn_open_cred+0x3ac kern_openat+0x181
>> syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2
>>
>> If I truss the 'ls' it hangs on the stat syscall:
>> stat("/hcfs",{ mode=drwxr-xr-x ,inode=3,size=2012,blksize=16384 }) = 0 (0x0)
>>
>> There is also a 'find -s / !
>> ( -fstype zfs ) -prune -or -path /tmp -prune -or -path /usr/tmp -prune
>> -or -path /var/tmp -prune -or -path /var/db/portsnap -prune -or -print'
>> running which is also hung:
>>
>> 2650 101674 find - mi_switch+0x176 sleepq_wait+0x42 __lockmgr_args+0x743
>> vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_root+0x85
>> lookup+0x9b8 namei+0x53a vn_open_cred+0x3ac kern_openat+0x181
>> syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2
>>
>> However, I/O to the presently mounted filesystems continues to work (even
>> on parts of filesystems which are unlikely to be cached), and 'zfs list'
>> showing all the filesystems (3,500 filesystems with ~100 snapshots per
>> filesystem) also works.
>>
>> Any activity on the structure of the ZFS hierarchy *under the hcfs
>> filesystem* hangs, such as a 'zfs create hpool/hcfs/test':
>>
>> 70868 101874 zfs - mi_switch+0x176 sleepq_wait+0x42 __lockmgr_args+0x743
>> vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 lookup+0x6e1
>> namei+0x53a kern_mkdirat+0xce syscallenter+0x1e5 syscall+0x4b
>> Xfast_syscall+0xe2
>>
>> BUT "zfs create hpool/system/opt/hello" (a ZFS filesystem in the same
>> pool, but not rooted on hpool/hcfs) does not hang, and succeeds
>> normally.
>>
>> procstat -kk on the zfskern process gives:
>>
>>   PID    TID COMM             TDNAME           KSTACK
>>     5 100045 zfskern          arc_reclaim_thre mi_switch+0x176
>> sleepq_timedwait+0x42 _cv_timedwait+0x134 arc_reclaim_thread+0x2a9
>> fork_exit+0x118 fork_trampoline+0xe
>>     5 100046 zfskern          l2arc_feed_threa mi_switch+0x176
>> sleepq_timedwait+0x42 _cv_timedwait+0x134 l2arc_feed_thread+0x1ce
>> fork_exit+0x118 fork_trampoline+0xe
>>     5 100098 zfskern          txg_thread_enter mi_switch+0x176
>> sleepq_wait+0x42 _cv_wait+0x129 txg_thread_wait+0x79
>> txg_quiesce_thread+0xb5 fork_exit+0x118 fork_trampoline+0xe
>>     5 100099 zfskern          txg_thread_enter mi_switch+0x176
>> sleepq_timedwait+0x42 _cv_timedwait+0x134 txg_thread_wait+0x3c
>> txg_sync_thread+0x365 fork_exit+0x118 fork_trampoline+0xe
>>
>> Any ideas on what might be causing this?
>
> It sounds like the bug Martin Matuska has recently fixed in FreeBSD
> and reported upstream to Illumos:
> https://www.illumos.org/issues/1313
>
> The fix has been MFC'ed to 8-STABLE as r224647 on Aug 4th.
>
> --Artem

No, I think this is more likely fixed by pjd's bugfix in r224791 (MFC'ed
to stable/8 as r225100). The corresponding patch is:
http://people.freebsd.org/~pjd/patches/zfsdev_state_lock.patch

-- 
Martin Matuska
FreeBSD committer
http://blog.vx.sk
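
For readers hitting the same hang, here is a minimal sketch of how the fix
referenced above might be brought onto an affected machine. It is not part of
the original thread: it assumes a stable/8 source tree checked out from
Subversion in /usr/src and the GENERIC kernel configuration, neither of which
is stated in the messages; the patch URL is the one given by Martin above.

    # Option A: update a stable/8 svn checkout to (or past) the MFC
    # revision r225100 mentioned in the reply, then rebuild the kernel.
    cd /usr/src
    svn update -r225100
    make buildkernel installkernel KERNCONF=GENERIC
    shutdown -r now

    # Option B: apply pjd's standalone patch to an existing source tree.
    # The -p strip level depends on how the paths in the diff are rooted.
    cd /usr/src
    fetch http://people.freebsd.org/~pjd/patches/zfsdev_state_lock.patch
    patch -p0 < zfsdev_state_lock.patch
    make buildkernel installkernel KERNCONF=GENERIC
    shutdown -r now

Either way, a reboot into the rebuilt kernel is required before the change
takes effect; the hung processes described above cannot be cleared without it.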