From: Artem Belevich
To: luke@hybrid-logic.co.uk
Cc: freebsd-fs@freebsd.org, tech@hybrid-logic.co.uk
Date: Mon, 29 Aug 2011 12:55:26 -0700
Subject: Re: ZFS hang in production on 8.2-RELEASE

On Mon, Aug 29, 2011 at 12:38 PM, Luke Marsden wrote:
> Hi all,
>
> I've
> just noticed a "partial" ZFS deadlock in production on 8.2-RELEASE.
>
> FreeBSD XXX 8.2-RELEASE FreeBSD 8.2-RELEASE #0 r219081M: Wed Mar  2
> 08:29:52 CET 2011     root@www4:/usr/obj/usr/src/sys/GENERIC  amd64
>
> There are 9 'zfs rename' processes and 1 'zfs umount -f' process hung.
> Here is the procstat for the 'zfs umount -f':
>
> 13451 104337 zfs              -                mi_switch+0x176
> sleepq_wait+0x42 _sleep+0x317 zfsvfs_teardown+0x269 zfs_umount+0x1c4
> dounmount+0x32a unmount+0x38b syscallenter+0x1e5 syscall+0x4b
> Xfast_syscall+0xe2
>
> And the 'zfs rename's all look the same:
>
> 20361 101049 zfs              -                mi_switch+0x176
> sleepq_wait+0x42 __lockmgr_args+0x743 vop_stdlock+0x39
> VOP_LOCK1_APV+0x46 _vn_lock+0x47 lookup+0x6e1 namei+0x53a
> kern_rmdirat+0xa4 syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2
>
> An 'ls' on a directory which contains most of the system's ZFS
> mount-points (/hcfs) also hangs:
>
> 30073 101466 gnuls            -                mi_switch+0x176
> sleepq_wait+0x42 __lockmgr_args+0x743 vop_stdlock+0x39
> VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_root+0x85 lookup+0x9b8
> namei+0x53a vn_open_cred+0x3ac kern_openat+0x181 syscallenter+0x1e5
> syscall+0x4b Xfast_syscall+0xe2
>
> If I truss the 'ls' it hangs on the stat syscall:
> stat("/hcfs",{ mode=drwxr-xr-x ,inode=3,size=2012,blksize=16384 }) = 0 (0x0)
>
> There is also a 'find -s / !
> ( -fstype zfs ) -prune -or -path /tmp
> -prune -or -path /usr/tmp -prune -or -path /var/tmp -prune -or
> -path /var/db/portsnap -prune -or -print' running which is also hung:
>
>  2650 101674 find             -                mi_switch+0x176
> sleepq_wait+0x42 __lockmgr_args+0x743 vop_stdlock+0x39
> VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_root+0x85 lookup+0x9b8
> namei+0x53a vn_open_cred+0x3ac kern_openat+0x181 syscallenter+0x1e5
> syscall+0x4b Xfast_syscall+0xe2
>
> However I/O to the presently mounted filesystems continues to work (even
> on parts of filesystems which are unlikely to be cached), and 'zfs list'
> showing all the filesystems (3,500 filesystems with ~100 snapshots per
> filesystem) also works.
>
> Any activity on the structure of the ZFS hierarchy *under the hcfs
> filesystem* hangs, such as a 'zfs create hpool/hcfs/test':
>
> 70868 101874 zfs              -                mi_switch+0x176
> sleepq_wait+0x42 __lockmgr_args+0x743 vop_stdlock+0x39
> VOP_LOCK1_APV+0x46 _vn_lock+0x47 lookup+0x6e1 namei+0x53a
> kern_mkdirat+0xce syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2
>
> BUT "zfs create hpool/system/opt/hello" (a ZFS filesystem in the same
> pool, but not rooted on hpool/hcfs) does not hang, and succeeds
> normally.
>
> procstat -kk on the zfskern process gives:
>
>   PID    TID COMM             TDNAME           KSTACK
>     5 100045 zfskern          arc_reclaim_thre mi_switch+0x176
> sleepq_timedwait+0x42 _cv_timedwait+0x134 arc_reclaim_thread+0x2a9
> fork_exit+0x118 fork_trampoline+0xe
>     5 100046 zfskern          l2arc_feed_threa mi_switch+0x176
> sleepq_timedwait+0x42 _cv_timedwait+0x134 l2arc_feed_thread+0x1ce
> fork_exit+0x118 fork_trampoline+0xe
>     5 100098 zfskern          txg_thread_enter mi_switch+0x176
> sleepq_wait+0x42 _cv_wait+0x129 txg_thread_wait+0x79
> txg_quiesce_thread+0xb5 fork_exit+0x118 fork_trampoline+0xe
>     5 100099 zfskern          txg_thread_enter mi_switch+0x176
> sleepq_timedwait+0x42 _cv_timedwait+0x134 txg_thread_wait+0x3c
> txg_sync_thread+0x365 fork_exit+0x118 fork_trampoline+0xe
>
> Any ideas on what might be causing this?

It sounds like the bug Martin Matuska has recently fixed in FreeBSD
and reported upstream to Illumos:
https://www.illumos.org/issues/1313

The fix has been MFC'ed to 8-STABLE r224647 on Aug 4th.

--Artem

>
> Thank you for supporting ZFS on FreeBSD!
>
> --
> Best Regards,
> Luke Marsden
> CTO, Hybrid Logic Ltd.
>
> Web: http://www.hybrid-cluster.com/
> Hybrid Web Cluster - cloud web hosting
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>