From: John Baldwin
To: freebsd-fs@freebsd.org
Subject: Re: Process enters unkillable state and somewhat wedges zfs
Date: Wed, 27 Aug 2014 16:39:09 -0400
Message-Id: <201408271639.09352.jhb@freebsd.org>
In-Reply-To: <53FB716D.1060007@caida.org>
References: <53F25402.1020907@caida.org> <53F4E3C0.6000406@FreeBSD.org> <53FB716D.1060007@caida.org>

On Monday, August 25, 2014 1:25:01 pm Daniel Andersen wrote:
> On 08/20/2014 11:06 AM, Bryan Drewery wrote:
> > On 8/18/2014 2:29 PM, Daniel Andersen wrote:
> >> We are currently experiencing a strange problem that sort of locks up one of our zfs pools. This is on a
> >> FreeBSD 10 machine. Let me give a rough layout of our system to better describe what is happening:
> >
> > When it happens, get the output of 'procstat -kka | grep zfs' please.
>
> I'd read something about that while attempting to debug the last few times, so I happen to have that output, I believe.
> Here are the last two examples (the latter having had more wedged processes):
>
>     0 100687 kernel   zfs_vn_rele_task mi_switch+0xde sleepq_wait+0x3a _sleep+0x26f taskqueue_thread_loop+0xd5 fork_exit+0x9a fork_trampoline+0xe
>     0 101055 kernel   zfs_vn_rele_task mi_switch+0xde sleepq_wait+0x3a _sleep+0x26f taskqueue_thread_loop+0xd5 fork_exit+0x9a fork_trampoline+0xe
>    38 100306 zfskern  arc_reclaim_thre mi_switch+0xde sleepq_timedwait+0x3a _cv_timedwait_sbt+0x18d arc_reclaim_thread+0x302 fork_exit+0x9a fork_trampoline+0xe
>    38 100307 zfskern  l2arc_feed_threa mi_switch+0xde sleepq_timedwait+0x3a _cv_timedwait_sbt+0x18d l2arc_feed_thread+0xb29 fork_exit+0x9a fork_trampoline+0xe
>    38 100672 zfskern  trim tank        mi_switch+0xde sleepq_timedwait+0x3a _cv_timedwait_sbt+0x18d trim_thread+0x9b fork_exit+0x9a fork_trampoline+0xe
>    38 100688 zfskern  txg_thread_enter mi_switch+0xde sleepq_wait+0x3a _cv_wait+0x16d txg_quiesce_thread+0x30b fork_exit+0x9a fork_trampoline+0xe
>    38 100689 zfskern  txg_thread_enter mi_switch+0xde sleepq_timedwait+0x3a _cv_timedwait_sbt+0x18d txg_sync_thread+0x1dd fork_exit+0x9a fork_trampoline+0xe
>    38 101054 zfskern  trim work        mi_switch+0xde sleepq_timedwait+0x3a _cv_timedwait_sbt+0x18d trim_thread+0x9b fork_exit+0x9a fork_trampoline+0xe
>    38 101056 zfskern  txg_thread_enter mi_switch+0xde sleepq_wait+0x3a _cv_wait+0x16d txg_quiesce_thread+0x30b fork_exit+0x9a fork_trampoline+0xe
>    38 101057 zfskern  txg_thread_enter mi_switch+0xde sleepq_timedwait+0x3a _cv_timedwait_sbt+0x18d txg_sync_thread+0x1dd fork_exit+0x9a fork_trampoline+0xe
> 16656 101610 ls       -                mi_switch+0xde sleepq_wait+0x3a sleeplk+0x11c __lockmgr_args+0x950 vop_stdlock+0x3c VOP_LOCK1_APV+0x9d _vn_lock+0x43 zfs_root+0x8f lookup+0x7f0 namei+0x504 kern_statat_vnhook+0xa5 sys_lstat+0x30 amd64_syscall+0x357 Xfast_syscall+0xfb
>
> and
>
>     0 100688 kernel   zfs_vn_rele_task mi_switch+0xde sleepq_wait+0x3a _sleep+0x26f taskqueue_thread_loop+0xd5 fork_exit+0x9a fork_trampoline+0xe
>     0 101056 kernel   zfs_vn_rele_task mi_switch+0xde sleepq_wait+0x3a _sleep+0x26f taskqueue_thread_loop+0xd5 fork_exit+0x9a fork_trampoline+0xe
>    38 100306 zfskern  arc_reclaim_thre mi_switch+0xde sleepq_timedwait+0x3a _cv_timedwait_sbt+0x18d arc_reclaim_thread+0x302 fork_exit+0x9a fork_trampoline+0xe
>    38 100307 zfskern  l2arc_feed_threa mi_switch+0xde sleepq_timedwait+0x3a _cv_timedwait_sbt+0x18d l2arc_feed_thread+0xb29 fork_exit+0x9a fork_trampoline+0xe
>    38 100673 zfskern  trim tank        mi_switch+0xde sleepq_timedwait+0x3a _cv_timedwait_sbt+0x18d trim_thread+0x9b fork_exit+0x9a fork_trampoline+0xe
>    38 100689 zfskern  txg_thread_enter mi_switch+0xde sleepq_wait+0x3a _cv_wait+0x16d txg_quiesce_thread+0x30b fork_exit+0x9a fork_trampoline+0xe
>    38 100690 zfskern  txg_thread_enter mi_switch+0xde sleepq_wait+0x3a _cv_wait+0x16d zio_wait+0x5b vdev_uberblock_sync_list+0xad vdev_config_sync+0x118 spa_sync+0x827 txg_sync_thread+0x375 fork_exit+0x9a fork_trampoline+0xe
>    38 101055 zfskern  trim work        mi_switch+0xde sleepq_timedwait+0x3a _cv_timedwait_sbt+0x18d trim_thread+0x9b fork_exit+0x9a fork_trampoline+0xe
>    38 101057 zfskern  txg_thread_enter mi_switch+0xde sleepq_wait+0x3a _cv_wait+0x16d txg_quiesce_thread+0x30b fork_exit+0x9a fork_trampoline+0xe
>    38 101058 zfskern  txg_thread_enter mi_switch+0xde sleepq_timedwait+0x3a _cv_timedwait_sbt+0x18d txg_sync_thread+0x1dd fork_exit+0x9a fork_trampoline+0xe
>  2638 101160 nfsd     nfsd: service    mi_switch+0xde sleepq_wait+0x3a sleeplk+0x11c __lockmgr_args+0x950 vop_stdlock+0x3c
>   VOP_LOCK1_APV+0x9d _vn_lock+0x43 zfs_vget+0xfe nullfs_vget+0x54 nfsrvd_readdirplus+0x905 nfsrvd_dorpc+0x773 nfssvc_program+0x4f6 svc_run_internal+0x1f9 svc_thread_start+0xb fork_exit+0x9a fork_trampoline+0xe
>  3891 101641 collectd -                mi_switch+0xde sleepq_wait+0x3a sleeplk+0x11c __lockmgr_args+0x950 vop_stdlock+0x3c VOP_LOCK1_APV+0x9d _vn_lock+0x43 zfs_root+0x8f lookup+0x7f0 namei+0x504 kern_statfs+0x6f sys_statfs+0x2c amd64_syscall+0x357 Xfast_syscall+0xfb
> 11808 101802 sendsize -                mi_switch+0xde sleepq_wait+0x3a sleeplk+0x11c __lockmgr_args+0x950 vop_stdlock+0x3c VOP_LOCK1_APV+0x9d _vn_lock+0x43 zfs_root+0x8f lookup+0x7f0 namei+0x504 kern_statat_vnhook+0xa5 sys_stat+0x2d amd64_syscall+0x357 Xfast_syscall+0xfb
> 13900 101899 sendsize -                mi_switch+0xde sleepq_wait+0x3a sleeplk+0x11c __lockmgr_args+0x950 vop_stdlock+0x3c VOP_LOCK1_APV+0x9d _vn_lock+0x43 zfs_root+0x8f lookup+0x7f0 namei+0x504 kern_statat_vnhook+0xa5 sys_stat+0x2d amd64_syscall+0x357 Xfast_syscall+0xfb
> 17961 102536 df       -                mi_switch+0xde sleepq_wait+0x3a sleeplk+0x11c __lockmgr_args+0x950 vop_stdlock+0x3c VOP_LOCK1_APV+0x9d _vn_lock+0x43 zfs_root+0x8f lookup+0x7f0 namei+0x504 kern_statfs+0x6f sys_statfs+0x2c amd64_syscall+0x357 Xfast_syscall+0xfb
> 38964 102279 ls       -                mi_switch+0xde sleepq_wait+0x3a sleeplk+0x11c __lockmgr_args+0x950 vop_stdlock+0x3c VOP_LOCK1_APV+0x9d _vn_lock+0x43 zfs_root+0x8f lookup+0x7f0 namei+0x504 kern_statat_vnhook+0xa5 sys_lstat+0x30 amd64_syscall+0x357 Xfast_syscall+0xfb
> 42315 102539 df       -                mi_switch+0xde sleepq_wait+0x3a sleeplk+0x11c __lockmgr_args+0x950 vop_stdlock+0x3c VOP_LOCK1_APV+0x9d _vn_lock+0x43 zfs_root+0x8f lookup+0x7f0 namei+0x504 kern_statfs+0x6f sys_statfs+0x2c amd64_syscall+0x357 Xfast_syscall+0xfb
> 42335 101782 sendsize -                mi_switch+0xde sleepq_wait+0x3a sleeplk+0x11c __lockmgr_args+0x950 vop_stdlock+0x3c VOP_LOCK1_APV+0x9d _vn_lock+0x43 zfs_root+0x8f lookup+0x7f0 namei+0x504 kern_statat_vnhook+0xa5 sys_stat+0x2d amd64_syscall+0x357 Xfast_syscall+0xfb

These are all blocked in "zfs" then.  (For future reference, the 'mwchan' field that you see as 'STATE' in top, or via 'ps O mwchan', is more detailed than the 'D' state.)  To diagnose this further, you would need to see which thread holds the ZFS vnode lock that these threads need.  I have some gdb scripts you can use to do that at www.freebsd.org/~jhb/gdb/.  You would want to download the 'gdb6*' files from there and then do this as root:

# cd /path/to/gdb/files
# kgdb
(kgdb) source gdb6
(kgdb) sleepchain 42335

Where '42335' is the pid of some process stuck in "zfs".

-- 
John Baldwin
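
P.S. Putting the two tips together, here is a minimal sketch that runs 'sleepchain' on every process sleeping on the "zfs" wait channel.  The loop, the awk field position, and piping commands into kgdb are assumptions of mine rather than anything the gdb6 scripts require, and the directory is the same placeholder as above:

    #!/bin/sh
    # Run as root so kgdb can open the live kernel.
    cd /path/to/gdb/files || exit 1

    # 'ps -O mwchan' adds the wait-channel column right after PID, so
    # field 2 is the mwchan; processes stuck on the ZFS vnode lock
    # show "zfs" there.
    for pid in $(ps -ax -O mwchan | awk '$2 == "zfs" { print $1 }'); do
        echo "=== sleepchain for pid $pid ==="
        # gdb executes commands from stdin when it is not a terminal,
        # which should hold for kgdb as well.
        printf 'source gdb6\nsleepchain %s\n' "$pid" | kgdb
    done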