From owner-freebsd-fs@FreeBSD.ORG Fri Dec 7 11:00:20 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C79C4D12 for ; Fri, 7 Dec 2012 11:00:20 +0000 (UTC) (envelope-from mattblists@icritical.com) Received: from mail1.icritical.com (mail1.icritical.com [93.95.13.41]) by mx1.freebsd.org (Postfix) with SMTP id 0764F8FC08 for ; Fri, 7 Dec 2012 11:00:19 +0000 (UTC) Received: (qmail 25251 invoked from network); 7 Dec 2012 10:55:51 -0000 Received: from localhost (127.0.0.1) by mail1.icritical.com with SMTP; 7 Dec 2012 10:55:51 -0000 Received: (qmail 25242 invoked by uid 599); 7 Dec 2012 10:55:50 -0000 Received: from unknown (HELO PDC002.icritical.int) (212.57.254.146) by mail1.icritical.com (qpsmtpd/0.28) with ESMTP; Fri, 07 Dec 2012 10:55:50 +0000 Message-ID: <50C1CB34.3000308@icritical.com> Date: Fri, 7 Dec 2012 10:55:48 +0000 From: Matt Burke User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:15.0) Gecko/20120906 Thunderbird/15.0 MIME-Version: 1.0 To: Subject: ZFS hang Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit X-TLS-Incoming: YES X-Virus-Scanned: by iCritical at mail1.icritical.com X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Dec 2012 11:00:20 -0000 Hi. I've just experienced a filesystem hang on 9.1-PRE from Sept 17th while hitting the disks hard (io-bound database workload). The disks are: 16 Spinning SAS disks (ZFS vdevs), hung off a pair of Areca 1320 HBAs 4 SATA SSDs (L2ARC), hung off onboard ICH (SATA3) and onboard C600 2 SATA SSDs (gmirror swap, zfs root fs (mirrored), main array ZIL) connected the same as the other SSDs Symptoms are any filesystem activity (login, starting gstat, tab completion) is hanging, with processes either in state 'zfs' or 'zio->io_cv)'. CPUs are idle. 'zpool status -v' doesn't hang. All disks online, no errors. In the debugger, I got the following: 118 0 0 0 DL ipmireq 0xfffffe0038bfcbb8 [ipmi0: kcs] 108 0 0 0 DL crypto_r 0xffffffff81a31900 [crypto returns] 107 0 0 0 DL crypto_w 0xffffffff81a318c0 [crypto] 19 0 0 0 DL m:w1 0xfffffe0038a79400 [g_mirror swap] 18 0 0 0 DL sdflush 0xffffffff81337c58 [softdepflush] 17 0 0 0 DL zio->io_ 0xfffffe07d9633d70 [syncer] 16 0 0 0 DL vlruwt 0xfffffe00389d1000 [vnlru] 9 0 0 0 DL psleep 0xffffffff8132bf48 [bufdaemon] 8 0 0 0 DL pgzero 0xffffffff81341c9c [pagezero] 7 0 0 0 DL psleep 0xffffffff81340e68 [vmdaemon] 6 0 0 0 DL psleep 0xffffffff81340e2c [pagedaemon] 5 0 0 0 DL ccb_scan 0xffffffff812988e0 [xpt_thrd] 4 0 0 0 DL waiting_ 0xffffffff81331f60 [sctp_iterator] 3 0 0 0 DL (threaded) [zfskern] 100740 D zio->io_ 0xfffffe1f3e786a00 [txg_thread_enter] 100739 D tx->tx_q 0xfffffe0038cd0230 [txg_thread_enter] 100409 D zio->io_ 0xfffffe1f40f02a00 [txg_thread_enter] 100408 D tx->tx_q 0xfffffe0038d43e30 [txg_thread_enter] 100097 D zio->io_ 0xfffffe1e6412da00 [l2arc_feed_thread] 100096 D arc_recl 0xffffffff817d4ca0 [arc_reclaim_thread] 15 0 0 0 DL (threaded) [usb] 100083 D - 0xffffff8001f5de18 [usbus1] 100082 D - 0xffffff8001f5ddc0 [usbus1] 100081 D - 0xffffff8001f5dd68 [usbus1] 100080 D - 0xffffff8001f5dd10 [usbus1] 100078 D - 0xffffff8001f54e18 [usbus0] 100077 D - 0xffffff8001f54dc0 [usbus0] 100076 D - 0xffffff8001f54d68 [usbus0] 100075 D - 0xffffff8001f54d10 [usbus0] 2 0 0 0 DL ctl_work 0xffffff8000a6a000 [ctl_thrd] 14 0 0 0 DL - 0xffffffff812dd7c4 [yarrow] 13 0 0 0 DL (threaded) [geom] 100023 D - 0xffffffff812d6b90 [g_down] 100022 D - 0xffffffff812d6b88 [g_up] 100021 D - 0xffffffff812d6b78 [g_event] 12 0 0 0 WL (threaded) [intr] Everything else looked 'normal' (piles of stuck cron/sendmail processes, etc) 66608 66605 66608 0 DVs zfs 0xfffffe005922c638 cron db> bt 66608 Tracing pid 66608 tid 108987 td 0xfffffe164ac67000 sched_switch() at sched_switch+0x13d mi_switch() at mi_switch+0x196 sleepq_wait() at sleepq_wait+0x42 __lockmgr_args() at __lockmgr_args+0x610 vop_stdlock() at vop_stdlock+0x39 VOP_LOCK1_APV() at VOP_LOCK1_APV+0x52 _vn_lock() at _vn_lock+0x47 vget() at vget+0x70 cache_lookup_times() at cache_lookup_times+0x5f5 vfs_cache_lookup() at vfs_cache_lookup+0xc8 VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x4c lookup() at lookup+0x424 namei() at namei+0x535 vn_open_cred() at vn_open_cred+0x3cb kern_openat() at kern_openat+0x1f9 amd64_syscall() at amd64_syscall+0x5ea Xfast_syscall() at Xfast_syscall+0xf7 --- syscall (5, FreeBSD ELF64, sys_open), rip = 0x800d3b0ac, rsp = --- 0x7fffffffc878, rbp = 0x7fffffffc9c0 --- (gstat, attempted to run after hang) 66578 11191 66578 0 D+ zio->io_ 0xfffffe1f3a16ad70 gstat Tracing pid 66578 tid 109053 td 0xfffffe07e012e000 sched_switch() at sched_switch+0x13d mi_switch() at mi_switch+0x196 sleepq_wait() at sleepq_wait+0x42 _cv_wait() at _cv_wait+0x121 zio_wait() at zio_wait+0x61 dbuf_read() at dbuf_read+0x5e5 dbuf_findbp() at dbuf_findbp+0x107 dbuf_hold_impl() at dbuf_hold_impl+0xc2 dbuf_hold() at dbuf_hold+0x1b dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x147 dmu_buf_hold_array() at dmu_buf_hold_array+0x67 dmu_read_uio() at dmu_read_uio+0x3f zfs_freebsd_read() at zfs_freebsd_read+0x3e3 VOP_READ_APV() at VOP_READ_APV+0x44 vn_read() at vn_read+0x316 dofileread() at dofileread+0xa1 kern_readv() at kern_readv+0x6c sys_read() at sys_read+0x64 amd64_syscall() at amd64_syscall+0x5ea Xfast_syscall() at Xfast_syscall+0xf7 --- syscall (3, FreeBSD ELF64, sys_read), rip = 0x8017d1b4c, rsp = --- 0x7fffffffc488, rbp = 0x7fffffffc5d0 --- 17 0 0 0 DL zio->io_ 0xfffffe07d9633d70 [syncer] Tracing pid 17 tid 100105 td 0xfffffe00389ba000 sched_switch() at sched_switch+0x13d mi_switch() at mi_switch+0x196 sleepq_wait() at sleepq_wait+0x42 _cv_wait() at _cv_wait+0x121 zio_wait() at zio_wait+0x61 dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x22b dmu_read() at dmu_read+0x89 space_map_load() at space_map_load+0x108 metaslab_activate() at metaslab_activate+0xdc metaslab_alloc() at metaslab_alloc+0x7b2 zio_alloc_zil() at zio_alloc_zil+0x61 zil_lwb_write_start() at zil_lwb_write_start+0x1ad zil_commit() at zil_commit+0x24a zfs_sync() at zfs_sync+0xb3 sync_fsync() at sync_fsync+0x19d VOP_FSYNC_APV() at VOP_FSYNC_APV+0x4a sync_vnode() at sync_vnode+0x15e sched_sync() at sched_sync+0x1c5 fork_exit() at fork_exit+0x11f fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffffa3008acbb0, rbp = 0 --- 3 0 0 0 DL (threaded) [zfskern] Tracing pid 3 tid 100740 td 0xfffffe0038c858e0 sched_switch() at sched_switch+0x13d mi_switch() at mi_switch+0x196 sleepq_wait() at sleepq_wait+0x42 _cv_wait() at _cv_wait+0x121 zio_wait() at zio_wait+0x61 dsl_pool_sync() at dsl_pool_sync+0xe0 spa_sync() at spa_sync+0x336 txg_sync_thread() at txg_sync_thread+0x139 fork_exit() at fork_exit+0x11f fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffffa303242bb0, rbp = 0 --- 100740 D zio->io_ 0xfffffe1f3e786a00 [txg_thread_enter] Tracing pid 3 tid 100740 td 0xfffffe0038c858e0 sched_switch() at sched_switch+0x13d mi_switch() at mi_switch+0x196 sleepq_wait() at sleepq_wait+0x42 _cv_wait() at _cv_wait+0x121 zio_wait() at zio_wait+0x61 dsl_pool_sync() at dsl_pool_sync+0xe0 spa_sync() at spa_sync+0x336 txg_sync_thread() at txg_sync_thread+0x139 fork_exit() at fork_exit+0x11f fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffffa303242bb0, rbp = 0 --- 100739 D tx->tx_q 0xfffffe0038cd0230 [txg_thread_enter] Tracing pid 3 tid 100739 td 0xfffffe0038c86000 sched_switch() at sched_switch+0x13d mi_switch() at mi_switch+0x196 sleepq_wait() at sleepq_wait+0x42 _cv_wait() at _cv_wait+0x121 txg_thread_wait() at txg_thread_wait+0x79 txg_quiesce_thread() at txg_quiesce_thread+0xb5 fork_exit() at fork_exit+0x11f fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffffa303239bb0, rbp = 0 --- 100409 D zio->io_ 0xfffffe1f40f02a00 [txg_thread_enter] db> bt 100409 Tracing pid 3 tid 100409 td 0xfffffe00389b9000 sched_switch() at sched_switch+0x13d mi_switch() at mi_switch+0x196 sleepq_wait() at sleepq_wait+0x42 _cv_wait() at _cv_wait+0x121 zio_wait() at zio_wait+0x61 dsl_pool_sync() at dsl_pool_sync+0xe0 spa_sync() at spa_sync+0x336 txg_sync_thread() at txg_sync_thread+0x139 fork_exit() at fork_exit+0x11f fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffffa302eb5bb0, rbp = 0 --- 100408 D tx->tx_q 0xfffffe0038d43e30 [txg_thread_enter] Tracing pid 3 tid 100408 td 0xfffffe0038c778e0 sched_switch() at sched_switch+0x13d mi_switch() at mi_switch+0x196 sleepq_wait() at sleepq_wait+0x42 _cv_wait() at _cv_wait+0x121 txg_thread_wait() at txg_thread_wait+0x79 txg_quiesce_thread() at txg_quiesce_thread+0xb5 fork_exit() at fork_exit+0x11f fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffffa302eacbb0, rbp = 0 --- 100097 D zio->io_ 0xfffffe1e6412da00 [l2arc_feed_thread] Tracing pid 3 tid 100097 td 0xfffffe0036c4d000 sched_switch() at sched_switch+0x13d mi_switch() at mi_switch+0x196 sleepq_wait() at sleepq_wait+0x42 _cv_wait() at _cv_wait+0x121 zio_wait() at zio_wait+0x61 l2arc_feed_thread() at l2arc_feed_thread+0x981 fork_exit() at fork_exit+0x11f fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffffa30075fbb0, rbp = 0 --- 100096 D arc_recl 0xffffffff817d4ca0 [arc_reclaim_thread] Tracing pid 3 tid 100096 td 0xfffffe0036c4d470 sched_switch() at sched_switch+0x13d mi_switch() at mi_switch+0x196 sleepq_timedwait() at sleepq_timedwait+0x42 _cv_timedwait() at _cv_timedwait+0x13c arc_reclaim_thread() at arc_reclaim_thread+0x29d fork_exit() at fork_exit+0x11f fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffffa30075abb0, rbp = 0 --- Does anyone either recognise what's going on (hopefully be able to say this was fixed in r....), or could someone give me an indication of how better to diagnose this if it happens again? I can set a dumpdev, but this machine's got 128G RAM and precious little free memory (DB server), so I doubt anyone would want to touch it! Thanks -- Sorry for the below... The information contained in this message is confidential and intended for the addressee only. If you have received this message in error, or there are any problems with its content, please contact the sender. iCritical is a trading name of Critical Software Ltd. Registered in England: 04909220. Registered Office: IC2, Keele Science Park, Keele, Staffordshire, ST5 5NH. This message has been scanned for security threats by iCritical. www.icritical.com