From: Tim Bishop <tim.bishop@gmail.com>
Reply-To: tim@bishnet.net
To: freebsd-fs@freebsd.org
Date: Sun, 27 Feb 2011 12:32:17 +0000
Subject: ZFS system unresponsive
List-Id: Filesystems <freebsd-fs@freebsd.org>

Hi all,

I have a ZFS system that has become unresponsive.
It's running amd64 8-STABLE as of approximately 20 December. It has a
UFS-based root file system and a ZFS mirror for a handful of jails. It
seems to get into this state occasionally, but can eventually unblock
itself; that may take hours, though.

top -HSj shows the following processes active:

  PID JID USERNAME PRI NICE   SIZE    RES STATE  C   TIME  WCPU COMMAND
    0   0 root     -16    0     0K  1456K zio->i 0  28.9H 7.08% {zio_write_issue}
    5   0 root      -8    -     0K    60K zio->i 0 776:59 0.29% {txg_thread_enter}

A procstat on those processes shows:

  0 100068 kernel  zio_write_issue  mi_switch sleepq_wait _cv_wait zio_wait
      dmu_buf_hold_array_by_dnode dmu_read space_map_load metaslab_activate
      metaslab_alloc zio_dva_allocate zio_execute taskq_run_safe
      taskqueue_run_locked taskqueue_thread_loop fork_exit fork_trampoline
  5 100094 zfskern txg_thread_enter mi_switch sleepq_wait _cv_wait
      txg_thread_wait txg_quiesce_thread fork_exit fork_trampoline
  5 100095 zfskern txg_thread_enter mi_switch sleepq_wait _cv_wait zio_wait
      dsl_pool_sync spa_sync txg_sync_thread fork_exit fork_trampoline

(I have the full procstat -k output for those PIDs if needed.)

Other processes, such as my hourly zfs snapshots, appear to be wedged:

  root  7407 0.0 0.0 14672 1352 ?? D 10:00AM 0:00.46 /sbin/zfs snapshot -r pool0@2011-02-27_10.00.01--1d
  root 10184 0.0 0.0 14672 1444 ?? D 11:00AM 0:00.36 /sbin/zfs snapshot -r pool0@2011-02-27_11.00.00--1d
  root 12938 0.0 0.0 14672 1516 ?? D 12:00PM 0:00.11 /sbin/zfs snapshot -r pool0@2011-02-27_12.00.01--1d

Their kernel stacks:

  PID   TID    COMM TDNAME KSTACK
  7407  100563 zfs  -      mi_switch sleepq_wait _cv_wait txg_wait_synced
      dsl_sync_task_group_wait dmu_objset_snapshot zfs_ioc_snapshot
      zfsdev_ioctl devfs_ioctl_f kern_ioctl ioctl syscallenter syscall
      Xfast_syscall
  10184 100707 zfs  -      mi_switch sleepq_wait _cv_wait txg_wait_synced
      dsl_sync_task_group_wait dmu_objset_snapshot zfs_ioc_snapshot
      zfsdev_ioctl devfs_ioctl_f kern_ioctl ioctl syscallenter syscall
      Xfast_syscall
  12938 100159 zfs  -      mi_switch sleepq_wait _cv_wait txg_wait_synced
      dsl_sync_task_group_wait dmu_objset_snapshot zfs_ioc_snapshot
      zfsdev_ioctl devfs_ioctl_f kern_ioctl ioctl syscallenter syscall
      Xfast_syscall

zfs-stats output as follows:

------------------------------------------------------------------------
ZFS Subsystem Report                            Sun Feb 27 12:20:20 2011
------------------------------------------------------------------------
System Information:
  Kernel Version:                 801501 (osreldate)
  Hardware Platform:              amd64
  Processor Architecture:         amd64

  FreeBSD 8.2-PRERELEASE #3: Mon Dec 20 20:54:55 GMT 2010 tdb
  12:23pm up 68 days, 14:07, 2 users, load averages: 0.35, 0.39, 0.35
------------------------------------------------------------------------
System Memory Statistics:
  Physical Memory:                3061.63M
  Kernel Memory:                  1077.46M
  DATA:                   99.12%  1067.93M
  TEXT:                    0.88%     9.53M
------------------------------------------------------------------------
ZFS pool information:
  Storage pool Version (spa):     15
  Filesystem Version (zpl):       4
------------------------------------------------------------------------
ARC Misc:
  Deleted:                        148418216
  Recycle Misses:                 51095797
  Mutex Misses:                   370820
  Evict Skips:                    370820

ARC Size:
  Current Size (arcsize):         55.86%  1087.64M
  Target Size (Adaptive, c):      56.50%  1100.22M
  Min Size (Hard Limit, c_min):   12.50%   243.40M
  Max Size (High Water, c_max):    ~8:1   1947.20M

ARC Size Breakdown:
  Recently Used Cache Size (p):    6.25%    68.77M
  Freq. Used Cache Size (c-p):    93.75%  1031.45M

ARC Hash Breakdown:
  Elements Max:                   398079
  Elements Current:       38.65%  153870
  Collisions:                     230805591
  Chain Max:                      34
  Chains:                         24344

ARC Eviction Statistics:
  Evicts Total:                   4560897494528
  Evicts Eligible for L2: 99.99%  4560573588992
  Evicts Ineligible for L2: 0.01% 323905536
  Evicts Cached to L2:            0

ARC Efficiency:
  Cache Access Total:             1761824967
  Cache Hit Ratio:        84.82%  1494437389
  Cache Miss Ratio:       15.18%  267387578
  Actual Hit Ratio:       84.82%  1494411236
  Data Demand Efficiency: 83.35%

  CACHE HITS BY CACHE LIST:
    Most Recently Used (mru):      7.86%  117410213
    Most Frequently Used (mfu):   92.14%  1377001023
    MRU Ghost (mru_ghost):         0.63%  9445180
    MFU Ghost (mfu_ghost):         7.99%  119349696

  CACHE HITS BY DATA TYPE:
    Demand Data:                  35.75%  534254771
    Prefetch Data:                 0.00%  0
    Demand Metadata:              64.25%  960153880
    Prefetch Metadata:             0.00%  28738

  CACHE MISSES BY DATA TYPE:
    Demand Data:                  39.91%  106712177
    Prefetch Data:                 0.00%  0
    Demand Metadata:              60.01%  160446249
    Prefetch Metadata:             0.09%  229152
------------------------------------------------------------------------
VDEV Cache Summary:
  Access Total:                   155663083
  Hits Ratio:             70.91%  110387854
  Miss Ratio:             29.09%  45275229
  Delegations:                    91183
------------------------------------------------------------------------
ZFS Tunable (sysctl):
  kern.maxusers=384
  vfs.zfs.l2c_only_size=0
  vfs.zfs.mfu_ghost_data_lsize=23343104
  vfs.zfs.mfu_ghost_metadata_lsize=302204928
  vfs.zfs.mfu_ghost_size=325548032
  vfs.zfs.mfu_data_lsize=524091904
  vfs.zfs.mfu_metadata_lsize=52224
  vfs.zfs.mfu_size=533595136
  vfs.zfs.mru_ghost_data_lsize=30208
  vfs.zfs.mru_ghost_metadata_lsize=727952896
  vfs.zfs.mru_ghost_size=727983104
  vfs.zfs.mru_data_lsize=100169216
  vfs.zfs.mru_metadata_lsize=0
  vfs.zfs.mru_size=339522048
  vfs.zfs.anon_data_lsize=0
  vfs.zfs.anon_metadata_lsize=0
  vfs.zfs.anon_size=10959360
  vfs.zfs.l2arc_norw=1
  vfs.zfs.l2arc_feed_again=1
  vfs.zfs.l2arc_noprefetch=0
  vfs.zfs.l2arc_feed_min_ms=200
  vfs.zfs.l2arc_feed_secs=1
  vfs.zfs.l2arc_headroom=2
  vfs.zfs.l2arc_write_boost=8388608
  vfs.zfs.l2arc_write_max=8388608
  vfs.zfs.arc_meta_limit=510447616
  vfs.zfs.arc_meta_used=513363680
  vfs.zfs.mdcomp_disable=0
  vfs.zfs.arc_min=255223808
  vfs.zfs.arc_max=2041790464
  vfs.zfs.zfetch.array_rd_sz=1048576
  vfs.zfs.zfetch.block_cap=256
  vfs.zfs.zfetch.min_sec_reap=2
  vfs.zfs.zfetch.max_streams=8
  vfs.zfs.prefetch_disable=1
  vfs.zfs.check_hostid=1
  vfs.zfs.recover=0
  vfs.zfs.txg.write_limit_override=0
  vfs.zfs.txg.synctime=5
  vfs.zfs.txg.timeout=30
  vfs.zfs.scrub_limit=10
  vfs.zfs.vdev.cache.bshift=16
  vfs.zfs.vdev.cache.size=10485760
  vfs.zfs.vdev.cache.max=16384
  vfs.zfs.vdev.aggregation_limit=131072
  vfs.zfs.vdev.ramp_rate=2
  vfs.zfs.vdev.time_shift=6
  vfs.zfs.vdev.min_pending=4
  vfs.zfs.vdev.max_pending=10
  vfs.zfs.cache_flush_disable=0
  vfs.zfs.zil_disable=0
  vfs.zfs.zio.use_uma=0
  vfs.zfs.version.zpl=4
  vfs.zfs.version.spa=15
  vfs.zfs.version.dmu_backup_stream=1
  vfs.zfs.version.dmu_backup_header=2
  vfs.zfs.version.acl=1
  vfs.zfs.debug=0
  vfs.zfs.super_owner=0
  vm.kmem_size=3115532288
  vm.kmem_size_scale=1
  vm.kmem_size_min=0
  vm.kmem_size_max=329853485875
------------------------------------------------------------------------

I hope somebody can give me some pointers on where to go with this. I'm
just about to reboot (when it unwedges) and upgrade to the latest
8-STABLE to see if that helps.

Thanks,

Tim.
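P.S. The wedged-process stacks above were gathered by hand (ps, then
procstat on each PID). In case it's useful next time this happens, here's
a small sketch of my own, not an existing tool, that automates the same
collection. It assumes FreeBSD's procstat(1) and the ps(1) "state"
keyword; the awk filter itself is portable:

```shell
#!/bin/sh
# Sketch: dump the kernel stack of every process currently stuck in
# disk wait ("D" state), like the wedged zfs snapshot jobs above.
# NR > 1 skips the ps header line; the filter matches any state
# string beginning with D (D, D+, DL, ...).
ps axo pid,state,comm | awk 'NR > 1 && $2 ~ /^D/ { print $1 }' |
while read -r pid; do
    echo "=== PID $pid ==="
    procstat -k "$pid"
done
```

Running it during one of these hangs should catch the zfs snapshot
processes parked in txg_wait_synced, as in the listing above.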