Date: Sun, 27 Feb 2011 12:32:17 +0000
From: Tim Bishop <tim.bishop@gmail.com>
To: freebsd-fs@freebsd.org
Subject: ZFS system unresponsive
Message-ID: <AANLkTinmbYRWz8kG=e1AECpj51cNRvjt2MjjCXixYxjU@mail.gmail.com>

Hi all,

I have a ZFS system that has become unresponsive. It's running amd64 8-STABLE as of approximately 20 Dec. It has a UFS-based root file system and then a ZFS mirror for a handful of jails.

It seems to get in to this state occasionally, but eventually can unblock itself. This may take hours though.

top -HSj shows the following processes active:

  PID JID USERNAME PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
    0   0 root     -16    0     0K  1456K zio->i  0  28.9H  7.08% {zio_write_issue}
    5   0 root      -8    -     0K    60K zio->i  0 776:59  0.29% {txg_thread_enter}

A procstat on those processes shows:

    0 100068 kernel           zio_write_issue  mi_switch sleepq_wait _cv_wait zio_wait dmu_buf_hold_array_by_dnode dmu_read space_map_load metaslab_activate metaslab_alloc zio_dva_allocate zio_execute taskq_run_safe taskqueue_run_locked taskqueue_thread_loop fork_exit fork_trampoline
    5 100094 zfskern          txg_thread_enter mi_switch sleepq_wait _cv_wait txg_thread_wait txg_quiesce_thread fork_exit fork_trampoline
    5 100095 zfskern          txg_thread_enter mi_switch sleepq_wait _cv_wait zio_wait dsl_pool_sync spa_sync txg_sync_thread fork_exit fork_trampoline

(I have the full procstat -k output for those PIDs if needed)

Other processes, such as my hourly zfs snapshots appear to be wedged:

root  7407  0.0  0.0 14672  1352  ??  D    10:00AM   0:00.46 /sbin/zfs snapshot -r pool0@2011-02-27_10.00.01--1d
root 10184  0.0  0.0 14672  1444  ??  D    11:00AM   0:00.36 /sbin/zfs snapshot -r pool0@2011-02-27_11.00.00--1d
root 12938  0.0  0.0 14672  1516  ??  D    12:00PM   0:00.11 /sbin/zfs snapshot -r pool0@2011-02-27_12.00.01--1d

  PID    TID COMM             TDNAME           KSTACK
 7407 100563 zfs              -                mi_switch sleepq_wait _cv_wait txg_wait_synced dsl_sync_task_group_wait dmu_objset_snapshot zfs_ioc_snapshot zfsdev_ioctl devfs_ioctl_f kern_ioctl ioctl syscallenter syscall Xfast_syscall
10184 100707 zfs              -                mi_switch sleepq_wait _cv_wait txg_wait_synced dsl_sync_task_group_wait dmu_objset_snapshot zfs_ioc_snapshot zfsdev_ioctl devfs_ioctl_f kern_ioctl ioctl syscallenter syscall Xfast_syscall
12938 100159 zfs              -                mi_switch sleepq_wait _cv_wait txg_wait_synced dsl_sync_task_group_wait dmu_objset_snapshot zfs_ioc_snapshot zfsdev_ioctl devfs_ioctl_f kern_ioctl ioctl syscallenter syscall Xfast_syscall

zfs-stats output as follows:

------------------------------------------------------------------------
ZFS Subsystem Report                            Sun Feb 27 12:20:20 2011
------------------------------------------------------------------------
System Information:

        Kernel Version:                         801501 (osreldate)
        Hardware Platform:                      amd64
        Processor Architecture:                 amd64

FreeBSD 8.2-PRERELEASE #3: Mon Dec 20 20:54:55 GMT 2010 tdb
12:23pm  up 68 days, 14:07, 2 users, load averages: 0.35, 0.39, 0.35
------------------------------------------------------------------------
System Memory Statistics:
        Physical Memory:                        3061.63M
        Kernel Memory:                          1077.46M
        DATA:                           99.12%  1067.93M
        TEXT:                           0.88%   9.53M
------------------------------------------------------------------------
ZFS pool information:
        Storage pool Version (spa):             15
        Filesystem Version (zpl):               4
------------------------------------------------------------------------
ARC Misc:
        Deleted:                                148418216
        Recycle Misses:                         51095797
        Mutex Misses:                           370820
        Evict Skips:                            370820

ARC Size:
        Current Size (arcsize):         55.86%  1087.64M
        Target Size (Adaptive, c):      56.50%  1100.22M
        Min Size (Hard Limit, c_min):   12.50%  243.40M
        Max Size (High Water, c_max):   ~8:1    1947.20M

ARC Size Breakdown:
        Recently Used Cache Size (p):   6.25%   68.77M
        Freq. Used Cache Size (c-p):    93.75%  1031.45M

ARC Hash Breakdown:
        Elements Max:                           398079
        Elements Current:               38.65%  153870
        Collisions:                             230805591
        Chain Max:                              34
        Chains:                                 24344

ARC Eviction Statistics:
        Evicts Total:                           4560897494528
        Evicts Eligible for L2:         99.99%  4560573588992
        Evicts Ineligible for L2:       0.01%   323905536
        Evicts Cached to L2:                    0

ARC Efficiency:
        Cache Access Total:                     1761824967
        Cache Hit Ratio:                84.82%  1494437389
        Cache Miss Ratio:               15.18%  267387578
        Actual Hit Ratio:               84.82%  1494411236

        Data Demand Efficiency:         83.35%

        CACHE HITS BY CACHE LIST:
          Most Recently Used (mru):     7.86%   117410213
          Most Frequently Used (mfu):   92.14%  1377001023
          MRU Ghost (mru_ghost):        0.63%   9445180
          MFU Ghost (mfu_ghost):        7.99%   119349696

        CACHE HITS BY DATA TYPE:
          Demand Data:                  35.75%  534254771
          Prefetch Data:                0.00%   0
          Demand Metadata:              64.25%  960153880
          Prefetch Metadata:            0.00%   28738

        CACHE MISSES BY DATA TYPE:
          Demand Data:                  39.91%  106712177
          Prefetch Data:                0.00%   0
          Demand Metadata:              60.01%  160446249
          Prefetch Metadata:            0.09%   229152
------------------------------------------------------------------------
VDEV Cache Summary:
        Access Total:                           155663083
        Hits Ratio:                     70.91%  110387854
        Miss Ratio:                     29.09%  45275229
        Delegations:                            91183
------------------------------------------------------------------------
ZFS Tunable (sysctl):
        kern.maxusers=384
        vfs.zfs.l2c_only_size=0
        vfs.zfs.mfu_ghost_data_lsize=23343104
        vfs.zfs.mfu_ghost_metadata_lsize=302204928
        vfs.zfs.mfu_ghost_size=325548032
        vfs.zfs.mfu_data_lsize=524091904
        vfs.zfs.mfu_metadata_lsize=52224
        vfs.zfs.mfu_size=533595136
        vfs.zfs.mru_ghost_data_lsize=30208
        vfs.zfs.mru_ghost_metadata_lsize=727952896
        vfs.zfs.mru_ghost_size=727983104
        vfs.zfs.mru_data_lsize=100169216
        vfs.zfs.mru_metadata_lsize=0
        vfs.zfs.mru_size=339522048
        vfs.zfs.anon_data_lsize=0
        vfs.zfs.anon_metadata_lsize=0
        vfs.zfs.anon_size=10959360
        vfs.zfs.l2arc_norw=1
        vfs.zfs.l2arc_feed_again=1
        vfs.zfs.l2arc_noprefetch=0
        vfs.zfs.l2arc_feed_min_ms=200
        vfs.zfs.l2arc_feed_secs=1
        vfs.zfs.l2arc_headroom=2
        vfs.zfs.l2arc_write_boost=8388608
        vfs.zfs.l2arc_write_max=8388608
        vfs.zfs.arc_meta_limit=510447616
        vfs.zfs.arc_meta_used=513363680
        vfs.zfs.mdcomp_disable=0
        vfs.zfs.arc_min=255223808
        vfs.zfs.arc_max=2041790464
        vfs.zfs.zfetch.array_rd_sz=1048576
        vfs.zfs.zfetch.block_cap=256
        vfs.zfs.zfetch.min_sec_reap=2
        vfs.zfs.zfetch.max_streams=8
        vfs.zfs.prefetch_disable=1
        vfs.zfs.check_hostid=1
        vfs.zfs.recover=0
        vfs.zfs.txg.write_limit_override=0
        vfs.zfs.txg.synctime=5
        vfs.zfs.txg.timeout=30
        vfs.zfs.scrub_limit=10
        vfs.zfs.vdev.cache.bshift=16
        vfs.zfs.vdev.cache.size=10485760
        vfs.zfs.vdev.cache.max=16384
        vfs.zfs.vdev.aggregation_limit=131072
        vfs.zfs.vdev.ramp_rate=2
        vfs.zfs.vdev.time_shift=6
        vfs.zfs.vdev.min_pending=4
        vfs.zfs.vdev.max_pending=10
        vfs.zfs.cache_flush_disable=0
        vfs.zfs.zil_disable=0
        vfs.zfs.zio.use_uma=0
        vfs.zfs.version.zpl=4
        vfs.zfs.version.spa=15
        vfs.zfs.version.dmu_backup_stream=1
        vfs.zfs.version.dmu_backup_header=2
        vfs.zfs.version.acl=1
        vfs.zfs.debug=0
        vfs.zfs.super_owner=0
        vm.kmem_size=3115532288
        vm.kmem_size_scale=1
        vm.kmem_size_min=0
        vm.kmem_size_max=329853485875
------------------------------------------------------------------------

I hope somebody can give me some pointers on where to go with this. I'm just about to reboot (when it unwedges) and upgrade to the latest 8-STABLE to see if that helps.

Thanks,
Tim.