Date: Sun, 5 Sep 2010 23:28:16 +0400
From: Sergey Zaharchenko <doublef-ctm@yandex.ru>
To: freebsd-stable@freebsd.org
Cc: Dan Nelson <dnelson@allantgroup.com>
Subject: Re: 8.1R ZFS almost locking up system
Message-ID: <20100905192816.GA9110@nautilus.vmks.ru>
In-Reply-To: <20100831155829.GC5913@dan.emsphone.com>
References: <20100821220435.GA6208@carrick-users.bishnet.net> <20100821222429.GB73221@dan.emsphone.com> <20100831133556.GB45316@carrick-users.bishnet.net> <20100831155829.GC5913@dan.emsphone.com>
Hello list!

On Aug 31, 2010, Dan Nelson wrote:

> In the last episode (Aug 31), Tim Bishop said:
> > On Sat, Aug 21, 2010 at 05:24:29PM -0500, Dan Nelson wrote:
> > > In the last episode (Aug 21), Tim Bishop said:
> > > > A few items from top, including zfskern:
> > > >
> > > >   PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> > > >     5 root        4  -8    -     0K    60K zio->i 0  54:38  3.47% zfskern
> > > > 91775 70          1  44    0 53040K 31144K tx->tx 1   2:11  0.00% postgres
> > > > 39661 tdb         1  44    0 55776K 32968K tx->tx 0   0:39  0.00% mutt
> > > > 14828 root        1  47    0 14636K  1572K tx->tx 1   0:03  0.00% zfs
> > > > 11188 root        1  51    0 14636K  1572K tx->tx 0   0:03  0.00% zfs

I'm seeing a similar problem on a remote server (unpatched 8.1-RELEASE,
amd64, quad-core, raidz pool on 8 drives). However, it also differs in
ways that seem important. Please advise.
Portions of top:

CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 75M Active, 31M Inact, 1352M Wired, 136K Cache, 812M Buf, 6458M Free

35769 root  1 52 0  2740K  744K zilog- 1 0:00 0.00% sync
35625 df    1 44 0 14636K 1856K zio->i 2 0:00 0.00% zfs
35920 root  1 44 0 15668K 1836K scl->s 0 0:00 0.00% zpool
35607 root  1 44 0  8260K 2284K zio->i 0 0:00 0.00% csh

> >     0 100084 kernel           zfs_vn_rele_task mi_switch+0x16f sleepq_wait+0x42 _sleep+0x31c taskqueue_thread_loop+0xb7 fork_exit+0x118 fork_trampoline+0xe
> >     5 100031 zfskern          arc_reclaim_thre mi_switch+0x16f sleepq_timedwait+0x42 _cv_timedwait+0x129 arc_reclaim_thread+0x2d1 fork_exit+0x118 fork_trampoline+0xe
> >     5 100032 zfskern          l2arc_feed_threa mi_switch+0x16f sleepq_timedwait+0x42 _cv_timedwait+0x129 l2arc_feed_thread+0x1be fork_exit+0x118 fork_trampoline+0xe
> >     5 100085 zfskern          txg_thread_enter mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 txg_thread_wait+0x79 txg_quiesce_thread+0xb5 fork_exit+0x118 fork_trampoline+0xe
> >     5 100086 zfskern          txg_thread_enter mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 zio_wait+0x61 dsl_pool_sync+0xea spa_sync+0x355 txg_sync_thread+0x195 fork_exit+0x118 fork_trampoline+0xe

procstat -kk -a:

    0 100114 kernel           zfs_vn_rele_task mi_switch+0x16f sleepq_wait+0x42 _sleep+0x31c taskqueue_thread_loop+0xb7 fork_exit+0x118 fork_trampoline+0xe
    0 100123 kernel           zil_clean        mi_switch+0x16f sleepq_wait+0x42 _sleep+0x31c taskqueue_thread_loop+0xb7 fork_exit+0x118 fork_trampoline+0xe
   16 100065 syncer           -                mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 zio_wait+0x61 zil_commit+0x3e1 zfs_sync+0xa6 sync_fsync+0x184 sync_vnode+0x16b sched_sync+0x1c9 fork_exit+0x118 fork_trampoline+0xe
   39 100079 zfskern          arc_reclaim_thre mi_switch+0x16f sleepq_timedwait+0x42 _cv_timedwait+0x129 arc_reclaim_thread+0x2d1 fork_exit+0x118 fork_trampoline+0xe
   39 100086 zfskern          l2arc_feed_threa mi_switch+0x16f sleepq_timedwait+0x42 _cv_timedwait+0x129 l2arc_feed_thread+0x1be fork_exit+0x118 fork_trampoline+0xe
   39 100115 zfskern          txg_thread_enter mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 txg_thread_wait+0x79 txg_quiesce_thread+0xb5 fork_exit+0x118 fork_trampoline+0xe
   39 100116 zfskern          txg_thread_enter mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 zio_wait+0x61 dsl_pool_sync+0xea spa_sync+0x355 txg_sync_thread+0x195
35625 100151 zfs              -                mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 zio_wait+0x61 dbuf_read+0x39a dmu_buf_hold+0xcc zap_lockdir+0x52 zap_cursor_retrieve+0x194 dsl_prop_get_all+0x187 zfs_ioc_objset_stats+0x7b zfsdev_ioctl+0x8d devfs_ioctl_f+0x77 kern_ioctl+0xf6 ioctl+0xfd syscall+0x1e7
35769 100212 sync             -                mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 zil_commit+0x7a zfs_sync+0xa6 sync+0x20e syscall+0x1e7 Xfast_syscall+0xe1

etc. etc... Seems pretty identical... But gstat seems different:

dT: 1.008s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
...
    3      0      0      0    0.0      0      0    0.0    0.0  da0
    3      0      0      0    0.0      0      0    0.0    0.0  da1
    3      0      0      0    0.0      0      0    0.0    0.0  da2
   35      0      0      0    0.0      0      0    0.0    0.0  da3
    3      0      0      0    0.0      0      0    0.0    0.0  da4
    2      0      0      0    0.0      0      0    0.0    0.0  da5
    3      0      0      0    0.0      0      0    0.0    0.0  da6
   35      0      0      0    0.0      0      0    0.0    0.0  da7

Note the huge wait queues... I've tried reading from the disks (dd) and
camcontrol inquiry'ing them, but all of these methods fail:

%dd if=/dev/da3 of=/dev/null bs=1024k
load: 0.00  cmd: dd 37770 [physrd] 2.67r 0.00u 0.00s 0% 1168k
[and nothing input/output]

So, one could suspect a hardware problem. FWIW the RAID h/w is a
RocketRAID:

hptiop0: adapter at PCI 5:0:0, IRQ 16
hptiop0: <RocketRAID 3540 SATA Controller > mem 0xdd800000-0xddffffff irq 16 at device 0.0 on pci5
hptiop0: RocketRAID 3xxx/4xxx controller driver v1.3 (010208)
hptiop0: [GIANT-LOCKED]
hptiop0: [ITHREAD]

> if your FS has been near 99%, you may not have any large runs
> of freespace left.

<3GB out of 12TB is used in my case.
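Since dd wedges in physrd on da3, a quick per-member probe can at least
show which of the eight drives still answer at all. A rough sketch (device
names taken from the gstat output above; timeout(1) is not in the 8.1 base
system, so I'm assuming something like it is installed, e.g. from
sysutils/coreutils):

```shell
# Rough sketch: send a SCSI inquiry to each raidz member and flag the ones
# that hang. camcontrol is the FreeBSD CAM control utility; the 10s budget
# is an arbitrary choice.
for d in da0 da1 da2 da3 da4 da5 da6 da7; do
    if timeout 10 camcontrol inquiry "$d" >/dev/null 2>&1; then
        echo "$d: responds"
    else
        echo "$d: no answer within 10s (or probe unavailable)"
    fi
done
```

If only da3 and da7 (the two with L(q)=35 above) fail to answer, that
would point at those drives or their ports rather than at ZFS.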
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>    1     48     48   3042    9.8      0      0    0.0   47.6| ad4
>    0     38     38   2406   10.5      0      0    0.0   39.5| ad6
>
> You have a pair of mirrored disks, each doing around 40% I/O load, which is
> 80% load if a single-threaded task is driving all the I/O.

No load at all in my case...

> I see the syncer process is also trying to write to the ZIL.  Are you
> running something that does a lot of fsync calls (a database server for
> example)?

There's a postgresql database there, but it was inactive at the time the
first hung processes showed up.

> Is this system an NFS server maybe?

No. Peculiarities include using ezjail for a couple of jails over ZFS
(also using nullfs).

> Try setting the sysctl vfs.zfs.zil_disable=1 and see
> if your performance improves.

Sure enough, it didn't help the existing processes, but I'll try it after
a reboot.

How likely do you think a hardware error is in this setup? Can I do
anything else to help diagnose this problem?

Thanks for any other input,

--
DoubleF
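P.S. For the archives, here is the exact form of the ZIL test I intend to
run after the reboot, per Dan's suggestion. The loader.conf persistence is
my assumption (on 8.x the tunable is documented as settable both at
runtime and from the loader; verify against your release before relying
on it):

```shell
# Runtime setting (Dan's suggestion); only affects ZIL activity from now on:
sysctl vfs.zfs.zil_disable=1
# Assumed way to keep it across the reboot, via a loader tunable:
echo 'vfs.zfs.zil_disable="1"' >> /boot/loader.conf
```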