From: Sergey Zaharchenko
Date: Sun, 5 Sep 2010 23:28:16 +0400
To: freebsd-stable@freebsd.org
Cc: Dan Nelson
Subject: Re: 8.1R ZFS almost locking up system
Message-ID: <20100905192816.GA9110@nautilus.vmks.ru>
In-Reply-To: <20100831155829.GC5913@dan.emsphone.com>

Hello list!

Thu, Jan 01, 1970 at 12:00:00AM +0000 Dan Nelson wrote:
> In the last episode (Aug 31), Tim Bishop said:
> > On Sat, Aug 21, 2010 at 05:24:29PM -0500, Dan Nelson wrote:
> > > In the last episode (Aug 21), Tim Bishop said:
> > > > A few items from top, including zfskern:
> > > >
> > > >   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> > > >     5 root        4  -8    -     0K    60K zio->i  0  54:38  3.47% zfskern
> > > > 91775 70          1  44    0 53040K 31144K tx->tx  1   2:11  0.00% postgres
> > > > 39661 tdb         1  44    0 55776K 32968K tx->tx  0   0:39  0.00% mutt
> > > > 14828 root        1  47    0 14636K  1572K tx->tx  1   0:03  0.00% zfs
> > > > 11188 root        1  51    0 14636K  1572K tx->tx  0   0:03  0.00% zfs
> > > >

I'm seeing a similar problem on a remote server (unpatched 8.1-RELEASE,
amd64, quad-core, raidz pool on 8 drives). However, it also differs in
a way that seems important. Please advise.

Portions of top:

CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 75M Active, 31M Inact, 1352M Wired, 136K Cache, 812M Buf, 6458M Free

35769 root        1  52    0  2740K   744K zilog-  1   0:00  0.00% sync
35625 df          1  44    0 14636K  1856K zio->i  2   0:00  0.00% zfs
35920 root        1  44    0 15668K  1836K scl->s  0   0:00  0.00% zpool
35607 root        1  44    0  8260K  2284K zio->i  0   0:00  0.00% csh

> >     0 100084 kernel           zfs_vn_rele_task mi_switch+0x16f sleepq_wait+0x42 _sleep+0x31c taskqueue_thread_loop+0xb7 fork_exit+0x118 fork_trampoline+0xe
> >     5 100031 zfskern          arc_reclaim_thre mi_switch+0x16f sleepq_timedwait+0x42 _cv_timedwait+0x129 arc_reclaim_thread+0x2d1 fork_exit+0x118 fork_trampoline+0xe
> >     5 100032 zfskern          l2arc_feed_threa mi_switch+0x16f sleepq_timedwait+0x42 _cv_timedwait+0x129 l2arc_feed_thread+0x1be fork_exit+0x118 fork_trampoline+0xe
> >     5 100085 zfskern          txg_thread_enter mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 txg_thread_wait+0x79 txg_quiesce_thread+0xb5 fork_exit+0x118 fork_trampoline+0xe
> >     5 100086 zfskern          txg_thread_enter mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 zio_wait+0x61 dsl_pool_sync+0xea spa_sync+0x355 txg_sync_thread+0x195 fork_exit+0x118 fork_trampoline+0xe

procstat -kk -a:

    0 100114 kernel           zfs_vn_rele_task mi_switch+0x16f sleepq_wait+0x42 _sleep+0x31c taskqueue_thread_loop+0xb7 fork_exit+0x118 fork_trampoline+0xe
    0 100123 kernel           zil_clean        mi_switch+0x16f sleepq_wait+0x42 _sleep+0x31c taskqueue_thread_loop+0xb7 fork_exit+0x118 fork_trampoline+0xe
   16 100065 syncer           -                mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 zio_wait+0x61 zil_commit+0x3e1 zfs_sync+0xa6 sync_fsync+0x184 sync_vnode+0x16b sched_sync+0x1c9 fork_exit+0x118 fork_trampoline+0xe
   39 100079 zfskern          arc_reclaim_thre mi_switch+0x16f sleepq_timedwait+0x42 _cv_timedwait+0x129 arc_reclaim_thread+0x2d1 fork_exit+0x118 fork_trampoline+0xe
   39 100086 zfskern          l2arc_feed_threa mi_switch+0x16f sleepq_timedwait+0x42 _cv_timedwait+0x129 l2arc_feed_thread+0x1be fork_exit+0x118 fork_trampoline+0xe
   39 100115 zfskern          txg_thread_enter mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 txg_thread_wait+0x79 txg_quiesce_thread+0xb5 fork_exit+0x118 fork_trampoline+0xe
   39 100116 zfskern          txg_thread_enter mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 zio_wait+0x61 dsl_pool_sync+0xea spa_sync+0x355 txg_sync_thread+0x195
35625 100151 zfs              -                mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 zio_wait+0x61 dbuf_read+0x39a dmu_buf_hold+0xcc zap_lockdir+0x52 zap_cursor_retrieve+0x194 dsl_prop_get_all+0x187 zfs_ioc_objset_stats+0x7b zfsdev_ioctl+0x8d devfs_ioctl_f+0x77 kern_ioctl+0xf6 ioctl+0xfd syscall+0x1e7
35769 100212 sync             -                mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 zil_commit+0x7a zfs_sync+0xa6 sync+0x20e syscall+0x1e7 Xfast_syscall+0xe1

etc. etc... Seems pretty identical... But gstat seems different:

dT: 1.008s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
...
    3      0      0      0    0.0      0      0    0.0    0.0  da0
    3      0      0      0    0.0      0      0    0.0    0.0  da1
    3      0      0      0    0.0      0      0    0.0    0.0  da2
   35      0      0      0    0.0      0      0    0.0    0.0  da3
    3      0      0      0    0.0      0      0    0.0    0.0  da4
    2      0      0      0    0.0      0      0    0.0    0.0  da5
    3      0      0      0    0.0      0      0    0.0    0.0  da6
   35      0      0      0    0.0      0      0    0.0    0.0  da7

Note the huge wait queues... I've tried reading from the disks (dd),
camcontrol inquiry'ing them, but all of these methods fail:

%dd if=/dev/da3 of=/dev/null bs=1024k
load: 0.00  cmd: dd 37770 [physrd] 2.67r 0.00u 0.00s 0% 1168k
[and nothing input/output]

So, one could suspect a hardware problem. FWIW the RAID h/w is a
RocketRaid:

hptiop0: adapter at PCI 5:0:0, IRQ 16
hptiop0: mem 0xdd800000-0xddffffff irq 16 at device 0.0 on pci5
hptiop0: 0 RocketRAID 3xxx/4xxx controller driver v1.3 (010208)
hptiop0: [GIANT-LOCKED]
hptiop0: [ITHREAD]

> if your FS has been near 99%, you may not have any large runs
> of freespace left.

<3GB out of 12TB is used in my case.
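
The gstat pattern noted earlier in this message (requests sitting in L(q) while ops/s stays at zero) can be checked mechanically. What follows is only a rough sketch in Python, assuming the gstat rows have been saved as plain text with the ten-column layout shown above. The names `stalled_devices`, `min_queue` and `SAMPLE` are made up for illustration and are not part of any FreeBSD tool.

```python
def stalled_devices(gstat_text, min_queue=1):
    """Return (name, queue_length) for providers that have queued
    requests but completed no operations in the sampling interval."""
    stalled = []
    for line in gstat_text.splitlines():
        fields = line.split()
        # A data row has ten columns:
        # L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
        if len(fields) != 10:
            continue
        try:
            queue_len = int(fields[0])
            ops_per_sec = int(fields[1])
        except ValueError:
            continue  # header row or other non-numeric line
        if queue_len >= min_queue and ops_per_sec == 0:
            stalled.append((fields[9], queue_len))
    return stalled


# Sample rows taken from the gstat listings quoted in this thread.
SAMPLE = """\
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    3      0      0      0    0.0      0      0    0.0    0.0  da0
   35      0      0      0    0.0      0      0    0.0    0.0  da3
    1     48     48   3042    9.8      0      0    0.0   47.6| ad4
"""

if __name__ == "__main__":
    for name, queue_len in stalled_devices(SAMPLE):
        print(f"{name}: {queue_len} queued, 0 ops/s")
```

For the sample rows this flags da0 and da3 and leaves ad4 alone, since ad4 is completing I/O. A non-empty result only says that requests are not finishing; it does not by itself tell a failing disk from a stuck controller or driver.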

>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>     1     48     48   3042    9.8      0      0    0.0   47.6| ad4
>     0     38     38   2406   10.5      0      0    0.0   39.5| ad6
>
> You have a pair of mirrored disks, each doing around 40% I/O load, which is
> 80% load if a single-threaded task is driving all the I/O.

No load at all in my case...

> I see the syncer
> process is also trying to write to the ZIL.  Are you running something that
> does a lot of fsync calls (a database server for example)?

There's a PostgreSQL database there, but it was inactive when the
first hung processes showed up.

> Is this system an NFS server maybe?

No. Peculiarities include using ez-jail for a couple of jails over ZFS
(also using nullfs).

> Try setting the sysctl vfs.zfs.zil_disable=1 and see
> if your performance improves.

As expected, it didn't help the processes that were already hung, but
I'll try it after a reboot.

How likely do you think a hardware error is in this setup? Can I do
anything else to help diagnose this problem?

Thanks for any other input,

-- 
DoubleF