From owner-freebsd-current@FreeBSD.ORG  Fri Dec 18 08:15:24 2009
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7EFCA106566B
	for <freebsd-current@freebsd.org>; Fri, 18 Dec 2009 08:15:24 +0000 (UTC)
	(envelope-from alexz@visp.ru)
Received: from mail.visp.ru (srv1.visp.ru [91.215.204.2])
	by mx1.freebsd.org (Postfix) with ESMTP id 028A98FC0A
	for <freebsd-current@freebsd.org>; Fri, 18 Dec 2009 08:15:23 +0000 (UTC)
Received: from 91-215-205-255.static.visp.ru ([91.215.205.255] helo=zagrebin)
	by mail.visp.ru with esmtp (Exim 4.66 (FreeBSD))
	(envelope-from <alexz@visp.ru>)
	id 1NLXzc-000Ow8-D2; Fri, 18 Dec 2009 11:15:20 +0300
From: "Alexander Zagrebin" <alexz@visp.ru>
To: "'Benjamin Close'" <Benjamin.Close@clearchain.com>,
	<freebsd-current@freebsd.org>
References: <39309F560B98453EBB9AEA0F29D9D80E@vosz.local>
	<4B2A341C.5000802@clearchain.com>
Date: Fri, 18 Dec 2009 11:15:19 +0300
Message-ID: <6D3B0162A2134CAEA9F4DF5BC03707AA@vosz.local>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="koi8-r"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
Thread-Index: Acp/Hh1AKfOChFi2RS69+nZgEwxnhgAjGiVQ
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512
In-Reply-To: <4B2A341C.5000802@clearchain.com>
Cc: 
Subject: RE: 8.0-RELEASE: disk IO temporarily hangs up (ZFS or ATA related
	problem)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 18 Dec 2009 08:15:24 -0000

Big thanks for your reply!

> > I use onboard ICH7 SATA controller with two disks attached:
> >
> > atapci1:<Intel ICH7 SATA300 controller>  port
> > 
> 0x30c8-0x30cf,0x30ec-0x30ef,0x30c0-0x30c7,0x30e8-0x30eb,0x30a0
> -0x30af irq 19
> > at device 31.2 on pci0
> > atapci1: [ITHREAD]
> > ata2:<ATA channel 0>  on atapci1
> > ata2: [ITHREAD]
> > ata3:<ATA channel 1>  on atapci1
> > ata3: [ITHREAD]
> > ad4: 1430799MB<Seagate ST31500541AS CC34>  at ata2-master SATA150
> > ad6: 1430799MB<WDC WD15EADS-00P8B0 01.00A01>  at ata3-master SATA150
> >
> > The disks are used for mirrored ZFS pool.
> > I have noticed that the system periodically locks up on 
> disk operations.
> > After approx. 10 min of very slow disk i/o (several KB/s) 
> the speed of disk
> > operations restores to normal.
> > gstat has shown that the problem is in ad6.
> > For example, there is a filtered output of iostat -x 1:
> >
> >                          extended device statistics
> > device     r/s   w/s    kr/s    kw/s wait svc_t  %b
> > ad6      985.1   0.0  5093.9     0.0    0   0.2  23
> > ad6      761.8   0.0  9801.3     0.0    1   0.4  31
> > ad6      698.7   0.0  9215.1     0.0    0   0.4  30
> > ad6      434.2 513.9  5903.1 13658.3   48  10.2  55
> > ad6        3.0 762.8   191.2 28732.3    0  57.6  99
> > ad6       10.0   4.0   163.9     4.0    1   1.6   2
> >
> > Before this line we have a normal operations.
> > Then the behaviour of ad6 changes (pay attention to high 
> average access time
> > and percent of "busy" significantly greater than 100):
> >
> > ad6        0.0   0.0     0.0     0.0    1   0.0   0
> > ad6        1.0   0.0     0.5     0.0    1 1798.3 179
> > ad6        1.0   0.0     1.5     0.0    1 1775.4 177
> > ad6        0.0   0.0     0.0     0.0    1   0.0   0
> > ad6       10.0   0.0    75.2     0.0    1 180.3 180
> > ad6        0.0   0.0     0.0     0.0    1   0.0   0
> > ad6        1.0   0.0     2.0     0.0    1 1786.7 178
> > ad6        0.0   0.0     0.0     0.0    1   0.0   0
> >
> > And so on for about 10 minutes.
> > Then the disk i/o is reverted to normal:
> >
> > ad6      139.4   0.0  8860.5     0.0    1   4.4  61
> > ad6      167.3   0.0 10528.7     0.0    1   3.3  55
> > ad6       60.8 411.5  3707.6  8574.8    1  19.6  87
> > ad6      163.4   0.0 10334.9     0.0    1   4.4  72
> > ad6      157.4   0.0  9770.7     0.0    1   5.0  78
> > ad6      108.5   0.0  6886.8     0.0    0   3.9  43
> >
> > There are no ata error messages neither in the system log, 
> nor on the
> > console.
> > The manufacture's diagnostic test is passed on ad6 without 
> any errors.
> > The ad6 also contains swap partition.
> > I have tried to run several (10..20) instances of dd, which 
> read and write
> > data
> > from and to the swap partition simultaneously, but it has 
> not called the
> > lockup.
> > So there is a probability that this problem is ZFS related.
> >
> > I have been forced to switch ad6 to the offline state... :(
> >
> > Any suggestions on this problem?
> >    
> I also have been experiencing the same problem with a different 
> disk/controller (via mpt on a vmware machine). During the 
> same period I 
> notice that system cpu usage hits 80+% and top shows the 
> zfskern process 
> being the main culprit. At the same time I've discovered the 
> kstat.zfs.misc.arcstats.memory_throttle_count sysctl rising. 
> Arc is also 
> normally close to the arc_max limit.

My case has differences. 
1. CPU usage is near 0%
2. zfs's sysctls doesn't change significantly during
   "normal operation" -> "lockup" -> "normal" transition
3. ARC size is far from its limits,
kstat.zfs.misc.arcstats.memory_throttle_count: 0

Here my actions, observations and conclusions:
1. I have tried to change placements of disks on sata channels.
   Nothing has changed - the problems still on WD15EADS, although it became
ad4. 
   So issue isn't in south bridge, sata cables and so on.
2. I have tried to detach ad6 from the pool, to zero system area, and to
reattach it again.
   Of course, resilvering was started. During resilvering 250 GB was copied
without lockups
   and delays. While resilvering, I have tried periodically to load drive
with a read
   operations (dd if=/dev/ad6 of=/dev/null ...).
   But after resilvering and several minutes of normal mirror operation,
lockups appeared again.
   So drive is seems to be ok and we have a software problem?
3. I have noticed that lockups often happens during postgresql activity.
   postgresql often uses sync. So I have tried to disable ZIL.
   No success.
4. "IDE LED" is constantly on during lockups.
   So it is really read/write delays. 
5. I see two variants of zfskern's state: 
   a) it is constantly in the vgeom:io
   b) it is in either zio->io_ state (when active), or in tx->tx_s (when
idle).
      During lockups it is mostly in zio->io_.
   What the difference with vgeom:io and zio->io_/tx->tx_s?

May be a problem is in ata? WD15EADS is a "green" series of drives.
May be i have a problem with its power management?
Is there a method to completely reset sata channel and drive?
atacontrol reinit will do it?

Any help is welcomed.

-- 
Alexander Zagrebin