From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 27 17:28:58 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6D50F1065670
	for <freebsd-fs@freebsd.org>; Wed, 27 Jun 2012 17:28:58 +0000 (UTC)
	(envelope-from freebsd-listen@fabiankeil.de)
Received: from smtprelay03.ispgateway.de (smtprelay03.ispgateway.de
	[80.67.31.37]) by mx1.freebsd.org (Postfix) with ESMTP id ED1358FC0C
	for <freebsd-fs@freebsd.org>; Wed, 27 Jun 2012 17:28:57 +0000 (UTC)
Received: from [78.35.159.221] (helo=fabiankeil.de)
	by smtprelay03.ispgateway.de with esmtpsa (TLSv1:AES128-SHA:128)
	(Exim 4.68) (envelope-from <freebsd-listen@fabiankeil.de>)
	id 1Sjw2s-0002ym-Fe; Wed, 27 Jun 2012 19:28:50 +0200
Date: Wed, 27 Jun 2012 19:28:43 +0200
From: Fabian Keil <freebsd-listen@fabiankeil.de>
To: Levent Serinol <lserinol@gmail.com>
Message-ID: <20120627192843.69214ea0@fabiankeil.de>
In-Reply-To: <DD521FBB-CFAD-4D51-8D0A-D21240FB30FE@gmail.com>
References: <CACqg54Si-vHFAVkjpTS40MZt4E1Kn14kDUFKmVb8vx449fCnFw@mail.gmail.com>
	<CAMXYB4K+9EKPyqdCRZZgLvDQuwK=AAGSZi8+-AfkOrnJQzwdUA@mail.gmail.com>
	<CAPS9+Ss1oCh=Szhf_qCam85hPh+MFu-XRjTEJZT5hYt12qMhXw@mail.gmail.com>
	<DD521FBB-CFAD-4D51-8D0A-D21240FB30FE@gmail.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=PGP-SHA1;
	boundary="Sig_/B2eUcb+W.ufrLFcX9jhdONG";
	protocol="application/pgp-signature"
X-Df-Sender: Nzc1MDY3
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS stalls on Heavy I/O
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 27 Jun 2012 17:28:58 -0000

--Sig_/B2eUcb+W.ufrLFcX9jhdONG
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

Levent Serinol <lserinol@gmail.com> wrote:

> On 27 Jun 2012, at 19:34, Andreas Nilsson <andrnils@gmail.com> wrote:

> > On Wed, Jun 27, 2012 at 5:50 PM, Dean Jones <dean.jones@oregonstate.edu> wrote:
> > On Wed, Jun 27, 2012 at 2:15 AM, Levent Serinol <lserinol@gmail.com> wrote:
> > > Hi,
> > >
> > > Under heavy I/O load we see freezes on ZFS volumes on both FreeBSD 9
> > > Release and 10 Current. The machines are 64-bit HP servers with HP
> > > Smart Array 6400 RAID controllers (with battery units). Every da
> > > device is a hardware RAID5 made up of 9x300GB 10K SCSI hard drives.
> > > Most of the I/O happens on the local system, apart from some small NFS
> > > I/O from other servers (NFS lookup/getattr etc.). These servers are
> > > mail servers (qmail) with small I/O patterns (64K read/write). Below
> > > you can find procstat output taken during a freeze. write_limit is set
> > > to 200MB because of the huge number of txg_wait_opens observed before.
> > > Every process stops in D state, I think because the txg queue and the
> > > other 2 queues are full. Is there any suggestion to fix the problem?
> > >
> > > btw inject_compress is the main process injecting emails into user
> > > inboxes (databases). Also, those machines were running without
> > > problems on Linux with XFS. A while ago we started migrating from
> > > Linux to FreeBSD.
> > >
> > >
> > > http://pastebin.com/raw.php?i=ic3YepWQ
> >
> > Looks like you are running dedup with only 12 gigs of ram?
> >
> > Dedup is very ram hungry and the dedup tables are probably no longer
> > fitting entirely in memory and therefore the system is swapping and
> > thrashing about during writes.
> >
> > Also ZFS really prefers to directly address drives instead of RAID
> > controllers.  It can not guarantee or know what the controller is
> > doing behind the scenes.
> > You might want to read
> > http://constantin.glez.de/blog/2011/07/zfs-dedupe-or-not-dedupe and
> > see if you need more ram.
> >
> > And yes, having raid below zfs somewhat defeats the point of zfs.

> That was one of the machines; I'm running several similar machines with
> a few differences. For example, some of them have 50GB and 20GB of RAM,
> and some of them have direct access to every disk in the pool, as you
> suggested (pools including 24 disks). Some of the machines are also
> running P812 HP RAID cards (1GB cache); every HP card has a battery
> unit. Every machine, whether running with 50GB of RAM or pools with lots
> of disks, has the same stall problem, except one which is using an HP
> P6300 SAN with an FC connection. It's running ZFS without problems.
> Should I suspect the ciss driver, which is common to all machines where
> the problem occurs? Whether they use 6400 or P812 RAID cards, all of
> them use the same ciss driver, except the one connected via FC to the
> SAN.
>
> Btw, when ZFS stalls, it continues to write and read as usual 1-2
> minutes later.

Do the stalls get shorter if you decrease kern.cam.da.default_timeout?
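
A rough way to compare stall lengths before and after changing the
sysctl, as a sketch only: the Python script below times small fsync'd
writes and reports the ones that exceed a threshold. The function names,
the probe file name, the 1-second threshold and the write count are
assumptions for illustration; the 64K block size is taken from the I/O
size mentioned above. The current timeout can be read with
`sysctl kern.cam.da.default_timeout`.

```python
import os
import sys
import time


def timed_writes(path, count=100, size=64 * 1024):
    """Write `count` blocks of `size` bytes, fsync after each one,
    and return the per-write latencies in seconds."""
    block = b"\0" * size
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(count):
            start = time.monotonic()
            os.write(fd, block)
            os.fsync(fd)  # force the write out so a stall shows up here
            latencies.append(time.monotonic() - start)
    finally:
        os.close(fd)
    return latencies


def find_stalls(latencies, threshold=1.0):
    """Return (index, seconds) for every write that took at least
    `threshold` seconds (hypothetical cut-off, adjust as needed)."""
    return [(i, t) for i, t in enumerate(latencies) if t >= threshold]


if __name__ == "__main__":
    # Probe file path is an assumption; point it at the affected pool.
    path = sys.argv[1] if len(sys.argv) > 1 else "stall_probe.tmp"
    for i, t in find_stalls(timed_writes(path)):
        print(f"write {i} took {t:.1f}s")
```

Running it once with the current timeout and once with a lower value
should show whether the longest writes shrink along with the timeout.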

Fabian

--Sig_/B2eUcb+W.ufrLFcX9jhdONG
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iEYEARECAAYFAk/rQtMACgkQBYqIVf93VJ0qxgCfXh+ehGM/nNzmQ224Fyw9D30n
fuAAn26ybD5NUIPV21mmUc8jP5p8aBD0
=Mh7+
-----END PGP SIGNATURE-----

--Sig_/B2eUcb+W.ufrLFcX9jhdONG--