Date:      Wed, 27 Jun 2012 20:02:17 +0300
From:      Levent Serinol <lserinol@gmail.com>
To:        Andreas Nilsson <andrnils@gmail.com>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: ZFS stalls on Heavy I/O
Message-ID:  <DD521FBB-CFAD-4D51-8D0A-D21240FB30FE@gmail.com>
In-Reply-To: <CAPS9%2BSs1oCh=Szhf_qCam85hPh%2BMFu-XRjTEJZT5hYt12qMhXw@mail.gmail.com>
References:  <CACqg54Si-vHFAVkjpTS40MZt4E1Kn14kDUFKmVb8vx449fCnFw@mail.gmail.com> <CAMXYB4K%2B9EKPyqdCRZZgLvDQuwK=AAGSZi8%2B-AfkOrnJQzwdUA@mail.gmail.com> <CAPS9%2BSs1oCh=Szhf_qCam85hPh%2BMFu-XRjTEJZT5hYt12qMhXw@mail.gmail.com>



Hi,

On 27 Jun 2012, at 19:34, Andreas Nilsson <andrnils@gmail.com> wrote:

>
>
> On Wed, Jun 27, 2012 at 5:50 PM, Dean Jones <dean.jones@oregonstate.edu> wrote:
> On Wed, Jun 27, 2012 at 2:15 AM, Levent Serinol <lserinol@gmail.com> wrote:
> > Hi,
> >
> >  Under heavy I/O load we face freeze problems on ZFS volumes on both
> > FreeBSD 9 Release and 10 Current versions. Machines are HP servers (64bit)
> > with HP Smart array 6400 raid controllers (with battery units). Every da
> > device is a hardware raid5 where each one includes 9x300GB 10K SCSI hard
> > drives. Most of the I/O happens on the local system, except some small NFS
> > I/O from some other servers (NFS lookup/getattr/ etc.). These servers are
> > mail servers (qmail) with small I/O patterns (64K Read/Write).  Below you
> > can find procstat output at freeze time. write_limit is set to 200MB
> > because of the huge amount of txg_wait_opens observed before. Every process
> > stops in D state, I think because the txg queue and the other 2 queues are
> > full. Is there any suggestion to fix the problem ?
> >
> > btw inject_compress is the main process injecting emails to user inboxes
> > (databases). Also, those machines were running without problems on
> > Linux/XFS filesystem. A while ago, we started migration from Linux to
> > FreeBSD
> >
> >
> > http://pastebin.com/raw.php?i=ic3YepWQ
> > _______________________________________________
>
> Looks like you are running dedup with only 12 gigs of ram?
>
> Dedup is very ram hungry and the dedup tables are probably no longer
> fitting entirely in memory and therefore the system is swapping and
> thrashing about during writes.
>
> Also ZFS really prefers to directly address drives instead of RAID
> controllers.  It can not guarantee or know what the controller is
> doing behind the scenes.
> You might want to read http://constantin.glez.de/blog/2011/07/zfs-dedupe-or-not-dedupe and see if you need more ram.
>
> And yes, having raid below zfs somewhat defeats the point of zfs.
>
> Regards
> Andreas

That was one of the machines. I'm running several similar machines with a
few differences. For example, some of them have 50 GB or 20 GB of RAM, and
some of them have direct access to every disk in the pool, as you suggested
(pools of 24 disks). Some of the machines also run HP P812 RAID cards (1 GB
cache); every HP card has a battery unit. Every machine, whether it has
50 GB of RAM or pools with lots of disks, has the same stall problem, except
one which uses an HP P6300 SAN over an FC connection. That one runs ZFS
without problems. Should I suspect the ciss driver, which is common to all
the machines where the problem occurs? Whether they use 6400 or P812 RAID
cards, all of them use the same ciss driver, except the one connected to
the SAN via FC.

Btw, when ZFS stalls, it continues to write and read as usual after 1-2
minutes.

Do you see any problem in the procstat output that I provided?
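
For anyone going through a long procstat dump, one rough way to summarize it
is to count how many threads are sleeping in a given kernel function. The
sketch below is only an illustration: the sample lines are invented (PIDs,
thread IDs and offsets are made up, not taken from the pastebin), the
function names to look for are assumptions, and it expects the general
shape of `procstat -kk -a` output, where each stack frame is printed as
name+0xoffset.

```python
import re
from collections import Counter

# Hypothetical sample in the general shape of `procstat -kk` output.
# PIDs, TIDs, offsets and stacks are invented for illustration only.
SAMPLE = """\
  PID    TID COMM             TDNAME           KSTACK
 1234 100101 inject_compress  -                mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 dmu_tx_assign+0x170
 1235 100102 inject_compress  -                mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 dmu_tx_assign+0x170
 1300 100200 qmail-smtpd      -                mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dbuf_read+0x5e5
"""

# A stack frame looks like "function_name+0x1a2".
FRAME = re.compile(r"([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-f]+")


def count_blocked(text, interesting=("txg_wait_open", "zio_wait")):
    """Count threads whose kernel stack contains each function of interest."""
    counts = Counter()
    for line in text.splitlines():
        frames = set(FRAME.findall(line))  # header lines yield no frames
        for name in interesting:
            if name in frames:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    for name, n in count_blocked(SAMPLE).items():
        print(f"{name}: {n} thread(s)")
```

In practice the text would come from a saved dump taken during a stall
rather than from the SAMPLE string. If nearly every thread shows up under
the same function, that at least narrows down where the stall is being
waited on.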

Thanks,
Levent


