Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 2 Jan 2012 23:45:27 +0200
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Don Lewis <truckman@freebsd.org>
Cc:        mckusick@mckusick.com, flo@freebsd.org, current@freebsd.org, attilio@freebsd.org, phk@phk.freebsd.dk, kib@freebsd.org, seanbru@yahoo-inc.com
Subject:   Re: dogfooding over in clusteradm land
Message-ID:  <20120102214527.GJ50300@deviant.kiev.zoral.com.ua>
In-Reply-To: <201201022047.q02Kl3IM005792@gw.catspoiler.org>
References:  <4F01F8FD.4020901@FreeBSD.org> <201201022047.q02Kl3IM005792@gw.catspoiler.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--KV5J30b1nBuIoEpe
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Jan 02, 2012 at 12:47:03PM -0800, Don Lewis wrote:
> On  2 Jan, Florian Smeets wrote:
> > On 29.12.11 01:04, Kirk McKusick wrote:
> >> Rather than changing BKVASIZE, I would try running the cvs2svn
> >> conversion on a 16K/2K filesystem and see if that sorts out the
> >> problem. If it does, it tells us that doubling the main block
> >> size and reducing the number of buffers by half is the problem.
> >> If that is the problem, then we will have to increase the KVM
> >> allocated to the buffer cache.
> >>=20
> >=20
> > This does not make a difference. I tried on 32K/4K with/without journal
> > and on 16K/2K all exhibit the same problem. At some point during the
> > cvs2svn conversion the sycer starts to use 100% CPU. The whole process
> > hangs at that point sometimes for hours, from time to time it does
> > continue doing some work, but really really slow. It's usually between
> > revision 210000 and 220000, when the resulting svn file gets bigger than
> > about 11-12Gb. At that point an ls in the target dir hangs in state ufs.
> >=20
> > I broke into ddb and ran all commands which i thought could be useful.
> > The output is at http://tb.smeets.im/~flo/giant-ape_syncer.txt
>=20
> Tracing command syncer pid 9 tid 100183 td 0xfffffe00120e9000
> cpustop_handler() at cpustop_handler+0x2b
> ipi_nmi_handler() at ipi_nmi_handler+0x50
> trap() at trap+0x1a8
> nmi_calltrap() at nmi_calltrap+0x8
> --- trap 0x13, rip =3D 0xffffffff8082ba43, rsp =3D 0xffffff8000270fe0, rb=
p =3D 0xffffff88c97829a0 ---
> _mtx_assert() at _mtx_assert+0x13
> pmap_remove_write() at pmap_remove_write+0x38
> vm_object_page_remove_write() at vm_object_page_remove_write+0x1f
> vm_object_page_clean() at vm_object_page_clean+0x14d
> vfs_msync() at vfs_msync+0xf1
> sync_fsync() at sync_fsync+0x12a
> sync_vnode() at sync_vnode+0x157
> sched_sync() at sched_sync+0x1d1
> fork_exit() at fork_exit+0x135
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip =3D 0, rsp =3D 0xffffff88c9782d00, rbp =3D 0 ---
>=20
> I thinks this explains why the r228838 patch seems to help the problem.
> Instead of an application call to msync(), you're getting bitten by the
> syncer doing the equivalent.  I don't know why the syncer is CPU bound,
> though.  From my understanding of the patch it only optimizes the I/O.
> Without the patch, I would expect that the syncer would just spend a lot
> of time waiting on I/O.  My guess is that this is actually a vm problem.
> There are nested loops in vm_object_page_clean() and
> vm_object_page_remove_write(), so you could be doing something that's
> causing lots of looping in that code.
r228838 allows the system to skip 50-70% of the code when initiating a
write of the UFS file page, due to async clustering. The system has to
maintain 75% less amount of writes in progress.

> I think that ls is hanging because it's stumbling across the vnode that
> the syncer has locked.
This is the only reasonable explanation.

Low-tech profile is to periodically break out into ddb and do backtrace
for the syncer thread. More advanced techniques is to use dtrace or normal
profiling.

--KV5J30b1nBuIoEpe
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (FreeBSD)

iEYEARECAAYFAk8CJXYACgkQC3+MBN1Mb4hAhwCgnXR4RBhr8tclLUzeF3NYg/OX
PRkAnjqHmH2duLg7tqvS/llmmjzaI2nb
=VUIS
-----END PGP SIGNATURE-----

--KV5J30b1nBuIoEpe--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120102214527.GJ50300>