Date:      Tue, 9 Aug 2005 02:48:13 +0000
From:      Alexander Kabaev <kan@freebsd.org>
To:        Scott Long <scottl@samsco.org>
Cc:        current@freebsd.org, Ade Lovett <ade@freebsd.org>
Subject:   Re: Serious performance issues, broken initialization, and a likely fix
Message-ID:  <20050809024813.GA24768@freefall.freebsd.org>
In-Reply-To: <42F80A41.8050901@samsco.org>
References:  <42F7D104.2020103@FreeBSD.org> <42F80A41.8050901@samsco.org>

On Mon, 08 Aug 2005 19:43:29 -0600
Scott Long <scottl@samsco.org> wrote:

> Ade Lovett wrote:
> > Or perhaps it should be just "Here be dragons"...
> >
> > Whilst attempting to nail down some serious performance issues
> > (compared with 4.x) in preparation for a 6.x rollout here, we've
> > come across something of a fundamental bug.
> >
> > In this particular environment (a Usenet transit server, so very
> > high network and disk I/O) we observed that processes were spending
> > a considerable amount of time in state 'wswbuf', traced back to
> > getpbuf() in vm/vm_pager.c
> >
> > To cut a long story short, the order in which nswbuf is being
> > initialized is completely, totally, and utterly wrong -- this was
> > introduced by revision 1.132 of vm/vnode_pager.c just over 4 years
> > ago.
> >
> > In vnode_pager.c we find:
> >
> > static void
> > vnode_pager_init(void)
> > {
> > 	vnode_pbuf_freecnt = nswbuf / 2 + 1;
> > }
> >
> > Unfortunately, nswbuf hasn't been assigned to yet, just happens to
> > be zero (in all cases), and thus the kernel believes that there is
> > only ever *one* swap buffer available.
> >
> > kern_vfs_bio_buffer_alloc() in kern/vfs_bio.c, which actually does
> > the calculation and assignment, is called rather further on in the
> > process, by which time the damage has been done.
> >
> > The net result is that *any* calls involving getpbuf() will be
> > unconditionally serialized, completely destroying any kind of
> > concurrency (and performance).
> >
> > Given the memory footprint of our machines, we've hacked in a
> > simple:
> >
> > 	nswbuf = 0x100;
> >
> > into vnode_pager_init(), since the calculation ends up giving us the
> > maximum number anyway.  There are a number of possible 'correct'
> > fixes in terms of re-ordering the startup sequence.
> >
> > With the aforementioned hack, we're now seeing considerably better
> > machine operation, certainly as good as similar 4.10-STABLE boxes.
> >
> > As per $SUBJECT, this affects all of RELENG_5, RELENG_6, and HEAD,
> > and should, IMO, be considered an absolutely required fix for
> > 6.0-RELEASE.
> >
> > -aDe
> >
>
> My vote is to revert rev 1.132 and replace the XXX comment with a more
> detailed explanation of the perils involved.  Do you have any kind of
> easy-to-run regression test that could be used to quantify this
> problem and guard against it in the future?  Thanks very very much
> for looking into it and providing such a good explanation.
>
> Scott
> _______________________________________________
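
For readers following the thread, a minimal userland model of the ordering
problem Ade describes may help.  The variable names mirror the kernel
globals, but the function bodies are deliberately simplified stand-ins, not
the real code in vm/vnode_pager.c and kern/vfs_bio.c:

    #include <stdio.h>

    static int nswbuf;              /* sized by kern_vfs_bio_buffer_alloc() */
    static int vnode_pbuf_freecnt;  /* sized by vnode_pager_init() */

    /* Stand-in for the routine run at SI_SUB_VM time. */
    static void
    vnode_pager_init(void)
    {
            /* On the broken path nswbuf is still 0, so this yields 1. */
            vnode_pbuf_freecnt = nswbuf / 2 + 1;
    }

    /* Stand-in for the routine that actually sizes nswbuf; per Ade's
       note, 0x100 is the maximum the real calculation produces. */
    static void
    kern_vfs_bio_buffer_alloc(void)
    {
            nswbuf = 0x100;
    }

    int
    main(void)
    {
            vnode_pager_init();             /* current (broken) ordering */
            kern_vfs_bio_buffer_alloc();
            printf("broken ordering: vnode_pbuf_freecnt = %d\n",
                vnode_pbuf_freecnt);

            nswbuf = 0;                     /* reset the model */
            kern_vfs_bio_buffer_alloc();    /* corrected ordering */
            vnode_pager_init();
            printf("fixed ordering:  vnode_pbuf_freecnt = %d\n",
                vnode_pbuf_freecnt);
            return (0);
    }

With the broken ordering the model prints 1, i.e. every getpbuf() caller
contends for a single pbuf and ends up sleeping in 'wswbuf'; with the
corrected ordering it prints 129.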

I experimented with calling vm_pager_init at vm_pager_bufferinit time
instead of calling it as the last thing in vm_mem_init, and my test system
runs with no (visible) ill effects. I wonder if we can collapse
vm_pager_init and vm_pager_bufferinit into a single function and get rid of
the pager initialization at SI_SUB_VM time. I guess now would be a good
time to ask our VM know-how holders.
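
To illustrate the collapse being floated here, a rough sketch using the same
simplified model as above; the name vm_pager_bufferinit is borrowed from the
real kernel, but the body is an assumption about how a combined step might
look, not a tested patch:

    #include <stdio.h>

    static int nswbuf;
    static int vnode_pbuf_freecnt;

    /* One combined init step: size nswbuf first, then derive the pager's
       pbuf limit from it, removing the need for a separate pass at
       SI_SUB_VM time. */
    static void
    vm_pager_bufferinit(void)
    {
            nswbuf = 0x100;                       /* kern_vfs_bio_buffer_alloc() work */
            vnode_pbuf_freecnt = nswbuf / 2 + 1;  /* vnode_pager_init() work */
    }

    int
    main(void)
    {
            vm_pager_bufferinit();
            printf("vnode_pbuf_freecnt = %d\n", vnode_pbuf_freecnt);
            return (0);
    }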

I do support reverting rev 1.132 of vnode_pager.c in RELENG_5 and RELENG_6
as a more conservative and safe choice though.
-- 
Alexander Kabaev


