Date: Tue, 9 Aug 2005 02:48:13 +0000
From: Alexander Kabaev <kan@freebsd.org>
To: Scott Long <scottl@samsco.org>
Cc: current@freebsd.org, Ade Lovett <ade@freebsd.org>
Subject: Re: Serious performance issues, broken initialization, and a likely fix
Message-ID: <20050809024813.GA24768@freefall.freebsd.org>
In-Reply-To: <42F80A41.8050901@samsco.org>
References: <42F7D104.2020103@FreeBSD.org> <42F80A41.8050901@samsco.org>
On Mon, 08 Aug 2005 19:43:29 -0600 Scott Long <scottl@samsco.org> wrote:

> Ade Lovett wrote:
> > Or perhaps it should be just "Here be dragons"...
> >
> > Whilst attempting to nail down some serious performance issues
> > (compared with 4.x) in preparation for a 6.x rollout here, we've
> > come across something of a fundamental bug.
> >
> > In this particular environment (a Usenet transit server, so very
> > high network and disk I/O) we observed that processes were spending
> > a considerable amount of time in state 'wswbuf', traced back to
> > getpbuf() in vm/vm_pager.c.
> >
> > To cut a long story short, the order in which nswbuf is being
> > initialized is completely, totally, and utterly wrong -- this was
> > introduced by revision 1.132 of vm/vnode_pager.c just over 4 years
> > ago.
> >
> > In vnode_pager.c we find:
> >
> >	static void
> >	vnode_pager_init(void)
> >	{
> >		vnode_pbuf_freecnt = nswbuf / 2 + 1;
> >	}
> >
> > Unfortunately, nswbuf hasn't been assigned to yet, just happens to
> > be zero (in all cases), and thus the kernel believes that there is
> > only ever *one* swap buffer available.
> >
> > kern_vfs_bio_buffer_alloc() in kern/vfs_bio.c, which actually does
> > the calculation and assignment, is called rather further on in the
> > process, by which time the damage has been done.
> >
> > The net result is that *any* calls involving getpbuf() will be
> > unconditionally serialized, completely destroying any kind of
> > concurrency (and performance).
> >
> > Given the memory footprint of our machines, we've hacked in a
> > simple:
> >
> >	nswbuf = 0x100;
> >
> > into vnode_pager_init(), since the calculation ends up giving us the
> > maximum number anyway. There are a number of possible 'correct'
> > fixes in terms of re-ordering the startup sequence.
> >
> > With the aforementioned hack, we're now seeing considerably better
> > machine operation, certainly as good as similar 4.10-STABLE boxes.
> >
> > As per $SUBJECT, this affects all of RELENG_5, RELENG_6, and HEAD,
> > and should, IMO, be considered an absolutely required fix for
> > 6.0-RELEASE.
> >
> > -aDe
> >
>
> My vote is to revert rev 1.132 and replace the XXX comment with a more
> detailed explanation of the perils involved. Do you have any kind of
> easy-to-run regression test that could be used to quantify this
> problem and guard against it in the future? Thanks very much
> for looking into it and providing such a good explanation.
>
> Scott

I experimented with calling vm_pager_init() at vm_pager_bufferinit()
time instead of calling it as the last thing in vm_mem_init(), and my
test system runs with no (visible) ill effects. I wonder if we can
collapse vm_pager_init() and vm_pager_bufferinit() into a single
function and get rid of the pager initialization at SI_SUB_VM time. I
guess now would be a good time to ask our VM know-how holders.

I do support reverting rev 1.132 of vnode_pager.c in RELENG_5 and
RELENG_6 as a more conservative and safe choice, though.

-- 
Alexander Kabaev