Date: Mon, 08 Aug 2005 14:39:16 -0700
From: Ade Lovett
To: current@FreeBSD.org
Subject: Serious performance issues, broken initialization, and a likely fix

Or perhaps it should be just "Here be dragons"...

Whilst attempting to nail down some serious performance issues (compared
with 4.x) in preparation for a 6.x rollout here, we've come across
something of a fundamental bug.

In this particular environment (a Usenet transit server, so very high
network and disk I/O) we observed processes spending a considerable
amount of time in state 'wswbuf', which we traced back to getpbuf() in
vm/vm_pager.c.

To cut a long story short, the order in which nswbuf is initialized is
completely, totally, and utterly wrong -- this was introduced by
revision 1.132 of vm/vnode_pager.c just over 4 years ago. In
vnode_pager.c we find:

    static void
    vnode_pager_init(void)
    {
            vnode_pbuf_freecnt = nswbuf / 2 + 1;
    }

Unfortunately, nswbuf has not been assigned yet at this point; it just
happens to be zero (in all cases), so the calculation comes out as
0 / 2 + 1 = 1, and the kernel believes there is only ever *one* swap
buffer available. kern_vfs_bio_buffer_alloc() in kern/vfs_bio.c, which
actually does the calculation and assignment, is called rather further
on in the boot process, by which time the damage has been done. (A toy
userland demonstration of the ordering problem is appended at the end
of this message.)

The net result is that *any* call involving getpbuf() is unconditionally
serialized, completely destroying any kind of concurrency (and
performance).

Given the memory footprint of our machines, we've hacked in a simple:

    nswbuf = 0x100;

into vnode_pager_init(), since the calculation ends up giving us the
maximum number anyway; the function as hacked is sketched below. There
are a number of possible 'correct' fixes in terms of re-ordering the
startup sequence -- one such re-ordering is also sketched below.

With the aforementioned hack, we're now seeing considerably better
machine operation, certainly as good as similar 4.10-STABLE boxes.

As per $SUBJECT, this affects all of RELENG_5, RELENG_6, and HEAD, and
should, IMO, be considered an absolutely required fix for 6.0-RELEASE.
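First, the promised toy userland demonstration of the failure mode.
This is purely illustrative -- the function and variable names below
are stand-ins of my own, not the actual kernel symbols or call graph:

    #include <stdio.h>

    /* Stand-ins for the kernel globals; illustrative names only. */
    static int nswbuf;
    static int vnode_pbuf_freecnt;

    /* Plays the role of vnode_pager_init(): runs too early. */
    static void
    pager_init(void)
    {
            vnode_pbuf_freecnt = nswbuf / 2 + 1;    /* 0 / 2 + 1 == 1 */
    }

    /* Plays the role of kern_vfs_bio_buffer_alloc(): runs too late. */
    static void
    buffer_alloc(void)
    {
            nswbuf = 256;
    }

    int
    main(void)
    {
            pager_init();           /* consumer runs before producer */
            buffer_alloc();
            printf("freecnt = %d, wanted %d\n",
                vnode_pbuf_freecnt, 256 / 2 + 1);
            return (0);
    }

Compile and run it and you get "freecnt = 1, wanted 129", which is
exactly the situation the kernel is in.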
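Second, the hack as we're running it, reconstructed from memory and
modulo whitespace -- a stopgap, not a submission-quality patch:

    static void
    vnode_pager_init(void)
    {
            /*
             * XXX hack: nswbuf has not been computed yet
             * (kern_vfs_bio_buffer_alloc() runs later), so force it
             * to the value the calculation gives us on these
             * machines anyway.
             */
            nswbuf = 0x100;
            vnode_pbuf_freecnt = nswbuf / 2 + 1;
    }

With nswbuf forced to 0x100, vnode_pbuf_freecnt comes out at 129
instead of 1, and getpbuf() callers stop piling up in 'wswbuf'.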
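Finally, one possible shape for a 'correct' fix: defer the calculation
until after kern_vfs_bio_buffer_alloc() has sized nswbuf, rather than
fudging the value. The sketch below assumes vm_pager_bufferinit() in
vm/vm_pager.c runs at a suitably late point in startup and that
vnode_pbuf_freecnt can be made visible there; I haven't verified either
assumption on all three branches, so take this as a direction rather
than a patch:

    /*
     * In vm/vm_pager.c: pick up the calculation that currently
     * (and wrongly) lives in vnode_pager_init().
     */
    void
    vm_pager_bufferinit(void)
    {
            /* ... existing pbuf/swbuf list setup ... */

            /* nswbuf has its real value by now. */
            vnode_pbuf_freecnt = nswbuf / 2 + 1;
    }

-aDe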