Date: Mon, 08 Aug 2005 14:39:16 -0700
From: Ade Lovett
To: current@FreeBSD.org
Subject: Serious performance issues, broken initialization, and a likely fix

Or perhaps it should be just "Here be dragons"...

Whilst attempting to nail down some serious performance issues (compared
with 4.x) in preparation for a 6.x rollout here, we've come across
something of a fundamental bug.

In this particular environment (a Usenet transit server, so very high
network and disk I/O) we observed processes spending a considerable
amount of time in state 'wswbuf', which we traced back to getpbuf() in
vm/vm_pager.c.

To cut a long story short, the order in which nswbuf is initialized is
completely, totally, and utterly wrong -- this was introduced by
revision 1.132 of vm/vnode_pager.c just over 4 years ago. In
vnode_pager.c we find:

    static void
    vnode_pager_init(void)
    {
            vnode_pbuf_freecnt = nswbuf / 2 + 1;
    }

Unfortunately, nswbuf has not been assigned yet at this point; it just
happens to be zero (in all cases), so the calculation comes out as
0 / 2 + 1 = 1, and the kernel believes there is only ever *one* swap
buffer available. kern_vfs_bio_buffer_alloc() in kern/vfs_bio.c, which
actually does the calculation and assignment, is called rather further
on in the boot process, by which time the damage has been done. (A toy
userland demonstration of the ordering problem is appended at the end
of this message.)

The net result is that *any* call involving getpbuf() is unconditionally
serialized, completely destroying any kind of concurrency (and
performance).

Given the memory footprint of our machines, we've hacked in a simple:

    nswbuf = 0x100;

into vnode_pager_init(), since the calculation ends up giving us the
maximum number anyway; the function as hacked is sketched below. There
are a number of possible 'correct' fixes in terms of re-ordering the
startup sequence -- one such re-ordering is also sketched below.

With the aforementioned hack, we're now seeing considerably better
machine operation, certainly as good as similar 4.10-STABLE boxes.

As per $SUBJECT, this affects all of RELENG_5, RELENG_6, and HEAD, and
should, IMO, be considered an absolutely required fix for 6.0-RELEASE.
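First, the promised toy userland demonstration of the failure mode.
This is purely illustrative -- the function and variable names below
are stand-ins of my own, not the actual kernel symbols or call graph:

    #include <stdio.h>

    /* Stand-ins for the kernel globals; illustrative names only. */
    static int nswbuf;
    static int vnode_pbuf_freecnt;

    /* Plays the role of vnode_pager_init(): runs too early. */
    static void
    pager_init(void)
    {
            vnode_pbuf_freecnt = nswbuf / 2 + 1;    /* 0 / 2 + 1 == 1 */
    }

    /* Plays the role of kern_vfs_bio_buffer_alloc(): runs too late. */
    static void
    buffer_alloc(void)
    {
            nswbuf = 256;
    }

    int
    main(void)
    {
            pager_init();           /* consumer runs before producer */
            buffer_alloc();
            printf("freecnt = %d, wanted %d\n",
                vnode_pbuf_freecnt, 256 / 2 + 1);
            return (0);
    }

Compile and run it and you get "freecnt = 1, wanted 129", which is
exactly the situation the kernel is in.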
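Second, the hack as we're running it, reconstructed from memory and
modulo whitespace -- a stopgap, not a submission-quality patch:

    static void
    vnode_pager_init(void)
    {
            /*
             * XXX hack: nswbuf has not been computed yet
             * (kern_vfs_bio_buffer_alloc() runs later), so force it
             * to the value the calculation gives us on these
             * machines anyway.
             */
            nswbuf = 0x100;
            vnode_pbuf_freecnt = nswbuf / 2 + 1;
    }

With nswbuf forced to 0x100, vnode_pbuf_freecnt comes out at 129
instead of 1, and getpbuf() callers stop piling up in 'wswbuf'.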
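Finally, one possible shape for a 'correct' fix: defer the calculation
until after kern_vfs_bio_buffer_alloc() has sized nswbuf, rather than
fudging the value. The sketch below assumes vm_pager_bufferinit() in
vm/vm_pager.c runs at a suitably late point in startup and that
vnode_pbuf_freecnt can be made visible there; I haven't verified either
assumption on all three branches, so take this as a direction rather
than a patch:

    /*
     * In vm/vm_pager.c: pick up the calculation that currently
     * (and wrongly) lives in vnode_pager_init().
     */
    void
    vm_pager_bufferinit(void)
    {
            /* ... existing pbuf/swbuf list setup ... */

            /* nswbuf has its real value by now. */
            vnode_pbuf_freecnt = nswbuf / 2 + 1;
    }

-aDe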