From: Matthew Dillon
Date: Sun, 20 Apr 2003 11:28:47 -0700 (PDT)
Message-Id: <200304201828.h3KISlKq090099@apollo.backplane.com>
To: David Schultz
cc: freebsd-current@freebsd.org
References: <000501c30682$4e5e64b0$6601a8c0@VAIO650> <20030420002940.GB46590@HAL9000.homeunix.com> <20030420101401.GA2821@HAL9000.homeunix.com>
Subject: Re: Broken memory management on system with no swap

Hmm. It sounds like this program is using mmap() to dirty pages and
that the VM system is not flushing them out quickly enough to avoid
running out of memory.

This could happen if the program dirties a significant portion of
memory all at once. The pageout daemon would wind up doing a priority
requeue of the dirty pages (line 848 of vm_pageout.c) and 'miss'
flushing any of them out to the filesystem in the first pass.
The result would be that the system would believe it has run out of
memory for a short period of time.

We don't want to disable the requeue code, because doing so would
destroy pageout performance for any other case. The 'pass' variable
was supposed to deal with this case (by forcing the laundering if the
system is unable to find enough pages to free the first time), see
line 848 again. But in looking at the code, the big-process-kill
sequence is still run in the first pass (pass == 0), and it probably
shouldn't be run until the second pass (pass != 0).

I suggest changing this:

    if ((vm_swap_size < 64 && vm_page_count_min()) ||
        (swap_pager_full && vm_paging_target() > 0)) {

To this:

    if (pass != 0 &&
        ((vm_swap_size < 64 && vm_page_count_min()) ||
        (swap_pager_full && vm_paging_target() > 0))) {

Assuming this fixes the problem, I would request that it be tested in
true out-of-memory situations, to ensure that the big-process-kill
code still works properly, before committing it.

					-Matt
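
For anyone who wants to poke at the proposed condition outside the
kernel, the following is a minimal userland sketch, not the
vm_pageout.c code itself. The struct and the function names
(vm_snapshot, out_of_memory, should_kill_current,
should_kill_proposed) are hypothetical; the fields merely stand in
for vm_swap_size, vm_page_count_min(), swap_pager_full and
vm_paging_target().

```c
#include <stdbool.h>

/*
 * Hypothetical snapshot of the state the kill test looks at.  In the
 * kernel these are globals and function calls; here they are plain
 * fields so the logic can be exercised in userland.
 */
struct vm_snapshot {
	int	swap_size;		/* stands in for vm_swap_size */
	bool	page_count_min;		/* stands in for vm_page_count_min() */
	bool	swap_pager_full;	/* stands in for swap_pager_full */
	int	paging_target;		/* stands in for vm_paging_target() */
};

/* The out-of-memory test shared by both versions of the condition. */
bool
out_of_memory(const struct vm_snapshot *s)
{
	return ((s->swap_size < 64 && s->page_count_min) ||
	    (s->swap_pager_full && s->paging_target > 0));
}

/* Current behavior: the pass number is ignored. */
bool
should_kill_current(int pass, const struct vm_snapshot *s)
{
	(void)pass;
	return (out_of_memory(s));
}

/* Proposed behavior: never kill during the first pass (pass == 0). */
bool
should_kill_proposed(int pass, const struct vm_snapshot *s)
{
	return (pass != 0 && out_of_memory(s));
}
```

With swap_size = 0 and page_count_min true, should_kill_current()
fires on pass 0 while should_kill_proposed() holds off until pass 1,
which is the behavior the change is after.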