From: Luke Marsden
To: freebsd-stable@freebsd.org, freebsd-fs@freebsd.org
Cc: team@hybrid-logic.co.uk
Date: Tue, 06 Mar 2012 19:13:23 +0000
Subject: FreeBSD 8.2 - active plus inactive memory leak!?

Hi all,

I'm having some trouble with some production 8.2-RELEASE servers where the 'Active' and 'Inact' memory values reported by top don't seem to correspond with the processes which are running on the machine.
I have two near-identical machines (with slightly different workloads); on one, let's call it A, active + inactive is small (6.5G) and on the other (B) active + inactive is large (13.6G), even though they have almost identical sums-of-resident memory (8.3G on A and 9.3G on B).

The only difference is that A has a smaller number of quite long-running processes (it's hosting a small number of busy sites) and B has a larger number of more frequently killed/recycled processes (it's hosting a larger number of quiet sites, so the FastCGI processes get killed and restarted frequently). Notably B has many more ZFS filesystems mounted than A (around 4,000 versus 100). The machines are otherwise under similar amounts of load.

I hoped that the community could please help me understand what's going on with respect to the worryingly large amount of active + inactive memory on B.

Both machines are ZFS-on-root with FreeBSD 8.2-RELEASE with uptimes around 5-6 days. I have recently reduced the ARC size on both machines following my previous thread [1], and Wired memory usage is now stable at 6G on A and 7G on B with an arc_max of 4G on both machines.

Neither machine has any swap in use:

    Swap: 10G Total, 10G Free

My current (probably quite simplistic) understanding of the FreeBSD virtual memory system is that, for each process as reported by top:

  * Size corresponds to the total size of all the text pages for the process (those belonging to code in the binary itself and linked libraries) plus data pages (including stack and malloc()'d but not-yet-written-to memory segments).

  * Resident corresponds to a subset of the pages above: those pages which actually occupy physical/core memory. Notably, pages may appear in size but not in resident, for example read-only text pages from libraries which have not been used yet, or memory which has been malloc()'d but not yet written to.
My understanding of the values for the system as a whole (at the top in 'top') is as follows:

  * Active and inactive memory are the same kind of thing: resident memory from processes in use. Being in the inactive as opposed to the active list simply indicates that the pages in question are less recently used and therefore more likely to get swapped out if the machine comes under memory pressure.

  * Wired is mostly kernel memory.

  * Cache is freed memory which the kernel has decided to keep in case it corresponds to a useful page in future; it can be cheaply evicted into the free list.

  * Free memory is actually not being used for anything.

It seems that pages which occur in the active + inactive lists must occur in the resident memory of one or more processes ("or more" since processes can share pages in e.g. read-only shared libs or COW forked address space). Conversely, if a page *does not* occur in the resident memory of any process, it must not occupy any space in the active + inactive lists.

Therefore the active + inactive memory should always be less than or equal to the sum of the resident memory of all the processes on the system, right?

But it's not. So, I wrote a very simple Python script to add up the resident memory values in the output from 'top' and, on machine A:

    Mem: 3388M Active, 3209M Inact, 6066M Wired, 196K Cache, 11G Free
    There were 246 processes totalling 8271 MB resident memory

Whereas on machine B:

    Mem: 11G Active, 2598M Inact, 7177M Wired, 733M Cache, 1619M Free
    There were 441 processes totalling 9297 MB resident memory

Now, on machine A:

    3388M active + 3209M inactive - 8271M sum-of-resident = -1674M

I can attribute this negative value to pages shared between the running processes, such as shared libraries (which the sum-of-resident double-counts but active + inactive does not).
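A minimal sketch of the kind of summation described above follows. It is an illustration rather than the script actually used: the function names are hypothetical, and it assumes RES is the seventh whitespace-separated column of top's per-process lines (PID USERNAME THR PRI NICE SIZE RES ...), which may differ between top versions and display modes.

```python
import re

# Multipliers to convert top's K/M/G suffixes into megabytes.
UNIT_MB = {"K": 1.0 / 1024, "M": 1.0, "G": 1024.0}


def to_mb(value):
    """Convert a top-style size such as '196K', '3388M' or '11G' to MB."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMG])", value)
    if match is None:
        raise ValueError("unrecognised size: %r" % value)
    number, unit = match.groups()
    return float(number) * UNIT_MB[unit]


def parse_mem_line(line):
    """Return e.g. {'Active': 3388.0, 'Inact': 3209.0, ...} from a 'Mem:' line."""
    fields = re.findall(r"(\d+(?:\.\d+)?[KMG])\s+(\w+)", line)
    return {name: to_mb(size) for size, name in fields}


def sum_resident(process_lines, res_column=6):
    """Sum the RES column (an assumed position) over top's process lines.

    Shared pages are counted once per process, so this is an upper bound
    on the physical memory those processes occupy.
    """
    total = 0.0
    for line in process_lines:
        fields = line.split()
        if len(fields) > res_column:
            total += to_mb(fields[res_column])
    return total


def unexplained_mb(mem_line, resident_total_mb):
    """Active + Inact minus the sum of resident memory, in MB."""
    mem = parse_mem_line(mem_line)
    return mem["Active"] + mem["Inact"] - resident_total_mb


# Figures quoted in this message for machines A and B.
MEM_A = "Mem: 3388M Active, 3209M Inact, 6066M Wired, 196K Cache, 11G Free"
MEM_B = "Mem: 11G Active, 2598M Inact, 7177M Wired, 733M Cache, 1619M Free"

print("A: %.0fM" % unexplained_mb(MEM_A, 8271))
print("B: %.0fM" % unexplained_mb(MEM_B, 9297))
```

Note that top rounds large values (the "11G" on B is treated here as exactly 11264M), so the result for B is only accurate to within that rounding.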
But on machine B:

    11264M active + 2598M inactive - 9297M sum-of-resident = 4565M

I'm struggling to explain how, when there is only 9.2G of resident process memory (worst case, ignoring the double-counting of shared pages), the system is using 11G + 2598M = 13.8G of active + inactive memory!

This "missing memory" is scary, because it seems to be increasing over time, and eventually, when the system runs out of free memory, I'm certain it will crash in the same way described in my previous thread [1].

Is my understanding of the virtual memory system badly broken - in which case please educate me ;-) or is there a real problem here? If so, how can I dig deeper to help uncover/fix it?

Best Regards,
Luke Marsden

[1] lists.freebsd.org/pipermail/freebsd-fs/2012-February/013775.html
[2] https://gist.github.com/1988153

--
CTO, Hybrid Logic
+447791750420 | +1-415-449-1165 | www.hybrid-cluster.com