From owner-freebsd-stable@FreeBSD.ORG Wed Mar 7 00:48:57 2012
From: Ian Lepore
To: Luke Marsden
Cc: freebsd-fs@freebsd.org, team@hybrid-logic.co.uk, freebsd-stable@freebsd.org
Subject: Re: FreeBSD 8.2 - active plus inactive memory leak!?
Date: Tue, 06 Mar 2012 17:48:47 -0700
In-Reply-To: <1331061203.2218.38.camel@pow>
References: <1331061203.2218.38.camel@pow>
Message-ID: <1331081327.32194.19.camel@revolution.hippie.lan>

On Tue, 2012-03-06 at 19:13 +0000, Luke Marsden wrote:
> Hi all,
>
> I'm having some trouble with some production 8.2-RELEASE servers where
> the 'Active' and 'Inact' memory values reported by top don't seem to
> correspond with the processes which are running on the machine.  I have
> two near-identical machines (with slightly different workloads); on one,
> let's call it A, active + inactive is small (6.5G), and on the other (B)
> active + inactive is large (13.6G), even though they have almost
> identical sums-of-resident memory (8.3G on A and 9.3G on B).
>
> The only difference is that A has a smaller number of quite long-running
> processes (it's hosting a small number of busy sites) and B has a larger
> number of more frequently killed/recycled processes (it's hosting a
> larger number of quiet sites, so the FastCGI processes get killed and
> restarted frequently).  Notably, B has many more ZFS filesystems mounted
> than A (around 4,000 versus 100).  The machines are otherwise under
> similar amounts of load.  I hope the community can help me understand
> what's going on with respect to the worryingly large amount of
> active + inactive memory on B.
>
> Both machines are ZFS-on-root FreeBSD 8.2-RELEASE systems with uptimes
> of around 5-6 days.  I have recently reduced the ARC cache on both
> machines since my previous thread [1], and Wired memory usage is now
> stable at 6G on A and 7G on B with an arc_max of 4G on both machines.
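
As a side note, the ARC size and cap mentioned above don't have to be
inferred from the Wired figure; they are exported directly as sysctls.
A minimal sketch in Python (assuming the stock sysctl(8) utility and the
kstat.zfs.misc.arcstats.size / vfs.zfs.arc_max OIDs as found on 8.x;
this is a convenience script, not something from the original mail):

    #!/usr/bin/env python
    # Sketch: show the current ZFS ARC size against the configured
    # vfs.zfs.arc_max cap.  Assumes FreeBSD with ZFS loaded and
    # sysctl(8) on the PATH.  (Python 2.7+ or 3.x.)
    import subprocess

    def sysctl_bytes(name):
        """Read a numeric sysctl via 'sysctl -n' and return it as an int."""
        out = subprocess.check_output(["sysctl", "-n", name])
        return int(out.decode("ascii").strip())

    arc_size = sysctl_bytes("kstat.zfs.misc.arcstats.size")
    arc_max = sysctl_bytes("vfs.zfs.arc_max")
    print("ARC: %d MB in use of a %d MB cap (%.0f%%)"
          % (arc_size // 2 ** 20, arc_max // 2 ** 20,
             100.0 * arc_size / arc_max))
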
>
> Neither of the machines has any swap in use:
>
>     Swap: 10G Total, 10G Free
>
> My current (probably quite simplistic) understanding of the FreeBSD
> virtual memory system is that, for each process as reported by top:
>
>       * Size corresponds to the total size of all the text pages for
>         the process (those belonging to code in the binary itself and
>         linked libraries) plus data pages (including stack and
>         malloc()'d but not-yet-written-to memory segments).
>       * Resident corresponds to a subset of the pages above: those
>         pages which actually occupy physical/core memory.  Notably,
>         pages may appear in size but not in resident: for example,
>         read-only text pages from libraries which have not been
>         touched yet, or memory which has been malloc()'d but not yet
>         written to.
>
> My understanding of the values for the system as a whole (at the top
> in 'top') is as follows:
>
>       * Active and inactive memory are essentially the same thing:
>         resident memory from processes in use.  Being in the inactive
>         as opposed to active list simply indicates that the pages in
>         question are less recently used and therefore more likely to
>         get swapped out if the machine comes under memory pressure.
>       * Wired is mostly kernel memory.
>       * Cache is freed memory which the kernel has decided to keep in
>         case it corresponds to a useful page in the future; it can be
>         cheaply evicted into the free list.
>       * Free memory is actually not being used for anything.
>
> It seems that pages which occur in the active + inactive lists must
> occur in the resident memory of one or more processes ("or more" since
> processes can share pages in e.g. read-only shared libs or COW forked
> address space).  Conversely, if a page *does not* occur in the resident
> memory of any process, it must not occupy any space in the active +
> inactive lists.
>
> Therefore the active + inactive memory should always be less than or
> equal to the sum of the resident memory of all the processes on the
> system, right?
>
> But it's not.  So, I wrote a very simple Python script to add up the
> resident memory values in the output from 'top' and, on machine A:
>
>     Mem: 3388M Active, 3209M Inact, 6066M Wired, 196K Cache, 11G Free
>     There were 246 processes totalling 8271 MB resident memory
>
> Whereas on machine B:
>
>     Mem: 11G Active, 2598M Inact, 7177M Wired, 733M Cache, 1619M Free
>     There were 441 processes totalling 9297 MB resident memory
>
> Now, on machine A:
>
>     3388M active + 3209M inactive - 8271M sum-of-resident = -1674M
>
> I can attribute this negative value to shared libraries between the
> running processes (which the sum-of-res is double-counting but active +
> inactive is not).  But on machine B:
>
>     11264M active + 2598M inactive - 9297M sum-of-resident = 4565M
>
> I'm struggling to explain how, when there are only 9.2G (worst case,
> discounting shared pages) of resident processes, the system is using
> 11G + 2598M = 13.8G of memory!
>
> This "missing memory" is scary, because it seems to be increasing over
> time, and eventually, when the system runs out of free memory, I'm
> certain it will crash in the same way described in my previous thread
> [1].
>
> Is my understanding of the virtual memory system badly broken - in
> which case please educate me ;-) - or is there a real problem here?
> If so, how can I dig deeper to help uncover/fix it?
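
The summing script itself isn't attached, but the comparison is easy to
reproduce.  The sketch below does the same arithmetic from batch-mode
top(1) output: it reads Active and Inact from the "Mem:" line, locates
the RES column from top's own header row, and subtracts the per-process
total.  It assumes the usual K/M/G size suffixes and an 8.x-style
layout, so treat it as a starting point rather than a finished tool:

    #!/usr/bin/env python
    # Sketch: compare top's Active+Inact total against the sum of
    # per-process RES values, as discussed above.  Assumes FreeBSD
    # top(1) batch output.  (Python 2.7+ or 3.x.)
    import re
    import subprocess

    SUFFIX = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}

    def to_bytes(field):
        """Convert a top-style size such as '3388M' or '196K' to bytes."""
        m = re.match(r"^(\d+)([KMGT]?)$", field)
        return int(m.group(1)) * SUFFIX.get(m.group(2), 1) if m else 0

    def mb(n):
        return n // (1024 ** 2)

    # "top -b N" prints one batch display of up to N processes and exits.
    out = subprocess.check_output(["top", "-b", "999999"])
    out = out.decode("ascii", "replace")

    active = inact = total_res = 0
    res_col = None
    for line in out.splitlines():
        fields = line.split()
        if line.startswith("Mem:"):
            # e.g. "Mem: 3388M Active, 3209M Inact, 6066M Wired, ..."
            for size, label in re.findall(r"(\d+[KMGT]?) (\w+)", line):
                if label == "Active":
                    active = to_bytes(size)
                elif label == "Inact":
                    inact = to_bytes(size)
        elif fields and fields[0] == "PID":
            res_col = fields.index("RES")           # layout from the header
        elif res_col is not None and fields and fields[0].isdigit():
            total_res += to_bytes(fields[res_col])  # one process per row

    print("active+inact = %d MB, sum of RES = %d MB, difference = %d MB"
          % (mb(active + inact), mb(total_res),
             mb(active + inact) - mb(total_res)))
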
>
> Best Regards,
> Luke Marsden
>
> [1] lists.freebsd.org/pipermail/freebsd-fs/2012-February/013775.html
> [2] https://gist.github.com/1988153
>

In my experience, the bulk of the memory in the inactive category is
cached disk blocks, at least for ufs (I think zfs does things
differently).  On this desktop machine I have 12G physical and typically
have roughly 11G inactive, and I can unmount one particular filesystem
where most of my work is done and instantly I have almost no inactive
and roughly 11G free.

-- Ian
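
For anyone who wants to watch that unmount effect in numbers rather than
in top, the page counters are exported as sysctls; a minimal sketch
(assuming the vm.stats.vm.* counters and hw.pagesize, same Python
approach as the scripts above):

    #!/usr/bin/env python
    # Sketch: snapshot the Active/Inact/Free page counters in MB, so the
    # effect of unmounting a filesystem can be seen directly.  Run it
    # before and after the umount and compare.  (Python 2.7+ or 3.x.)
    import subprocess

    def sysctl_int(name):
        out = subprocess.check_output(["sysctl", "-n", name])
        return int(out.decode("ascii").strip())

    page = sysctl_int("hw.pagesize")
    for label, oid in (("Active", "vm.stats.vm.v_active_count"),
                       ("Inact",  "vm.stats.vm.v_inactive_count"),
                       ("Free",   "vm.stats.vm.v_free_count")):
        print("%-7s %6d MB" % (label, sysctl_int(oid) * page // 2 ** 20))
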