From owner-freebsd-stable@FreeBSD.ORG  Thu Oct 30 04:37:16 2014
Date: Thu, 30 Oct 2014 14:37:05 +1000
From: Paul Koch <paul.koch@akips.com>
To: Ryan Stone <rysto32@gmail.com>
Subject: Re: Suspected kernel memory leak with mmap/sha1 ?
Message-ID: <20141030143705.22a7bd0e@akips.com>
In-Reply-To: <CAFMmRNyBWGNN3279gnhaPOYdf5PsHJf7CdUPKPy96Xz8_svHzw@mail.gmail.com>
References: <20141030100853.65a62326@akips.com>
 <CAFMmRNyBWGNN3279gnhaPOYdf5PsHJf7CdUPKPy96Xz8_svHzw@mail.gmail.com>
Organization: AKIPS
X-Mailer: Claws Mail 3.10.1 (GTK+ 2.24.22; amd64-portbld-freebsd10.0)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>

On Wed, 29 Oct 2014 22:44:03 -0400
Ryan Stone <rysto32@gmail.com> wrote:

> This is normal behaviour.  Active (and inactive) memory comprises both
> application memory and system disk cache.  By reading in all of those
> files, you have loaded the contents of those files into the cache,
> which is counted as active memory.  The kernel has no preference
> between disk cache or application memory.  Its decisions as to whether
> to swap application data out to disk or discard a page of disk cache
> are mainly based on the VM system's estimates of how often that data
> is being read.
> 
> One test that you could perform is to run your app with swap disabled.
> If your application runs faster with swap disabled, that indicates
> that the VM subsystem's heuristics are suboptimal for your workload
> and some tuning might be necessary.  If your application runs at the
> same speed or slower, then the VM subsystem is making correct (if
> counter-intuitive) decisions performance-wise.  In that case, if your
> performance is still unacceptable then your options are either to tune
> your app's algorithm to reduce its working set (including application
> and file data) to fit into memory, or install more memory in the
> machine.


Ok, we've done a bit more testing.  It appears that touching the
mmap'ed pages twice makes the kernel much more aggressive about
keeping the file cached in active memory.  Our application always
touches the data twice because it validates each block with SHA1
before uncompressing it.

Because we always read through the mmap'ed data sequentially, calling
madvise() with the MADV_DONTNEED flag after each block stops the kernel
from caching data we no longer want.  We also added a madvise() call
over the whole mapping before munmap().  This has fixed the behaviour
of both the test program and our application.

The kernel's default behaviour appears to be rather aggressive in
trying to keep the file in active memory, but since it can't tell
cached file data from application data, there's probably not much else
it can do?


--------------------------------


#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
#include <sha.h>

#define BUFSIZE (1 * 1024 * 1024)

#define TMIN(a,b) ({            \
      typeof (a) _a = (a);      \
      typeof (b) _b = (b);      \
      _a < _b ? _a : _b;        \
})

int
main (int argc, char **argv)
{
   int     i, j,
           fd = -1,
           open_flags = O_RDONLY,
           prot_flags = PROT_READ,
           /* MAP_PRIVATE or MAP_SHARED; MADV_* values are for madvise() */
           mmap_flags = MAP_PRIVATE;
   char   *filename,
          *data = NULL,
          *buf = NULL,
          *p;
   size_t  len,
           rlen;
   struct stat s;
   u_char   md[20];
   SHA_CTX  SD;


   if ((buf = malloc ((size_t) BUFSIZE)) == NULL) {
      fprintf (stderr, "malloc: %s\n", strerror (errno));
      goto END;
   }

   for (i = 1; i < argc; i++) {
      filename = argv[i];
      
      if (stat (filename, &s) != 0)
         fprintf (stderr, "stat: %s %s\n", filename, strerror (errno));

      else if ((fd = open (filename, open_flags)) == -1)
         fprintf (stderr, "open: %s %s\n", filename, strerror (errno));

      else if ((data = mmap (NULL, (size_t) s.st_size, prot_flags, 
                mmap_flags, fd, (off_t) 0)) == MAP_FAILED) {
         fprintf (stderr, "mmap: %s %s\n", filename, strerror (errno));
         close (fd);
         fd = -1;
      }

      else {
         /* st_size is an off_t, so cast it to match the format */
         printf ("%s: %lld bytes\n", filename, (long long) s.st_size);

         p = data;
         len = s.st_size;

         while (len > 0) {
            rlen = TMIN (BUFSIZE, len);

            /* Copy BUFSIZE lumps into buf and modify it so the
             * compiler doesn't optimise it out
             */
            memcpy (buf, p, rlen);
            for (j = 0; j < BUFSIZE; j += 1024)
               buf[j]++;

            /* Doing a second pass triggers the kernel to be 
             * aggressive in its file caching.
             */
            memcpy (buf, p, rlen);
            for (j = 0; j < BUFSIZE; j += 1024)
               buf[j]++;

            /* Tell the kernel we don't need this region anymore */
            madvise (p, rlen, MADV_DONTNEED);

            p += rlen;
            len -= rlen;
         }

         /* Tell the kernel we don't need this file anymore */
         madvise (data, s.st_size, MADV_DONTNEED);

         if (munmap (data, (size_t) s.st_size) == -1) {
            fprintf (stderr, "munmap: %s %s\n", filename, strerror (errno));
            goto END;
         }
         if (close (fd) == -1) {
            fprintf (stderr, "close: %s %s\n", filename, strerror (errno));
            goto END;
         }
      }
   }

END:
   free (buf);

   exit (0);
}


-- 
Paul Koch | Founder, CEO
AKIPS Network Monitor
http://www.akips.com
Brisbane, Australia