Date: Thu, 30 Oct 2014 14:37:05 +1000
From: Paul Koch
To: Ryan Stone
Cc: "freebsd-stable@freebsd.org"
Subject: Re: Suspected kernel memory leak with mmap/sha1 ?

On Wed, 29 Oct 2014 22:44:03 -0400
Ryan Stone wrote:

> This is normal behaviour.  Active (and inactive) memory comprises both
> application memory and system disk cache.  By reading in all of those
> files, you have loaded the contents of those files into the cache,
> which is counted as active memory.  The kernel has no preference
> between disk cache or application memory.
> Its decisions as to whether
> to swap application data out to disk or discard a page of disk cache
> are mainly based on the VM system's estimates of how often that data
> is being read.
>
> One test that you could perform is to run your app with swap disabled.
> If your application runs faster with swap disabled, that indicates
> that the VM subsystem's heuristics are suboptimal for your workload
> and some tuning might be necessary.  If your application runs at the
> same speed or slower, then the VM subsystem is making correct (if
> counter-intuitive) decisions performance-wise.  In that case, if your
> performance is still unacceptable then your options are either to tune
> your app's algorithm to reduce its working set (including application
> and file data) to fit into memory, or install more memory in the
> machine.

OK, we've done a bit more testing.

It appears that touching the mmap'ed pages twice makes the kernel much
more aggressive about keeping the file in active memory.  We hit this
because we always validate the data with SHA1 before uncompressing it,
so each block is read twice.

Because we always read through the mmap'ed data sequentially, calling
madvise() with MADV_DONTNEED after each block stops the kernel from
caching data we no longer want.  We also added a madvise() call over
the whole mapping before munmap().

This has fixed the behaviour of both the test program and our
application.

The kernel's default behaviour appears to be rather aggressive in
trying to keep the file in active memory, but since it can't tell
cached file data from application data, there's probably not much
else it can do?

--------------------------------

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <sha.h>        /* SHA_CTX; libmd header assumed */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUFSIZE (1 * 1024 * 1024)

/* Type-safe minimum (GNU C statement expression) */
#define TMIN(a,b) ({ \
    typeof (a) _a = (a); \
    typeof (b) _b = (b); \
    _a < _b ? _a : _b; \
})

int
main (int argc, char **argv)
{
    int i, j, fd = -1;
    int open_flags = O_RDONLY;
    int prot_flags = PROT_READ;
    /* mmap() takes MAP_* flags; MADV_* values are for madvise() */
    int mmap_flags = MAP_PRIVATE;
    char *filename, *data = NULL, *buf = NULL, *p;
    size_t len, rlen;
    struct stat s;
    u_char md[20];      /* unused in this cut-down test */
    SHA_CTX SD;         /* unused in this cut-down test */

    if ((buf = malloc ((size_t) BUFSIZE)) == NULL) {
        fprintf (stderr, "malloc: %s\n", strerror (errno));
        goto END;
    }

    for (i = 1; i < argc; i++) {
        filename = argv[i];
        if (stat (filename, &s) != 0)
            fprintf (stderr, "stat: %s %s\n", filename, strerror (errno));
        else if ((fd = open (filename, open_flags)) == -1)
            fprintf (stderr, "open: %s %s\n", filename, strerror (errno));
        else if ((data = mmap (NULL, (size_t) s.st_size, prot_flags,
                               mmap_flags, fd, (off_t) 0)) == MAP_FAILED) {
            fprintf (stderr, "mmap: %s %s\n", filename, strerror (errno));
            close (fd);
            fd = -1;
        }
        else {
            printf ("%s: %zu bytes\n", filename, (size_t) s.st_size);
            p = data;
            len = (size_t) s.st_size;
            while (len > 0) {
                rlen = TMIN (BUFSIZE, len);

                /* Copy BUFSIZE lumps into buf and modify it so the
                 * compiler doesn't optimise it out */
                memcpy (buf, p, rlen);
                for (j = 0; j < BUFSIZE; j += 1024)
                    buf[j]++;

                /* Doing a second pass triggers the kernel to be
                 * aggressive in its file caching. */
                memcpy (buf, p, rlen);
                for (j = 0; j < BUFSIZE; j += 1024)
                    buf[j]++;

                /* Tell the kernel we don't need this region anymore */
                madvise (p, rlen, MADV_DONTNEED);

                p += rlen;
                len -= rlen;
            }

            /* Tell the kernel we don't need this file anymore */
            madvise (data, (size_t) s.st_size, MADV_DONTNEED);

            if (munmap (data, (size_t) s.st_size) == -1) {
                fprintf (stderr, "munmap: %s %s\n", filename,
                         strerror (errno));
                goto END;
            }
            if (close (fd) == -1) {
                fprintf (stderr, "close: %s %s\n", filename,
                         strerror (errno));
                goto END;
            }
        }
    }

END:
    free (buf);
    exit (0);
}

-- 
Paul Koch | Founder, CEO
AKIPS Network Monitor
http://www.akips.com
Brisbane, Australia