From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 3 15:44:06 2014
From: Ian Lepore
To: John Baldwin
Cc: freebsd-hackers@FreeBSD.org, Dmitry Sivachenko, Trond Endrestøl
Subject: Re: madvise() vs posix_fadvise()
Date: Thu, 03 Apr 2014 09:43:57 -0600
Message-ID: <1396539837.81853.278.camel@revolution.hippie.lan>
In-Reply-To: <201404031102.38598.jhb@freebsd.org>
List-Id: Technical Discussions relating to FreeBSD

On Thu, 2014-04-03 at 11:02 -0400, John Baldwin wrote:
> On Thursday, April 03, 2014 7:29:03 am Dmitry Sivachenko wrote:
> >
> > On 27 March 2014, at 19:41, John Baldwin wrote:
> > >>
> > >> I know about mlock(2), it is a bit overkill.
> > >> Can someone please explain the difference between madvise(MADV_WILLNEED) and
> > >> posix_fadvise(POSIX_FADV_WILLNEED)?
> > >
> > > Right now FADV_WILLNEED is a nop.  (I have some patches to implement it for
> > > UFS.)  I can't recall off the top of my head if MADV_WILLNEED is also a nop.
> > > However, if both are fully implemented they should be similar in terms of
> > > requesting async read-ahead.  MADV_WILLNEED might also conceivably
> > > pre-create PTEs while FADV_WILLNEED can be used on a file that isn't
> > > mapped but is accessed via read(2).
> > >
> >
> > Hello and thanks for your reply.
> >
> > Right now I am facing the following problem (stable/10):
> > There is a (home-grown) webserver which mmaps a large number of data files
> > (total size is a bit below the size of RAM, say ~90GB of files with 128GB of RAM).
> > The server writes access.log (several gigabytes per day).
> >
> > Some of the mmapped data files are used frequently, some are used rarely.
> > On startup, the server walks through all of these data files so their
> > content is read from disk.
> >
> > After running for some time, I see that rarely used data files are purged
> > from RAM (access to them leads to long-running disk reads) in favour of
> > disk cache (at 0:00, when I rotate and gzip the log file, I see Inactive
> > memory go down to the size of the log file).
> >
> > Is there any way to tell the VM system not to push mmapped regions out of
> > RAM in favour of disk caches?
>
> Use POSIX_FADV_NOREUSE with posix_fadvise() for the log files.  They are a
> perfect use case for this flag.  This will tell the VM system to throw the
> log data away (move it to cache) after it writes the file.
>
> --
> John Baldwin

Does that work well in the case of something like /var/log/messages that
is repeatedly appended to at random intervals?  It would be bad if every
new line written to the log triggered a physical read-modify-write.  On
the other hand, if it somehow results in the last (partial) block being
the only one likely to stay in memory, that would be perfect.

-- Ian
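
For reference, here is a minimal sketch of the suggestion quoted above: open a
log file for appending and apply POSIX_FADV_NOREUSE to its descriptor. It is
written in Python only for brevity; os.posix_fadvise() is a thin wrapper around
the same system call. The helper names open_log_noreuse and append_line and the
file name used in the demo are hypothetical, and how the kernel acts on the hint
(including the partial-block question raised above) is platform-dependent and
not something this sketch demonstrates.

```python
import os
import tempfile


def open_log_noreuse(path):
    """Open `path` for appending and hint that its data will not be reused.

    Hypothetical helper for illustration. The advice is only a hint, so a
    platform that lacks or rejects it is tolerated silently.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    if hasattr(os, "posix_fadvise"):
        try:
            # offset=0 with len=0 covers all data from the start of the file.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_NOREUSE)
        except OSError:
            pass  # the hint is optional; logging still works without it
    return fd


def append_line(fd, line):
    """Append one line to the log and return the number of bytes written."""
    return os.write(fd, line.encode("utf-8") + b"\n")


if __name__ == "__main__":
    # Demo against a throwaway file (assumed name, not from the thread).
    log_path = os.path.join(tempfile.mkdtemp(), "access.log")
    log_fd = open_log_noreuse(log_path)
    append_line(log_fd, "GET / 200")
    os.close(log_fd)
```

The advice is applied once, right after open, because it is attached to the
descriptor rather than to individual writes; whether that is enough to keep
log data from displacing the mmapped files would have to be measured on the
system in question.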