From: Dmitry Sivachenko
To: John Baldwin
Cc: freebsd-hackers@freebsd.org, Trond Endrestøl
Subject: Re: madvise() vs posix_fadvise()
Date: Fri, 4 Apr 2014 21:52:09 +0400
In-Reply-To: <201404031102.38598.jhb@freebsd.org>

On 3 Apr 2014, at 19:02, John Baldwin wrote:

>> Right now I am facing the following problem (stable/10):
>> There is a (home-grown) web server which mmaps a large number of data
>> files (their total size is a bit below the RAM size, say ~90GB of
>> files with 128GB of RAM).
>> The server writes an access.log (several gigabytes per day).
>>
>> Some of the mmap'ed data files are used frequently, some are used
>> rarely. On startup, the server walks through all of these data files
>> so their content is read from disk.
>>
>> After running for some time, I see that rarely used data files are
>> purged from RAM (access to them leads to long-running disk reads) in
>> favour of the disk cache (at 0:00, when I rotate and gzip the log
>> file, I see Inactive memory go down to the size of the log file).
>>
>> Is there any way to tell the VM system not to push mmap'ed regions
>> out of RAM in favour of disk caches?
>
> Use POSIX_FADV_NOREUSE with posix_fadvise() for the log files.
> They are a perfect use case for this flag. This will tell the VM
> system to throw the log data away (move it to cache) after it writes
> the file.

Another question is why madvise(MADV_WILLNEED) is not enough to make the
system prefer keeping mmap'ed data in memory, even rarely used data,
instead of dedicating all memory to caching log files.

While POSIX_FADV_NOREUSE might be a solution in some cases (I am already
testing it), it has to be implemented in many programs (everything that
reads or writes files on disk). madvise(MADV_WILLNEED), on the other
hand, sounds like the proper way to raise the priority of an mmap'ed
region regardless of which other programs use the disk, but it does not
seem to work as expected.
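
For reference, the two hints discussed in this thread look roughly like
this in C. This is only a minimal sketch: the helper names
open_log_noreuse() and map_data_willneed() are made up for illustration
and are not taken from the server in question. Both calls are advisory,
so the sketch ignores their return values and makes no claim about how a
particular kernel acts on them.

```c
/* Needed by glibc when compiling with a strict -std=c* flag; harmless elsewhere. */
#define _DEFAULT_SOURCE

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a log file for appending and hint that its
 * data will not be reused. Returns the descriptor, or -1 if open() fails.
 */
int open_log_noreuse(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    /*
     * offset 0 with len 0 covers the whole file. posix_fadvise() returns
     * an error number directly (it does not set errno); the hint is
     * advisory, so a failure here is deliberately ignored.
     */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);
    return fd;
}

/*
 * Hypothetical helper: map a whole data file read-only and apply
 * MADV_WILLNEED to the mapping. Returns the mapping and stores its
 * length in *len_out, or returns NULL on failure.
 */
void *map_data_willneed(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    /* Advisory hint that the mapped range is expected to be accessed. */
    (void)madvise(p, len, MADV_WILLNEED);

    *len_out = len;
    return p;
}
```

The log writer would call open_log_noreuse() once per log file and then
write() to the returned descriptor as usual; the data files would each go
through map_data_willneed() at startup.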