Subject: Re: madvise() vs posix_fadvise()
From: Dmitry Sivachenko
Date: Wed, 9 Apr 2014 15:04:22 +0400
To: John Baldwin
Cc: freebsd-hackers@freebsd.org, Ian Lepore
Message-Id: <4E4C7AC3-A802-4DFF-8F30-8985CBE2E3D1@gmail.com>
In-Reply-To: <201404031527.59901.jhb@freebsd.org>
References: <201404031230.40380.jhb@freebsd.org> <2CB392D0-5198-41EB-8191-8B02FE432334@gmail.com> <201404031527.59901.jhb@freebsd.org>

On 3 Apr 2014, at 23:27, John Baldwin wrote:

> On Thursday, April 03, 2014 3:10:49 pm Dmitry Sivachenko wrote:
>>
>> On 3 Apr 2014, at 20:30, John Baldwin wrote:
>>
>>>
>>> The latter. It's sort of like a lazy O_DIRECT. Each time you call write(2),
>>> it tries to move any clean pages from your current sequentially written
>>> stream from inactive to cache, so the pages won't move until a subsequent
>>> write(2) after bufdaemon or the syncer actually forces them to be written.
>>> Unfortunately, it is currently implemented by doing an internal
>>> FADV_DONTNEED after each read() or write(). It would be better if it was
>>> implemented as a callback when buffers are completed.
>>
>>
>>
>> Sounds like FADV_NOREUSE should be beneficial for any log-writing program?
>> (syslogd, apache, nginx, .....)
>
> Well, it depends. If you plan on reading the log files, then using NOREUSE
> can potentially make that more expensive as the logs are more likely to be
> out of RAM when you go to read them (even if you have free memory, mostly
> because "cache" isn't perfect, at least in my experience). OTOH, pagedaemon
> (a part of the VM system) should generally pick the log pages to evict when
> needed (and I believe it might do a better job of that in 10 than it did
> previously). I think if you know that the log files are kicking more useful
> things out of RAM and you don't generally plan on reading them (note that
> things like compressing them with gzip counts as reading), then FADV_NOREUSE
> can work fine.

Well, just for reference:

I do posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE) after every log open() and I see no difference:
the only disk load during the day is log file writing, and I see Inactive memory increase steadily all day long.

This conversion of Free into Inactive stops only when there is no Free memory left, and the only way to get it back is either to remove old log files from disk or to do msync(MS_INVALIDATE) on these files.

Some rarely used mmap()ed data is pushed out of RAM, so referencing this data results in long-running disk reads.

[This should be rather expected, actually, since rev. 254304 was done for the EMC/Isilon storage division: now FreeBSD acts like a perfect file caching server :) ]
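
For anyone who wants to reproduce this, here is a minimal sketch in C of the two calls mentioned above. It is not the code from the actual logging daemon. The helper names open_log_noreuse() and drop_file_pages() are made up for illustration, and mapping the file read-only before calling msync() is an assumption about how "msync(MS_INVALIDATE) on these files" would be done in practice. Whether the pages really leave Inactive depends on the kernel, as described in this thread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* expose posix_fadvise() on strict compilers */
#endif

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a log file for appending and hint that its
 * data will not be reused. Returns the descriptor, or -1 if open() fails.
 * A failed hint is reported but is not treated as fatal.
 */
int
open_log_noreuse(const char *path)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
	if (fd < 0)
		return (-1);

	/* offset 0 and len 0 cover the whole file, as in the call above. */
	/* posix_fadvise() returns the error number; it does not set errno. */
	int error = posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);
	if (error != 0)
		fprintf(stderr, "posix_fadvise(%s): %s\n", path, strerror(error));

	return (fd);
}

/*
 * Hypothetical helper: map an existing file and ask the kernel to
 * invalidate its cached pages. Returns 0 on success, -1 on failure.
 */
int
drop_file_pages(const char *path)
{
	struct stat st;
	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);

	if (fstat(fd, &st) != 0) {
		close(fd);
		return (-1);
	}
	if (st.st_size == 0) {		/* nothing to map or invalidate */
		close(fd);
		return (0);
	}

	size_t len = (size_t)st.st_size;
	void *addr = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);			/* the mapping stays valid after close() */
	if (addr == MAP_FAILED)
		return (-1);

	/* MS_SYNC is 0 on FreeBSD, so this equals plain MS_INVALIDATE there. */
	int rc = msync(addr, len, MS_SYNC | MS_INVALIDATE);
	munmap(addr, len);
	return (rc);
}
```

A log writer would call open_log_noreuse() in place of its plain open(), and something like a cron job could run drop_file_pages() over already rotated logs. Watching the Inactive and Free lines in top(1) before and after is the easiest way to see whether either call has any effect on a given release.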