Date: Fri, 22 Feb 2002 11:07:09 -0800 (PST)
From: Matthew Dillon
Message-Id: <200202221907.g1MJ79S18449@apollo.backplane.com>
To: Andrew Mobbs
Cc: hackers@FreeBSD.ORG
Subject: Re: msync performance
References: <15478.31998.459219.178549@chiark.greenend.org.uk>

:I recently raised PR 35195
:
:Details are in the PR, but in summary: performing a large amount of
:random I/O to a large file through mmap, on a machine with a fair amount
:of free RAM, will cause a following msync to take a significant amount
:of time.
:
:I believe this is because msync walks the dirty buffer list by age, and
:therefore will write blocks out in an order liable to cause a lot of
:disk seeks.
:
:My suggestion for a solution would be, before starting the I/O, to sort
:the dirty buffer list by location on the logical disk, and coalesce
:adjacent blocks where possible.
:
:Before I volunteer to implement something like this, could somebody
:please check that I'm correct in my analysis, and comment on the
:feasibility of my suggested solution.
:
:Thanks,
:
:--
:Andrew Mobbs - http://www.chiark.greenend.org.uk/~andrewm/

The problem is typically that the backing file was created through
ftruncate() and thus has no filesystem blocks allocated to it. Unless
you manage to dirty the entire file via your mmap in fairly short
order, the syncer will come around every so often and try to msync it.
Typically this means that only some of the pages are dirty; those get
filesystem blocks allocated to them, but a whole lot more pages remain
that have not yet been touched. The next syncer round will pick up more
of these pages, but their filesystem blocks will be completely
fragmented relative to those from the first syncer round. And so forth.
Additionally, memory pressure may force the pageout daemon to flush
pages to the file, which occurs more or less randomly. The result is a
massively fragmented file. Once you have a file that is that badly
fragmented, it won't matter whether you sort the blocks or not.

The solution is not to use ftruncate() to create such files, but
instead to pre-create the file by filling it with zeros.

						-Matt
						Matthew Dillon

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message