FreeBSD Mail Archives

Date:      Fri, 22 Feb 2002 12:42:04 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Andrew Mobbs <andrewm@chiark.greenend.org.uk>
Cc:        hackers@FreeBSD.ORG
Subject:   Re2: msync performance
Message-ID:  <200202222042.g1MKg4u22700@apollo.backplane.com>
References:   <15478.31998.459219.178549@chiark.greenend.org.uk>




:I recently raised PR 35195
:
:Details are in the PR, but in summary; performing a large amount of
:random IO to a large file through mmap, on a machine with a fair amount
:of free RAM, will cause a following msync to take a significant amount
:of time.
:
:I believe this is because msync walks the dirty buffer list by age,
:and therefore will write blocks out in an order liable to cause a lot of
:disk seeks.
:
:My suggestion for a solution would be before starting the IO, to sort
:the dirty buffer list by location on logical disk, and coalesce
:adjacent blocks where possible.
:
:Before I volunteer to implement something like this, please could
:somebody check I'm correct in my analysis, and comment on the
:feasibility of my suggested solution.
:
:Thanks,
:
:-- 
:Andrew Mobbs - http://www.chiark.greenend.org.uk/~andrewm/

    I've looked at this some more.  I can fairly trivially improve
    sequential write efficiency if msync() is called on a range
    of dirty pages, and I can use the same code when msync() is
    called on a complete file *IF* the file is fairly small
    (no more than a hundred pages or so).

    But we have a serious problem when msync() is called on a
    very large file that may only contain a few dirty pages.
    For example, if you have a 20GB file and you are mmap()ing
    portions of it, we can't iterate through the file offsets
    sequentially without eating an enormous amount of CPU
    (as in several seconds' worth of CPU or even several minutes).
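
    To put a rough number on the scale involved, here is a small
    sketch that counts how many page-sized offsets a walk over the
    whole file would have to visit.  The 4K page size and the helper
    name page_offsets_in_file() are assumptions made for
    illustration; they are not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of page-sized offsets needed to cover
 * a file of file_bytes bytes, rounding a partial last page up.
 */
uint64_t
page_offsets_in_file(uint64_t file_bytes, uint64_t page_size)
{
    assert(page_size > 0);
    return file_bytes / page_size + (file_bytes % page_size != 0);
}
```

    With a 20GB file (20 * 2^30 bytes) and an assumed 4K page size
    this gives 5242880 offsets, each of which would need its own
    page lookup even if only a handful of pages are actually dirty.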

    In this case we have to scan the object page list, which is
    not sorted.  Even so the existing msync() code *DOES*
    cluster pages together into 64K chunks (though I notice that it
    does not appear to cluster the raw I/O).

    So, this falls back to your suggested solution: sort
    object->memq (it's the actual page queue that is the problem,
    not the object queue).  Looking at it some more, I believe
    this may be a viable solution.  I am going to work something
    up.
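
    The general idea of sorting by offset and then clustering can be
    sketched in user-space C as below.  This is only an illustration
    under stated assumptions, not the kernel change itself: struct
    dirty_page, struct io_run and sort_and_cluster() are hypothetical
    names, the page list is modelled as a plain array rather than
    object->memq, and the 16-page cap assumes 4K pages so that a run
    never exceeds the 64K cluster size mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_CLUSTER_PAGES 16    /* 64K cluster / assumed 4K page size */

/* Hypothetical stand-in for a dirty page: just its page index in the file. */
struct dirty_page {
    uint64_t pindex;
};

/* A run of consecutive page indices that could be written as one I/O. */
struct io_run {
    uint64_t first;     /* first page index in the run */
    size_t   count;     /* number of consecutive pages */
};

static int
cmp_pindex(const void *a, const void *b)
{
    uint64_t x = ((const struct dirty_page *)a)->pindex;
    uint64_t y = ((const struct dirty_page *)b)->pindex;

    return (x > y) - (x < y);
}

/*
 * Sort the pages by file offset, then merge adjacent indices into runs
 * of at most MAX_CLUSTER_PAGES pages.  Duplicate indices are skipped.
 * 'runs' must have room for n entries.  Returns the number of runs.
 */
size_t
sort_and_cluster(struct dirty_page *pages, size_t n, struct io_run *runs)
{
    size_t nruns = 0;

    if (n == 0)
        return 0;
    assert(pages != NULL && runs != NULL);

    qsort(pages, n, sizeof(pages[0]), cmp_pindex);

    for (size_t i = 0; i < n; i++) {
        if (nruns > 0) {
            struct io_run *last = &runs[nruns - 1];
            uint64_t next = last->first + last->count;

            if (pages[i].pindex == next - 1)
                continue;               /* same page listed twice */
            if (pages[i].pindex == next &&
                last->count < MAX_CLUSTER_PAGES) {
                last->count++;          /* extend the current run */
                continue;
            }
        }
        runs[nruns].first = pages[i].pindex;    /* start a new run */
        runs[nruns].count = 1;
        nruns++;
    }
    return nruns;
}
```

    Given dirty pages at indices 40, 3, 41, 1, 2, 42 in that order,
    the sketch yields two runs, pages 1-3 and pages 40-42, so the
    writes go out in ascending offset order instead of list order.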

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

