Date:      Thu, 8 Feb 2001 23:33:01 -0500 (EST)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Dag-Erling Smorgrav <des@ofug.org>
Cc:        Julian Elischer <julian@elischer.org>, Josef Karthauser <joe@tao.org.uk>, Robert Watson <rwatson@FreeBSD.ORG>, Brian Somers <brian@Awfulhak.org>, Bruce Evans <bde@zeta.org.au>, freebsd-current@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject:   Re: What's changed recently with vmware/linuxemu/file I/O
Message-ID:  <14979.29437.518299.842853@grasshopper.cs.duke.edu>
In-Reply-To: <xzp4ry5gve0.fsf@flood.ping.uio.no>
References:  <xzpsnlqmh0o.fsf@flood.ping.uio.no> <Pine.NEB.3.96L.1010207220443.19807J-100000@fledge.watson.org> <20010208113519.A789@tao.org.uk> <3A828C2C.F7CDA809@elischer.org> <xzp4ry5gve0.fsf@flood.ping.uio.no>

Dag-Erling Smorgrav writes:
 > Julian Elischer <julian@elischer.org> writes:
 > > I believe that vmware mmaps a region of memory and then somehow syncs 
 > > it to disk. (It is certainly doing something like it here).
 > 
 > Theory: VMWare mmaps a region of memory corresponding to the virtual
 > machine's "physical" RAM, then touches every page during startup.
 > Unless some form of clustering is done, this causes 16384 write
 > operations for a 64 MB virtual machine...
 > 

Pretty much.  But the issue is that this should never hit the disk
unless we're under memory pressure, because the region is mapped
MAP_NOSYNC (actually, the file is unlinked prior to the mmap(), and a
heuristic in vm_mmap() detects this and sets MAP_NOSYNC).
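
To make the scenario concrete, here is a minimal sketch of that access
pattern in C.  It is not the actual test program, nor what vmware does
verbatim: the function name touch_unlinked_mapping(), the /tmp template
and the read-then-write loop are all assumptions for illustration.
MAP_NOSYNC is passed explicitly where it exists, and defined to 0
elsewhere so the sketch still builds.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef MAP_NOSYNC
#define MAP_NOSYNC 0	/* assumption: treat as a no-op where unsupported */
#endif

/*
 * Map npages of an unlinked temporary file, read each page before
 * writing it, and return the number of pages touched (-1 on error).
 */
static long
touch_unlinked_mapping(size_t npages)
{
	char path[] = "/tmp/nosync_demo.XXXXXX";	/* hypothetical name */
	long pagesize = sysconf(_SC_PAGESIZE);
	volatile char *p;
	size_t len, i, ps;
	long touched = 0;
	int fd;

	if (pagesize <= 0 || (fd = mkstemp(path)) < 0)
		return (-1);
	ps = (size_t)pagesize;
	unlink(path);			/* unlink before the mmap() */
	len = npages * ps;
	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return (-1);
	}
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_SHARED | MAP_NOSYNC, fd, 0);
	close(fd);			/* the mapping keeps the file alive */
	if (p == MAP_FAILED)
		return (-1);
	for (i = 0; i < npages; i++) {
		char c = p[i * ps];	/* read the page first ... */
		p[i * ps] = c + 1;	/* ... then write to it */
		touched++;
	}
	munmap((void *)p, len);
	return (touched);
}
```

Calling touch_unlinked_mapping(16384) on a machine with 4 KB pages would
touch a 64 MB region, the size mentioned in the quoted theory.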

The real problem is that our MAP_NOSYNC doesn't fully work in at least
one major case.  As I understand it, the technique we use is to set
the MAP_ENTRY_NOSYNC flag in the map entry at mmap time.  On a write
fault, PG_NOSYNC is set in the page's flags.  A lazy msync will skip
PG_NOSYNC pages.

The problem comes when a page is read before it is written.  The page
gets mapped in read/write on the read fault, so the later write never
takes a write fault and the PG_NOSYNC flag never gets set.  (This
accounts for the flurry of disk i/o shortly after vmware starts.)  When
the pages get synced to disk, the vnode is locked and the application
will freeze in "vmpfw".

The following patch sets PG_NOSYNC on faults other than write faults.
This seems to work for my test program and for vmware (which I've only
tested very briefly).  Assuming that it is correct, the code around it
should be reorganized somewhat.  This is against -stable, as I don't
have any -current i386 machines.

Index: vm_fault.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_fault.c,v
retrieving revision 1.108.2.2
diff -u -r1.108.2.2 vm_fault.c
--- vm_fault.c	2000/08/04 22:31:11	1.108.2.2
+++ vm_fault.c	2001/02/08 23:04:02
@@ -804,6 +804,10 @@
 			}
 			vm_page_dirty(fs.m);
 			vm_pager_page_unswapped(fs.m);
+		} else {
+			if ((fs.entry->eflags & MAP_ENTRY_NOSYNC) && 
+			    (fs.m->dirty == 0))
+				vm_page_flag_set(fs.m, PG_NOSYNC);
 		}
 	}
 

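For anyone who wants to reason about the flag handling without reading
vm_fault.c, the following is a toy user-space model of the behavior
described above.  It is only a sketch under simplifying assumptions:
the toy_* names, the struct layout and the "patched" switch are
invented for illustration and are not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel flags; none of this is FreeBSD code. */
#define TOY_ENTRY_NOSYNC	0x1	/* models MAP_ENTRY_NOSYNC */
#define TOY_PG_NOSYNC		0x1	/* models PG_NOSYNC */

struct toy_page {
	int	flags;
	bool	mapped_rw;	/* page is already mapped read/write */
	bool	dirty;
};

/* Model a page fault.  "patched" enables the non-write-fault branch. */
static void
toy_fault(struct toy_page *m, int eflags, bool is_write, bool patched)
{
	if ((is_write || patched) &&
	    (eflags & TOY_ENTRY_NOSYNC) && !m->dirty)
		m->flags |= TOY_PG_NOSYNC;
	if (is_write)
		m->dirty = true;
	m->mapped_rw = true;	/* mapped read/write either way */
}

/* Model one access: only an unmapped page takes a fault. */
static void
toy_access(struct toy_page *m, int eflags, bool is_write, bool patched)
{
	if (!m->mapped_rw)
		toy_fault(m, eflags, is_write, patched);
	else if (is_write)
		m->dirty = true;	/* dirtied without any fault */
}

/*
 * Read, then write, every page of a NOSYNC mapping.  Return how many
 * pages a lazy sync would still flush, or (size_t)-1 on allocation
 * failure.
 */
static size_t
toy_flushed_pages(size_t npages, bool patched)
{
	struct toy_page *pages = calloc(npages, sizeof(*pages));
	size_t i, flushed = 0;

	if (pages == NULL)
		return ((size_t)-1);
	for (i = 0; i < npages; i++) {
		toy_access(&pages[i], TOY_ENTRY_NOSYNC, false, patched);
		toy_access(&pages[i], TOY_ENTRY_NOSYNC, true, patched);
	}
	for (i = 0; i < npages; i++)
		if (pages[i].dirty && !(pages[i].flags & TOY_PG_NOSYNC))
			flushed++;	/* lazy sync skips NOSYNC pages */
	free(pages);
	return (flushed);
}
```

With 16384 pages (the 64 MB figure from the quoted theory, assuming
4 KB pages), toy_flushed_pages(16384, false) reports every page as
flushable, while toy_flushed_pages(16384, true) reports none.
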

Cheers,

Drew

------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer	http://www.cs.duke.edu/~gallatin
Duke University				Email: gallatin@cs.duke.edu
Department of Computer Science		Phone: (919) 660-6590

