From owner-freebsd-hackers  Fri Aug 13  7:12:41 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
	by hub.freebsd.org (Postfix) with ESMTP id 504A914DE4
	for <freebsd-hackers@FreeBSD.ORG>; Fri, 13 Aug 1999 07:12:35 -0700 (PDT)
	(envelope-from gallatin@cs.duke.edu)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30])
	by duke.cs.duke.edu (8.9.1/8.9.1) with ESMTP id KAA08771;
	Fri, 13 Aug 1999 10:11:54 -0400 (EDT)
Received: (from gallatin@localhost)
	by grasshopper.cs.duke.edu (8.9.3/8.9.1) id KAA59765;
	Fri, 13 Aug 1999 10:11:53 -0400 (EDT)
	(envelope-from gallatin@cs.duke.edu)
From: Andrew Gallatin <gallatin@cs.duke.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Fri, 13 Aug 1999 10:11:53 -0400 (EDT)
To: freebsd-hackers@FreeBSD.ORG
Cc: dillon@backplane.com
Subject: Re: mmap bug
In-Reply-To: <19990813004107.A17205@home.com>
References: <19990812235208.A17058@home.com>
	<199908130534.PAA25953@gizmo.internode.com.au>
	<19990813004107.A17205@home.com>
X-Mailer: VM 6.43 under 20.4 "Emerald" XEmacs  Lucid
Message-ID: <14260.7869.532684.88483@grasshopper.cs.duke.edu>
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


Arun Sharma writes:
 > The daemons which are involved in freeing up pages during low memory
 > conditions qualify as system daemons. Making sure that these daemons
 > don't block avoids the deadlock.
 > 
 > 	-Arun

The second solution involves a little more than that, such as
blessing "normal" jobs just enough to allow them to get sufficient
resources to avoid a deadlock.

One instance of the mmap lockup involves a case where you've got a
single process dirtying a memory mapped file which is larger than
physical memory.  Assuming an otherwise idle system, nearly all
available memory in the system will belong to the file's object & it
will all be dirty.
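
For anyone who wants to see that access pattern, here is a minimal
user-space sketch of it.  The helper name dirty_mapped_file and its
arguments are hypothetical (they are not from any test program in this
thread); it only uses the standard open/ftruncate/mmap calls.  It maps
a file MAP_SHARED and stores one byte into every page, so every page of
the mapping ends up dirty.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of `path` MAP_SHARED and write one
 * byte to every page, so that every page of the mapping becomes dirty.
 * Returns the number of pages touched, or -1 on error.
 */
static long
dirty_mapped_file(const char *path, size_t len)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	long touched = 0;
	unsigned char *p;
	size_t off;
	int fd;

	assert(pagesize > 0);
	if (len == 0)
		return (-1);

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd == -1)
		return (-1);
	/* Extend the file so the whole mapping is backed by it. */
	if (ftruncate(fd, (off_t)len) == -1) {
		close(fd);
		return (-1);
	}
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping keeps the file referenced */
	if (p == MAP_FAILED)
		return (-1);

	/* One store per page: each store faults the page in and dirties it. */
	for (off = 0; off < len; off += (size_t)pagesize) {
		p[off] = 1;
		touched++;
	}
	munmap(p, len);
	return (touched);
}
```

Called with a small len this is harmless.  The case described above
corresponds to a len larger than physical memory on an otherwise idle
machine, which on an affected kernel may well hang the box, so only try
that on a machine you can afford to reset.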

At some point, the process will trigger a fault on a non-resident
page.  vm_fault will call vnode_pager_getpages to read in the
faulting page.  ffs_getpages (let's assume we're using ffs)
will then call ffs_read to read in the pages.  ffs_read will try to
build a cluster.  The deadlock occurs when allocbuf cannot allocate a
page for one of the pages in the cluster.  Here's a stack trace (from
a long, long time ago, May 12th):

db> tr
vm_page_alloc(caa0a074,d1d,0,c58f7ba0,1fc) at vm_page_alloc
allocbuf(c58f7ba0,2000,0,c58c4588,5) at allocbuf+0x3ae
getblk(caa0f8c0,68e,2000,0,0) at getblk+0x32e
cluster_rbuild(caa0f8c0,8000001,0,689,370b0) at cluster_rbuild+0x1df
cluster_read(caa0f8c0,8000001,0,689,2000) at cluster_read+0x2cc
ffs_read(caa12e28) at ffs_read+0x3ea
ffs_getpages(caa12e80) at ffs_getpages+0x22c
vnode_pager_getpages(caa0a074,caa12f14,1,0,c9fcdce0) at vnode_pager_getpages+0x4e
vm_fault(c9fd28c0,48df9000,3,8,c9fcdce0) at vm_fault+0x484
trap_pfault(caa12fb8,1,48df9000) at trap_pfault+0xaa
trap(2f,2f,2f,48df9000,48df9000) at trap+0x1aa
calltrap() at calltrap+0x1c

The real problem is that the pageout daemon cannot push any pages
because (nearly) all the pages available to user processes are held by
the mmap'ed object.  The killer is that they are all dirty and, because
we're in the middle of doing a cluster read, the vnode is locked, so
the pageout daemon cannot touch them.

The second solution would be to allow the faulting process to dip into
the system reserves just far enough for the vm_page_alloc to succeed,
which will allow the cluster read to complete.  This will avoid deadlock.
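
To make the reserve idea concrete, here is a toy user-space model of
it.  This is only a sketch under my own assumptions: the names
toy_page_pool, toy_page_alloc, TOY_ALLOC_NORMAL and TOY_ALLOC_RESERVE
are made up for illustration and are not the kernel's allocator or its
flags.  A normal request fails once only the reserve is left, while a
"blessed" request may take pages out of the reserve.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request classes; not the kernel's actual flags. */
enum toy_alloc_class {
	TOY_ALLOC_NORMAL,	/* must leave the reserve untouched */
	TOY_ALLOC_RESERVE	/* may dip into the reserve */
};

struct toy_page_pool {
	size_t free_pages;	/* pages currently free */
	size_t reserve;		/* pages held back for the system */
};

/*
 * Take one page from the pool.  A normal request fails once only the
 * reserve is left; a "blessed" request fails only when the pool is empty.
 */
static bool
toy_page_alloc(struct toy_page_pool *pool, enum toy_alloc_class cls)
{
	size_t floor;

	assert(pool != NULL);
	floor = (cls == TOY_ALLOC_RESERVE) ? 0 : pool->reserve;
	if (pool->free_pages <= floor)
		return (false);
	pool->free_pages--;
	return (true);
}
```

In this model the faulting process in the middle of the cluster read
would ask with TOY_ALLOC_RESERVE instead of TOY_ALLOC_NORMAL, so its
allocation succeeds even though ordinary requests are being refused.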

I personally think the first solution (always taking write faults)
would be far, far better.  This would allow the system to avoid
getting anywhere near a deadlock situation & to remain responsive.

I'm afraid that if we go with the second solution, the system would be
unresponsive until the cluster read completed & the pageout daemon was
able to begin flushing the dirty pages in the offending object.

------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer	http://www.cs.duke.edu/~gallatin
Duke University				Email: gallatin@cs.duke.edu
Department of Computer Science		Phone: (919) 660-6590

