From owner-freebsd-hackers Fri Aug 13 7:12:41 1999
From: Andrew Gallatin
Date: Fri, 13 Aug 1999 10:11:53 -0400 (EDT)
To: freebsd-hackers@FreeBSD.ORG
Cc: dillon@backplane.com
Subject: Re: mmap bug
In-Reply-To: <19990813004107.A17205@home.com>
References: <19990812235208.A17058@home.com> <199908130534.PAA25953@gizmo.internode.com.au> <19990813004107.A17205@home.com>
Message-ID: <14260.7869.532684.88483@grasshopper.cs.duke.edu>

Arun Sharma writes:
 > The daemons which are involved in freeing up pages during low memory
 > conditions qualify as system daemons. Making sure that these daemons
 > don't block avoids the deadlock.
 >
 > -Arun

The second solution involves a little more than that, such as blessing
"normal" jobs just enough to allow them to get sufficient resources to
avoid a deadlock.

One instance of the mmap lockup involves a case where you've got a
single process dirtying a memory mapped file which is larger than
physical memory. Assuming an otherwise idle system, nearly all
available memory in the system will belong to the file's object & it
will all be dirty. At some point, the process will trigger a fault on
a non-resident page.
vm_fault will call vnode_pager_getpages to read in the faulting page.
ffs_getpages (let's assume we're using ffs) will then call ffs_read to
read in the pages. ffs_read will try to build a cluster. The deadlock
occurs when allocbuf cannot allocate a page for one of the pages in
the cluster. Here's a stack trace (from a long, long time ago, May
12th):

db> tr
vm_page_alloc(caa0a074,d1d,0,c58f7ba0,1fc) at vm_page_alloc
allocbuf(c58f7ba0,2000,0,c58c4588,5) at allocbuf+0x3ae
getblk(caa0f8c0,68e,2000,0,0) at getblk+0x32e
cluster_rbuild(caa0f8c0,8000001,0,689,370b0) at cluster_rbuild+0x1df
cluster_read(caa0f8c0,8000001,0,689,2000) at cluster_read+0x2cc
ffs_read(caa12e28) at ffs_read+0x3ea
ffs_getpages(caa12e80) at ffs_getpages+0x22c
vnode_pager_getpages(caa0a074,caa12f14,1,0,c9fcdce0) at vnode_pager_getpages+0x4e
vm_fault(c9fd28c0,48df9000,3,8,c9fcdce0) at vm_fault+0x484
trap_pfault(caa12fb8,1,48df9000) at trap_pfault+0xaa
trap(2f,2f,2f,48df9000,48df9000) at trap+0x1aa
calltrap() at calltrap+0x1c

The real problem is that the pageout daemon cannot push any pages
because (nearly) all the pages available to user processes are held by
the mmap'ed object. The killer is that they are all dirty & that,
because we're in the middle of doing a cluster read, the vnode is
locked, so the pageout daemon cannot touch them.

A solution would be allowing the faulting process to dip into the
system reserves enough so that the vm_page_alloc will succeed, which
will allow the cluster read to complete. This will avoid deadlock.

I personally think the first solution (always taking write faults)
would be far, far better. This would allow the system to avoid getting
anywhere near a deadlock situation & to remain responsive. I'm afraid
that if we go with the second solution, the system would be
unresponsive until the cluster read completed & the pageout daemon was
able to begin flushing the dirty pages in the offending object.
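
To make the second solution (dipping into the reserves) a bit more
concrete, here is a rough user-space sketch of the idea. It is a toy
model, not kernel code: struct page_pool, ALLOC_NORMAL, ALLOC_BLESSED,
pool_alloc and cluster_alloc are all names made up for illustration,
and the real vm_page_alloc path is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the free-page pool. Every name here is hypothetical. */
struct page_pool {
    size_t free_pages;    /* pages currently free */
    size_t reserve_pages; /* floor that ordinary requests must leave intact */
};

enum alloc_class {
    ALLOC_NORMAL,  /* ordinary request: may not touch the reserve */
    ALLOC_BLESSED  /* faulting process mid-cluster-read: may use the reserve */
};

/* Take one page if this class of request is allowed to succeed. */
static bool
pool_alloc(struct page_pool *pool, enum alloc_class cls)
{
    size_t limit;

    assert(pool != NULL);
    limit = (cls == ALLOC_BLESSED) ? 0 : pool->reserve_pages;
    if (pool->free_pages <= limit)
        return false; /* a real caller would sleep waiting for the pageout daemon */
    pool->free_pages--;
    return true;
}

/*
 * Try to get every page of a read cluster; returns how many were obtained.
 * In the deadlock case a normal request returns short and waits on a
 * pageout daemon that cannot make progress because the vnode is locked.
 */
static size_t
cluster_alloc(struct page_pool *pool, size_t npages, enum alloc_class cls)
{
    size_t got = 0;

    while (got < npages && pool_alloc(pool, cls))
        got++;
    return got;
}
```

With four free pages, all of them inside a four-page reserve, a normal
request for a four-page cluster gets nothing, while a blessed request
gets all four and the read can complete. The cost is visible in the
same model: the reserve is drained to zero until the pageout daemon can
flush the object, which is the unresponsiveness I'm worried about.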
------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer	http://www.cs.duke.edu/~gallatin
Duke University				Email: gallatin@cs.duke.edu
Department of Computer Science		Phone: (919) 660-6590