Date: Mon, 6 Mar 1995 11:17:46 -0500
From: starkhome!gene@sbstark.cs.sunysb.edu (Gene Stark)
To: davidg@Root.COM
Cc: current@FreeBSD.org, dyson@Root.COM
Subject: Page fault panics during make world in -current
Message-ID: <199503061617.LAA04199@starkhome.cs.sunysb.edu>
In-Reply-To: David Greenman's message of Mon, 06 Mar 1995 07:34:32 -0800 <199503061534.HAA00614@corbin.Root.COM>

> The code in vfs_bio.c is quite complex. John and I have each gone through
> this several times trying to find problems like you've mentioned. We're pretty
> sure that the page in question is always made 'busy' or 'bmapped' before any
> calls to VM_WAIT (or any other sleep) could otherwise lose the page. I'm not
> saying that we might not have missed something...but we have looked at this
> specific potential problem more than once. The object itself can't go away
> because a reference is held to it.

OK, I understand, but the current instability of the system seems to
indicate some sort of subtle problem, so I figure having a fresh eye take
a look at the code might stand a chance of finding something. I hope
you'll pardon me if I "find" stuff that isn't a problem, as the
assumptions/invariants, etc. that are inherent in this code take awhile
to flesh out by reading the code over and over.

I am still concerned about line 1046 of vfs_bio.c, though. At line 1031,
m is determined to be either invalid or busy. At line 1046 there is a
possibility of sleeping in the VM_WAIT. If m is invalid, then I don't
think there is anything stopping a pager from replacing m in the object
with another page during the sleep, so that when we wake up again, m
isn't a reference to the proper page in this object any more. If m was
busy, of course, this can't happen, because the pagers respect the busy
flags and don't replace the pages in this case.

I have the feeling a good test to exercise some of these potential
problems would be to mmap() a file, then start accessing it via the
mapped addresses, concurrently with another process that repeatedly
truncates and rewrites it. Do you have a test like this?

					- Gene