From owner-freebsd-current Sun Aug 16 07:32:49 1998
From: Terry Lambert
Message-Id: <199808161431.HAA16709@usr08.primenet.com>
Subject: Re: Better VM patches (was Tentative fix for VM bug)
To: dg@root.com
Date: Sun, 16 Aug 1998 14:31:59 +0000 (GMT)
Cc: tlambert@primenet.com, current@FreeBSD.ORG, karl@mcs.net
In-Reply-To: <199808161315.GAA00184@implode.root.com> from "David Greenman" at Aug 16, 98 06:15:30 am

> >John also suggested another change for detecting a case where page
> >orphans can be created. in vm_page.c, it's possible that an object
> >entry in the page insert function will overwrite an existing entry;
> >I added a DIAGNOSTIC panic to catch this when it happens.
>
> You mean that there can be multiple pages at the same offset? It would
> be bad if that happend, but I'm skeptical that it actually does.

To elaborate: I think that the behaviour I am seeing in the "mmap'ed
file contents, on a page boundary, written to another file" case is
the result of a single page being in multiple maps.

The code that I have provided so far fails to address this bug.

I think a rather elaborate diagnostic (which I am constructing on my
home machine) is about the only chance of detecting something like
this; each page allocation and deallocation must be resource tracked,
at great expense.

A bug like this is really the only explanation for non-zeroed page
contents from one file appearing in another file; there are other
symptoms, but they are attributable to other bugs. This is the only
bug that satisfies this case and all the others, all at the same time.

Left with only the improbable, however unlikely... well, Arthur Conan
Doyle fans know the rest...

Initially, I thought this was 486dx4 specific (per Julian); however,
recent information has come to light that indicates that this is a
general problem (i.e., it is not a noise problem on my hardware); that
is, someone has reproduced the problem on Cyrix and Pentium hardware.

Right now, I am concentrating on the mmap code (the password file is
mmap'ed, and that is generally what shows up in the crontab as the
corrupt page contents) and the TLB shootdown code (which could also
account for the problem, though it would take a really bizarre set of
circumstances to lead to this...).

Basically, it's code review time... kill the obvious races and see
what's left. 8-(.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
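
For illustration only, here is a minimal user-space sketch of the kind of
per-page resource tracking described above. It is not the diagnostic under
construction and not FreeBSD kernel code: the table size, the names
(track_init, track_insert, track_remove), and the integer object ids are all
hypothetical. It flags the two conditions discussed in this thread: one page
owned by more than one mapping, and two pages inserted at the same
(object, offset).

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES   64     /* hypothetical: size of the tracked page pool */
#define NO_OWNER (-1)   /* marks a page that is currently free */

struct page_track {
    int  object;        /* owning object id, or NO_OWNER if free */
    long offset;        /* offset within the owning object */
};

static struct page_track track[NPAGES];

/* Mark every page as free. Must be called once before tracking. */
void track_init(void)
{
    for (size_t i = 0; i < NPAGES; i++) {
        track[i].object = NO_OWNER;
        track[i].offset = 0;
    }
}

/*
 * Record that `page` now backs (object, offset).
 * Returns  0 on success,
 *         -1 if the page already has an owner (one page in several maps),
 *         -2 if another page already sits at (object, offset).
 */
int track_insert(size_t page, int object, long offset)
{
    assert(page < NPAGES && object != NO_OWNER);
    if (track[page].object != NO_OWNER)
        return -1;
    for (size_t i = 0; i < NPAGES; i++)
        if (track[i].object == object && track[i].offset == offset)
            return -2;
    track[page].object = object;
    track[page].offset = offset;
    return 0;
}

/* Record that `page` was freed. Returns -1 on a double free, else 0. */
int track_remove(size_t page)
{
    assert(page < NPAGES);
    if (track[page].object == NO_OWNER)
        return -1;
    track[page].object = NO_OWNER;
    return 0;
}
```

The linear scan in track_insert is deliberately naive; it shows why tracking
every allocation and deallocation is expensive. A kernel version would
presumably panic under DIAGNOSTIC rather than return an error code.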