From owner-freebsd-current Sat Jul 11 19:11:49 1998
From: Terry Lambert
Message-Id: <199807120211.TAA29647@usr08.primenet.com>
Subject: Re: Arrgh ! resubscribing again again again....
To: dg@root.com
Date: Sun, 12 Jul 1998 02:11:36 +0000 (GMT)
Cc: tlambert@primenet.com, current@FreeBSD.ORG
In-Reply-To: <199807120115.SAA28466@implode.root.com> from "David Greenman" at Jul 11, 98 06:15:45 pm

> >I see in the VM code where a SIGKILL could result, but it seems to me that
> >the page table entry exists, it just doesn't have pages to back it, and
> >when the page to back the entry fails allocation, you get SIGSEGV, since
> >it isn't mapped when you do the reference.
> >
> >Am I reading this code wrong?
>
>    Yes, you are reading the code wrong. A SIGSEGV will only occur when there
> is no mapping. If this happened for you, then either there was no mapping (a
> programmatic error), or there is a bug in the kernel.
>    In fact, one of the VM system test programs that John and I used frequently
> is called "testswap", which does something similar to that suggested above; it
> never exited with SIGSEGV in the past. Source attached.

[ ... code elided ... ]

This code doesn't use a shared memory segment, it uses heap memory.

I believe the failure is specific to shared memory and/or mmap.

By using dbm, which mmap's its files, I can read the clean pages out
of the database file backing the object.

Then I swap the system heavily, causing the page to be LRU'ed out
from the vnode that is backing the object -- but *not* dissociated
from the process address space. I keep the database open a very long
time. This is typical behaviour for some types of programs that use
the password file and don't explicitly call endpwent.

Then the page gets marked dirty by another process.

Then I write the page (modify the database), and the page gets
written to the wrong file.

I am able to get it to fairly consistently corrupt crontab by
running cron and having it do something (newsyslog, in my test
case) once a minute. It is almost always part of the password dbm
contents that gets written to the crontab.

This example is just to show that there are bugs in the mmap code.
Unfortunately, this is not a set of test programs, it's a production
system that behaves this way, fairly reliably.

Now with a second test case, I can map a very large file, and then
rotor through all the pages except one, constantly.

I run something else, which grabs and sbrk's back memory, sleeping
20 seconds between iterations, one page more each time, touching
the memory that it sbrk's in before giving it back.

Once every 20 rotors in the first program, I touch the page I
skipped, causing it to be dirty, and be written. I *only* write
the page, and I *only* write the page with a page worth of data
on a page boundary, so there is no read-before-write.

Eventually, the program doing the sbrk's SIGSEGV's (signal 11, logged
to the console).
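The original programs are not included in this message, so the following is only a minimal C sketch of the two loops described above, written under stated assumptions. The helper names `map_file`, `rotor`, and `grow_touch_release` are hypothetical. The sketch assumes the skipped page is dirtied by storing a full page through the shared mapping; the message does not say whether the original used the mapping or write(2) for that step.

```c
#define _DEFAULT_SOURCE /* exposes sbrk() under strict glibc modes */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map an existing file read/write and shared.
 * Returns the base address and stores the length, or NULL on failure. */
static unsigned char *map_file(const char *path, size_t *len)
{
    struct stat st;
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return p;
}

/* Hypothetical helper: on each sweep, read one byte from every page
 * except `skip`. On every 20th sweep, overwrite the skipped page with
 * exactly one page of data starting on its page boundary (assumption:
 * done through the mapping). Returns how many times it was written. */
static unsigned rotor(unsigned char *base, size_t npages, size_t pagesz,
                      size_t skip, unsigned sweeps)
{
    volatile unsigned char sink = 0; /* keeps the reads from being elided */
    unsigned writes = 0;

    assert(skip < npages);
    for (unsigned s = 1; s <= sweeps; s++) {
        for (size_t p = 0; p < npages; p++) {
            if (p == skip)
                continue;
            sink ^= ((volatile unsigned char *)base)[p * pagesz];
        }
        if (s % 20 == 0) {
            memset(base + skip * pagesz, (int)(s & 0xff), pagesz);
            writes++;
        }
    }
    (void)sink;
    return writes;
}

/* Hypothetical helper: grow the heap by npages with sbrk(), touch each
 * new page, then give the memory back. Returns 0 on success, -1 on error. */
static int grow_touch_release(size_t npages, size_t pagesz)
{
    size_t bytes = npages * pagesz;
    unsigned char *p = sbrk((intptr_t)bytes);
    if (p == (void *)-1)
        return -1;
    for (size_t i = 0; i < npages; i++)
        p[i * pagesz] = 1; /* touch the page so it is really allocated */
    if (sbrk(-(intptr_t)bytes) == (void *)-1)
        return -1;
    return 0;
}
```

To approximate the test, one process would call `map_file` on the large file and then `rotor` in an endless loop, while a second process calls `grow_touch_release(n, pagesz)` with n = 1, 2, 3, ... and a `sleep(20)` between calls. Whether this reproduces the SIGSEGV on a given kernel is not something the sketch can promise.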
It's not logical, given the code, but it happens. I suspect that
the page is marked as being in core, but isn't, because it has been
improperly reused out from under it (i.e., there get to be two
mappings, and the page is written out as a result of a write, and
having been written is discarded, leaving the other page mapping
hanging).

I first noticed this problem in 2.2.6-stable.

I'm at work right now; the code is pretty obvious, but I can send it
to you when I get home, if you want. You will need enough disk to hold
the large mapped file, which the shell script creates via dd of
/dev/zero. I generally run this on a 16M system with 48M of swap,
making the file approximately 100M; if you have more than this, you
will need a bigger file; the point is to cause thrashing.

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.