From: Terry Lambert
Date: Tue, 12 Mar 2002 13:16:45 -0800
To: "Clark C . Evans"
Cc: freebsd-hackers@freebsd.org
Subject: Re: panic: pmap_enter
Message-ID: <3C8E703D.430CEC4E@mindspring.com>

"Clark C . Evans" wrote:
> | > It seems to me that you are showing only the last part of the trace,
> | > which shows where a second panic occurred. While that may also be an
> | > issue the real reason for the panic occurred earlier. Please post the
> | > complete trace.
>
> Thank you Mike, I'll do my best to duplicate it, I'm not an
> expert. It seems that I have the problem (panic) when ever
> a program core-dumps. But that said, I'm getting core dumps
> fairly easily... is the memory file system stable?
>
> | You faulted on a 4M page mapping for which backing store was
> | not assigned.
>
> By "backing store" you mean "swap"? I don't have a swap space,
> although I do have 1GB memory and I'm not using much memory.

No, I mean allocated kernel memory pages.
As I said, 4M pages are not permitted to be swapped, so they should
never go through the swap code. Because of this, a fault indicating
that there is not a page present behind some of the memory is fatal.

> | You are not permitted to create 4M pages without assigned
> | backing store (basically, you can't page them in and out).
>
> Ok. This is swap related... I'm running with just a read-only
> CD-ROM and a MFS. Must I have a swap? How do I tell FreeBSD
> not to use a swap? I have /var and /tmp as a MFS.

Right now, there are two possibilities; one is extremely ugly.

As I told you before:

	Add DISABLE_PSE to your config file and try it again and tell
	us what happens.

This will actually diagnose which of the two is the problem.

Basically, your posted Python program and the fdisk program down in
/usr/src/sbin/i386/fdisk do nothing to exercise the 4M page path (i.e.
they don't mmap a device).

It would be useful to know the virtual address of the panic, to know
whether it was because of a physical page backing for one of the
MFSes, or for the kernel.

Personally, I don't use the MFS code enough, and have no idea which
version of it you are using anyway, to know whether or not it tries
to use 4M pages (basically, if the allocation is on a 4M boundary in
KVA space, and goes for at least 4M, then it's possible; that's 4M
+/- 4M of space to trigger use of 4M pages).

From my reading of the non-kernel 4M page mapping, it seems to me
that it's not possible to end up without backing store in the 4M
mapping case for devices, which is basically the only other place
that 4M pages get invoked.

So my guess is that you are exhausting memory, and so your reference
is causing it to blow up.

If DISABLE_PSE fixes your problem (by making the system use only 4K
pages), then it's most likely that you are running into a kernel
image that's smaller than 4M being used with a 4M mapping, so the
memory at the end of the 4M page is not given physical pages as
backing for the mapping (i.e.
the mapping is bogus), and when you go to access the memory, it
explodes.

There are also some subtle bugs in the AMD and Intel CPUs having to
do with 4M pages. Disabling the page size extension with DISABLE_PSE
will take your code out of the running as the cause of this problem,
so you should try this first.

If you are personally using 4M pages in the kernel, and getting
panics as a result, then you probably don't know what you are doing
with the 4M page allocations, and need to back out that code until
you do understand.

In one of the postings someone (maybe you?) suggested that they were
having panics of this sort on an SMP system. The use of 4M pages in
the presence of MESI cache coherency as implemented by Intel and AMD
is particularly problematic. Again, DISABLE_PSE would be a good
diagnostic.

In the worst case, you may just be running code from the small period
of time when Peter Wemm had reordered some of the assembly code to do
some needed cleanup, and tickled the Intel/AMD 4M page bugs, and had
to back it out. If you are running -current, make sure you are
running recent code, and not some old "stable snapshot of 5.0" that
could contain these bugs (and not be as stable as you thought it was
at the time you picked the date as your snapshot date).

So... Again: add DISABLE_PSE to your config, and tell us if that
fixes the problem for you.

-- Terry
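
For reference, a minimal sketch of the config change suggested above,
assuming a custom i386 kernel configuration file. The file name
MYKERNEL and its location are hypothetical; only the DISABLE_PSE
option itself comes from the message.

```
# Hypothetical custom kernel config, e.g. /usr/src/sys/i386/conf/MYKERNEL
# (assumed name and location; use your own config file)

# Disable the page size extension so that only 4K pages are used,
# not 4M pages
options 	DISABLE_PSE
```

The option only takes effect once the kernel has been rebuilt from
this config, installed, and booted.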