From owner-freebsd-current@FreeBSD.ORG Thu Aug 13 14:54:06 2009
Date: Thu, 13 Aug 2009 16:27:23 +0200
From: Peter Holm
To: Alan Cox, current@freebsd.org, Kip Macy
Message-ID: <20090813142723.GA62890@x2.osted.lan>
References: <20090713181650.GB76464@acme.spoerlein.net> <4A5B7D24.60100@cs.rice.edu> <20090714105245.GR2145@acme.spoerlein.net> <4A82DFBF.5020101@cs.rice.edu> <20090813132907.GA1591@roadrunner.spoerlein.net>
In-Reply-To: <20090813132907.GA1591@roadrunner.spoerlein.net>
Subject: Re: panic: vm_page_free_toq: freeing mapped page
List-Id: Discussions about the use of FreeBSD-current

On Thu, Aug 13, 2009 at 03:29:07PM +0200, Ulrich
Spörlein wrote:
> On Wed, 12.08.2009 at 10:29:03 -0500, Alan Cox wrote:
> > Ulrich Spörlein wrote:
> > > On Mon, 13.07.2009 at 13:29:56 -0500, Alan Cox wrote:
> > >
> > >> Ulrich Spörlein wrote:
> > >>
> > >>> On Mon, 13.07.2009 at 19:15:03 +0200, Ulrich Spörlein wrote:
> > >>>
> > >>>> On Sun, 12.07.2009 at 14:22:23 -0700, Kip Macy wrote:
> > >>>>
> > >>>>> On Sun, Jul 12, 2009 at 1:31 PM, Ulrich Spörlein wrote:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>> 8.0 BETA1 @ r195622 will panic reliably when running the clang static
> > >>>>>> analyzer on a buildworld with something like the following panic:
> > >>>>>>
> > >>>>>> panic: vm_page_free_toq: freeing mapped page 0xffffff00c9715b30
> > >>>>>> cpuid = 1
> > >>>>>> KDB: stack backtrace:
> > >>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > >>>>>> panic() at panic+0x182
> > >>>>>> vm_page_free_toq() at vm_page_free_toq+0x1f6
> > >>>>>> vm_object_terminate() at vm_object_terminate+0xb7
> > >>>>>> vm_object_deallocate() at vm_object_deallocate+0x17a
> > >>>>>> _vm_map_unlock() at _vm_map_unlock+0x70
> > >>>>>> vm_map_remove() at vm_map_remove+0x6f
> > >>>>>> vmspace_free() at vmspace_free+0x56
> > >>>>>> vmspace_exec() at vmspace_exec+0x56
> > >>>>>> exec_new_vmspace() at exec_new_vmspace+0x133
> > >>>>>> exec_elf32_imgact() at exec_elf32_imgact+0x2ee
> > >>>>>> kern_execve() at kern_execve+0x3b2
> > >>>>>> execve() at execve+0x3d
> > >>>>>> syscall() at syscall+0x1af
> > >>>>>> Xfast_syscall() at Xfast_syscall+0xe1
> > >>>>>> --- syscall (59, FreeBSD ELF64, execve), rip = 0x800c20d0c, rsp = 0x7fffffffd6f8, rbp = 0x7fffffffdbf0 ---
> > >>>>>>
> > >>>>> Can you try the following change:
> > >>>>>
> > >>>>> http://svn.freebsd.org/viewvc/base/user/kmacy/releng_7_2_fcs/sys/vm/vm_object.c?r1=192842&r2=195297
> > >>>>>
> > >>>> Applied this to HEAD by hand and ran with it, it died 20-30 minutes into
> > >>>> the scan-build run. So no luck there. Next up is a test using the
> > >>>> GENERIC kernel.
> > >>>>
> > >>> No improvement with a GENERIC kernel. Next up will be to run this with
> > >>> clean sysctl, loader.conf, etc. Then I'll try disabling SMP.
> > >>>
> > >>> Does the backtrace above point to any specific subsystem? I'm using UFS,
> > >>> ZFS and GELI on this machine and could try a few combinations...
> > >>>
> > >> The interesting thing about the backtrace is that it shows a 32-bit i386
> > >> executable being started on a 64-bit amd64 machine. I've seen this
> > >> backtrace once before, and you'll find it in the PR database. In that
> > >> case, the problem "went away" after the known-to-be-broken
> > >> ZERO_COPY_SOCKETS option was removed from the reporter's kernel
> > >> configuration. However, I don't see that as the culprit here.
> > >>
> > >
> > > Hi Alan, first the bad news
> > >
> > > I ran this test with a GENERIC kernel, SMP disabled, hw.physmem set to 2
> > > GB in single user mode, so no other processes or daemons running,
> > > nothing special in loader.conf except for ZFS and GELI. It reliably
> > > panics, so nothing new here.
> > >
> > > Now the good news, you may be able to crash your own amd64 box in 3
> > > minutes by doing:
> > >
> > > mkdir /tmp/foo && cd /tmp/foo
> > > fetch -o- https://www.spoerlein.net/pub/llvm-clang.tar.gz | tar xf -
> > > while :; do for d in bin sbin usr.bin usr.sbin; do $PWD/scan-build -o /dev/null -k make -C /usr/src/$d clean obj depend all; done; done
> > >
> > > Please note that scan-build/ccc-analyzer won't actually do anything, as
> > > they cannot create output in /dev/null. So this is just running the
> > > perl-script and forking make/sh/awk/ccc-analyzer like mad. It does not
> > > survive 3 minutes on my Core2 Duo 3.3 GHz.
> > >
> >
> > Hi Ulrich,
> >
> > I finally got a chance to try this workload. I'm afraid that I can't
> > reproduce the assertion failure on my amd64 test machine. I left the
> > test running overnight, and it was still going strong this morning.
> >
> > I am using neither ZFS nor GELI. Is it possible for you to repeat this
> > test without ZFS and/or GELI?
> >
> > I would also be curious if anyone else reading this message can
> > reproduce the assertion failure with the above test.
>
> Now isn't this great :/
>
> I haven't tracked the bug for the last couple of weeks, but the system
> was updated to recent HEAD and got its ports rebuilt (several times).
>
> I don't know which change "fixed" it, but I think it was the perl
> rebuild (I had some trouble with perl5.10 on 8.0 at first). Besides, the
> process doing the fork in the backtrace was always the perl binary,
> IIRC.
>
> So right now I'm no longer able to reproduce it myself ...
>
> Regards,
> Uli

Using your test scenario I got the panic.

- Peter