From: Alan Cox
Date: Wed, 12 Aug 2009 10:29:03 -0500
To: current@freebsd.org, Ulrich Spörlein
Cc: Alan Cox, Kip Macy
Subject: Re: panic: vm_page_free_toq: freeing mapped page
Message-ID: <4A82DFBF.5020101@cs.rice.edu>
In-Reply-To: <20090714105245.GR2145@acme.spoerlein.net>
References: <20090713181650.GB76464@acme.spoerlein.net> <4A5B7D24.60100@cs.rice.edu> <20090714105245.GR2145@acme.spoerlein.net>
List-Id: Discussions about the use of FreeBSD-current

Ulrich Spörlein wrote:
> On Mon, 13.07.2009 at 13:29:56 -0500, Alan Cox wrote:
>
>> Ulrich Spörlein wrote:
>>
>>> On Mon, 13.07.2009 at 19:15:03 +0200, Ulrich Spörlein wrote:
>>>
>>>> On Sun, 12.07.2009 at 14:22:23 -0700, Kip Macy wrote:
>>>>
>>>>> On Sun, Jul 12, 2009 at 1:31 PM, Ulrich Spörlein wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> 8.0 BETA1 @ r195622 will panic reliably when running the clang static
>>>>>> analyzer on a buildworld with something like the following panic:
>>>>>>
>>>>>> panic: vm_page_free_toq: freeing mapped page 0xffffff00c9715b30
>>>>>> cpuid = 1
>>>>>> KDB: stack backtrace:
>>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>>>>>> panic() at panic+0x182
>>>>>> vm_page_free_toq() at vm_page_free_toq+0x1f6
>>>>>> vm_object_terminate() at vm_object_terminate+0xb7
>>>>>> vm_object_deallocate() at vm_object_deallocate+0x17a
>>>>>> _vm_map_unlock() at _vm_map_unlock+0x70
>>>>>> vm_map_remove() at vm_map_remove+0x6f
>>>>>> vmspace_free() at vmspace_free+0x56
>>>>>> vmspace_exec() at vmspace_exec+0x56
>>>>>> exec_new_vmspace() at exec_new_vmspace+0x133
>>>>>> exec_elf32_imgact() at exec_elf32_imgact+0x2ee
>>>>>> kern_execve() at kern_execve+0x3b2
>>>>>> execve() at execve+0x3d
>>>>>> syscall() at syscall+0x1af
>>>>>> Xfast_syscall() at Xfast_syscall+0xe1
>>>>>> --- syscall (59, FreeBSD ELF64, execve), rip = 0x800c20d0c, rsp = 0x7fffffffd6f8, rbp = 0x7fffffffdbf0 ---
>>>>>>
>>>>> Can you try the following change:
>>>>>
>>>>> http://svn.freebsd.org/viewvc/base/user/kmacy/releng_7_2_fcs/sys/vm/vm_object.c?r1=192842&r2=195297
>>>>>
>>>> Applied this to HEAD by hand and ran with it, it died 20-30 minutes into
>>>> the scan-build run. So no luck there. Next up is a test using the
>>>> GENERIC kernel.
>>>>
>>> No improvement with a GENERIC kernel. Next up will be to run this with
>>> clean sysctl, loader.conf, etc. Then I'll try disabling SMP.
>>>
>>> Does the backtrace above point to any specific subsystem? I'm using UFS,
>>> ZFS and GELI on this machine and could try a few combinations...
>>>
>> The interesting thing about the backtrace is that it shows a 32-bit i386
>> executable being started on a 64-bit amd64 machine. I've seen this
>> backtrace once before, and you'll find it in the PR database. In that
>> case, the problem "went away" after the known-to-be-broken
>> ZERO_COPY_SOCKETS option was removed from the reporter's kernel
>> configuration. However, I don't see that as the culprit here.
>>
>
> Hi Alan, first the bad news
>
> I ran this test with a GENERIC kernel, SMP disabled, hw.physmem set to 2
> GB in single user mode, so no other processes or daemons running,
> nothing special in loader.conf except for ZFS and GELI. It reliably
> panics, so nothing new here.
>
> Now the good news, you may be able to crash your own amd64 box in 3
> minutes by doing:
>
> mkdir /tmp/foo && cd /tmp/foo
> fetch -o- https://www.spoerlein.net/pub/llvm-clang.tar.gz | tar xf -
> while :; do for d in bin sbin usr.bin usr.sbin; do $PWD/scan-build -o /dev/null -k make -C /usr/src/$d clean obj depend all; done; done
>
> Please note that scan-build/ccc-analyzer won't actually do anything, as
> they cannot create output in /dev/null. So this is just running the
> perl-script and forking make/sh/awk/ccc-analyzer like mad. It does not
> survive 3 minutes on my Core2 Duo 3.3 GHz.
>

Hi Ulrich,

I finally got a chance to try this workload. I'm afraid that I can't reproduce the assertion failure on my amd64 test machine. I left the test running overnight, and it was still going strong this morning. I am using neither ZFS nor GELI. Is it possible for you to repeat this test without ZFS and/or GELI?

I would also be curious if anyone else reading this message can reproduce the assertion failure with the above test.

Regards,
Alan