Date: Wed, 12 Aug 2009 10:29:03 -0500
From: Alan Cox <alc@cs.rice.edu>
To: current@freebsd.org, Ulrich Spörlein <uqs@spoerlein.net>
Cc: Alan Cox <alc@cs.rice.edu>, Kip Macy <kmacy@freebsd.org>
Subject: Re: panic: vm_page_free_toq: freeing mapped page
Message-ID: <4A82DFBF.5020101@cs.rice.edu>
In-Reply-To: <20090714105245.GR2145@acme.spoerlein.net>
References: <20090713181650.GB76464@acme.spoerlein.net> <4A5B7D24.60100@cs.rice.edu> <20090714105245.GR2145@acme.spoerlein.net>
Ulrich Spörlein wrote:
> On Mon, 13.07.2009 at 13:29:56 -0500, Alan Cox wrote:
>
>> Ulrich Spörlein wrote:
>>
>>> On Mon, 13.07.2009 at 19:15:03 +0200, Ulrich Spörlein wrote:
>>>
>>>> On Sun, 12.07.2009 at 14:22:23 -0700, Kip Macy wrote:
>>>>
>>>>> On Sun, Jul 12, 2009 at 1:31 PM, Ulrich Spörlein<uqs@spoerlein.net> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> 8.0 BETA1 @ r195622 will panic reliably when running the clang static
>>>>>> analyzer on a buildworld with something like the following panic:
>>>>>>
>>>>>> panic: vm_page_free_toq: freeing mapped page 0xffffff00c9715b30
>>>>>> cpuid = 1
>>>>>> KDB: stack backtrace:
>>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>>>>>> panic() at panic+0x182
>>>>>> vm_page_free_toq() at vm_page_free_toq+0x1f6
>>>>>> vm_object_terminate() at vm_object_terminate+0xb7
>>>>>> vm_object_deallocate() at vm_object_deallocate+0x17a
>>>>>> _vm_map_unlock() at _vm_map_unlock+0x70
>>>>>> vm_map_remove() at vm_map_remove+0x6f
>>>>>> vmspace_free() at vmspace_free+0x56
>>>>>> vmspace_exec() at vmspace_exec+0x56
>>>>>> exec_new_vmspace() at exec_new_vmspace+0x133
>>>>>> exec_elf32_imgact() at exec_elf32_imgact+0x2ee
>>>>>> kern_execve() at kern_execve+0x3b2
>>>>>> execve() at execve+0x3d
>>>>>> syscall() at syscall+0x1af
>>>>>> Xfast_syscall() at Xfast_syscall+0xe1
>>>>>> --- syscall (59, FreeBSD ELF64, execve), rip = 0x800c20d0c, rsp = 0x7fffffffd6f8, rbp = 0x7fffffffdbf0 ---
>>>>>>
>>>>> Can you try the following change:
>>>>>
>>>>> http://svn.freebsd.org/viewvc/base/user/kmacy/releng_7_2_fcs/sys/vm/vm_object.c?r1=192842&r2=195297
>>>>>
>>>> Applied this to HEAD by hand and ran with it, it died 20-30 minutes into
>>>> the scan-build run. So no luck there. Next up is a test using the
>>>> GENERIC kernel.
>>>>
>>> No improvement with a GENERIC kernel. Next up will be to run this with
>>> clean sysctl, loader.conf, etc. Then I'll try disabling SMP.
>>>
>>> Does the backtrace above point to any specific subsystem? I'm using UFS,
>>> ZFS and GELI on this machine and could try a few combinations...
>>>
>> The interesting thing about the backtrace is that it shows a 32-bit i386
>> executable being started on a 64-bit amd64 machine. I've seen this
>> backtrace once before, and you'll find it in the PR database. In that
>> case, the problem "went away" after the known-to-be-broken
>> ZERO_COPY_SOCKETS option was removed from the reporter's kernel
>> configuration. However, I don't see that as the culprit here.
>>
>
> Hi Alan, first the bad news
>
> I ran this test with a GENERIC kernel, SMP disabled, hw.physmem set to 2
> GB in single user mode, so no other processes or daemons running,
> nothing special in loader.conf except for ZFS and GELI. It reliably
> panics, so nothing new here.
>
> Now the good news, you may be able to crash your own amd64 box in 3
> minutes by doing:
>
> mkdir /tmp/foo && cd /tmp/foo
> fetch -o- https://www.spoerlein.net/pub/llvm-clang.tar.gz | tar xf -
> while :; do for d in bin sbin usr.bin usr.sbin; do $PWD/scan-build -o /dev/null -k make -C /usr/src/$d clean obj depend all; done; done
>
> Please note that scan-build/ccc-analyzer won't actually do anything, as
> they cannot create output in /dev/null. So this is just running the
> perl-script and forking make/sh/awk/ccc-analyzer like mad. It does not
> survive 3 minutes on my Core2 Duo 3.3 GHz.
>

Hi Ulrich,

I finally got a chance to try this workload. I'm afraid that I can't
reproduce the assertion failure on my amd64 test machine. I left the
test running overnight, and it was still going strong this morning. I am
using neither ZFS nor GELI. Is it possible for you to repeat this test
without ZFS and/or GELI?

I would also be curious if anyone else reading this message can
reproduce the assertion failure with the above test.

Regards,
Alan