Date: Thu, 13 Aug 2009 16:27:23 +0200
From: Peter Holm <pho@freebsd.org>
To: Alan Cox <alc@cs.rice.edu>, current@freebsd.org, Kip Macy <kmacy@freebsd.org>
Subject: Re: panic: vm_page_free_toq: freeing mapped page
Message-ID: <20090813142723.GA62890@x2.osted.lan>
In-Reply-To: <20090813132907.GA1591@roadrunner.spoerlein.net>
References: <20090713181650.GB76464@acme.spoerlein.net>
    <4A5B7D24.60100@cs.rice.edu>
    <20090714105245.GR2145@acme.spoerlein.net>
    <4A82DFBF.5020101@cs.rice.edu>
    <20090813132907.GA1591@roadrunner.spoerlein.net>

On Thu, Aug 13, 2009 at 03:29:07PM +0200, Ulrich Spörlein wrote:
> On Wed, 12.08.2009 at 10:29:03 -0500, Alan Cox wrote:
> > Ulrich Spörlein wrote:
> > > On Mon, 13.07.2009 at 13:29:56 -0500, Alan Cox wrote:
> > >
> > >> Ulrich Spörlein wrote:
> > >>
> > >>> On Mon, 13.07.2009 at 19:15:03 +0200, Ulrich Spörlein wrote:
> > >>>
> > >>>> On Sun, 12.07.2009 at 14:22:23 -0700, Kip Macy wrote:
> > >>>>
> > >>>>> On Sun, Jul 12, 2009 at 1:31 PM, Ulrich Spörlein<uqs@spoerlein.net> wrote:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>> 8.0 BETA1 @ r195622 will panic reliably when running the clang static
> > >>>>>> analyzer on a buildworld with something like the following panic:
> > >>>>>>
> > >>>>>> panic: vm_page_free_toq: freeing mapped page 0xffffff00c9715b30
> > >>>>>> cpuid = 1
> > >>>>>> KDB: stack backtrace:
> > >>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > >>>>>> panic() at panic+0x182
> > >>>>>> vm_page_free_toq() at vm_page_free_toq+0x1f6
> > >>>>>> vm_object_terminate() at vm_object_terminate+0xb7
> > >>>>>> vm_object_deallocate() at vm_object_deallocate+0x17a
> > >>>>>> _vm_map_unlock() at _vm_map_unlock+0x70
> > >>>>>> vm_map_remove() at vm_map_remove+0x6f
> > >>>>>> vmspace_free() at vmspace_free+0x56
> > >>>>>> vmspace_exec() at vmspace_exec+0x56
> > >>>>>> exec_new_vmspace() at exec_new_vmspace+0x133
> > >>>>>> exec_elf32_imgact() at exec_elf32_imgact+0x2ee
> > >>>>>> kern_execve() at kern_execve+0x3b2
> > >>>>>> execve() at execve+0x3d
> > >>>>>> syscall() at syscall+0x1af
> > >>>>>> Xfast_syscall() at Xfast_syscall+0xe1
> > >>>>>> --- syscall (59, FreeBSD ELF64, execve), rip = 0x800c20d0c, rsp = 0x7fffffffd6f8, rbp = 0x7fffffffdbf0 ---
> > >>>>>>
> > >>>>> Can you try the following change:
> > >>>>>
> > >>>>> http://svn.freebsd.org/viewvc/base/user/kmacy/releng_7_2_fcs/sys/vm/vm_object.c?r1=192842&r2=195297
> > >>>>>
> > >>>> Applied this to HEAD by hand and ran with it, it died 20-30 minutes into
> > >>>> the scan-build run. So no luck there. Next up is a test using the
> > >>>> GENERIC kernel.
> > >>>>
> > >>> No improvement with a GENERIC kernel. Next up will be to run this with
> > >>> clean sysctl, loader.conf, etc. Then I'll try disabling SMP.
> > >>>
> > >>> Does the backtrace above point to any specific subsystem? I'm using UFS,
> > >>> ZFS and GELI on this machine and could try a few combinations...
> > >>>
> > >> The interesting thing about the backtrace is that it shows a 32-bit i386
> > >> executable being started on a 64-bit amd64 machine. I've seen this
> > >> backtrace once before, and you'll find it in the PR database. In that
> > >> case, the problem "went away" after the known-to-be-broken
> > >> ZERO_COPY_SOCKETS option was removed from the reporter's kernel
> > >> configuration. However, I don't see that as the culprit here.
> > >>
> > >
> > > Hi Alan, first the bad news
> > >
> > > I ran this test with a GENERIC kernel, SMP disabled, hw.physmem set to 2
> > > GB in single user mode, so no other processes or daemons running,
> > > nothing special in loader.conf except for ZFS and GELI. It reliably
> > > panics, so nothing new here.
> > >
> > > Now the good news, you may be able to crash your own amd64 box in 3
> > > minutes by doing:
> > >
> > > mkdir /tmp/foo && cd /tmp/foo
> > > fetch -o- https://www.spoerlein.net/pub/llvm-clang.tar.gz | tar xf -
> > > while :; do for d in bin sbin usr.bin usr.sbin; do $PWD/scan-build -o /dev/null -k make -C /usr/src/$d clean obj depend all; done; done
> > >
> > > Please note that scan-build/ccc-analyzer won't actually do anything, as
> > > they cannot create output in /dev/null. So this is just running the
> > > perl-script and forking make/sh/awk/ccc-analyzer like mad. It does not
> > > survive 3 minutes on my Core2 Duo 3.3 GHz.
> > >
> >
> > Hi Ulrich,
> >
> > I finally got a chance to try this workload. I'm afraid that I can't
> > reproduce the assertion failure on my amd64 test machine. I left the
> > test running overnight, and it was still going strong this morning.
> >
> > I am using neither ZFS nor GELI. Is it possible for you to repeat this
> > test without ZFS and/or GELI?
> >
> > I would also be curious if anyone else reading this message can
> > reproduce the assertion failure with the above test.
>
> Now isn't this great :/
>
> I haven't tracked the bug for the last couple of weeks, but the system
> was updated to recent HEAD and got its ports rebuilt (several times).
>
> I don't know which change "fixed" it, but I think it was the perl
> rebuild (I had some trouble with perl5.10 on 8.0 at first). Besides, the
> process doing the fork in the backtrace was always the perl binary,
> IIRC.
>
> So right now I'm no longer able to reproduce it myself ...
>
> Regards,
> Uli

Using your test scenario I got the panic.

- Peter
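
For anyone else who wants to repeat the test scenario quoted above, the one-line loop can be unrolled into a small script. This is only a sketch of the same commands from the thread, not something posted by any of the participants: the DRY_RUN, SCAN_BUILD and SRC_ROOT variables and the run_one/run_pass function names are hypothetical additions for illustration. By default it only prints the commands, since the real loop is reported to panic an affected machine within minutes.

```shell
#!/bin/sh
# Sketch of the reproduction loop quoted in the thread.
# DRY_RUN, SCAN_BUILD and SRC_ROOT are hypothetical knobs, not part of the
# original one-liner; the scan-build/make arguments are copied from it.

SCAN_BUILD="${SCAN_BUILD:-$PWD/scan-build}"   # extracted from llvm-clang.tar.gz
SRC_ROOT="${SRC_ROOT:-/usr/src}"
DIRS="bin sbin usr.bin usr.sbin"

# Print (dry run) or execute one scan-build pass over a single directory.
run_one() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$SCAN_BUILD -o /dev/null -k make -C $SRC_ROOT/$1 clean obj depend all"
    else
        "$SCAN_BUILD" -o /dev/null -k make -C "$SRC_ROOT/$1" clean obj depend all
    fi
}

# One pass over all four source directories.
run_pass() {
    for d in $DIRS; do
        run_one "$d"
    done
}

# Dry run: show one pass and stop. Otherwise loop forever, as in the thread.
if [ "${DRY_RUN:-1}" = 1 ]; then
    run_pass
else
    while :; do run_pass; done
fi
```

Running it as-is prints the four commands; setting DRY_RUN=0 starts the endless loop and should only be done on a test machine that is allowed to panic.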