From owner-freebsd-current@FreeBSD.ORG Wed Aug 12 16:51:14 2009
Delivered-To: current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id A01EE106564A;
    Wed, 12 Aug 2009 16:51:14 +0000 (UTC)
    (envelope-from serenity@exscape.org)
Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212])
    by mx1.freebsd.org (Postfix) with ESMTP id 1B3728FC41;
    Wed, 12 Aug 2009 16:51:14 +0000 (UTC)
Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:38257 helo=mx.exscape.org)
    by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68)
    (envelope-from )
    id 1MbH2b-0002sR-5x; Wed, 12 Aug 2009 18:51:12 +0200
Received: from [192.168.1.5] (macbookpro [192.168.1.5])
    (using TLSv1 with cipher AES128-SHA (128/128 bits))
    (No client certificate requested)
    by mx.exscape.org (Postfix) with ESMTPSA id D75234470A;
    Wed, 12 Aug 2009 18:50:47 +0200 (CEST)
From: Thomas Backman
To: Alan Cox
In-Reply-To: <4A82DFBF.5020101@cs.rice.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed; delsp=yes
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0 (Apple Message framework v936)
Date: Wed, 12 Aug 2009 18:50:45 +0200
References: <20090713181650.GB76464@acme.spoerlein.net>
    <4A5B7D24.60100@cs.rice.edu>
    <20090714105245.GR2145@acme.spoerlein.net>
    <4A82DFBF.5020101@cs.rice.edu>
X-Mailer: Apple Mail (2.936)
X-Originating-IP: 83.253.252.234
X-Scan-Result: No virus found in message 1MbH2b-0002sR-5x.
X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MbH2b-0002sR-5x 9f7f9ee8c275d8821bede7a8e8dcf3bc
Cc: current@freebsd.org, Kip Macy
Subject: Re: panic: vm_page_free_toq: freeing mapped page
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Wed, 12 Aug 2009 16:51:14 -0000

On Aug 12, 2009, at 17:29, Alan Cox wrote:

> Ulrich Spörlein wrote:
>> On Mon, 13.07.2009 at 13:29:56 -0500, Alan Cox wrote:
>>
>>> Ulrich Spörlein wrote:
>>>
>>>> On Mon, 13.07.2009 at 19:15:03 +0200, Ulrich Spörlein wrote:
>>>>
>>>>> On Sun, 12.07.2009 at 14:22:23 -0700, Kip Macy wrote:
>>>>>
>>>>>> On Sun, Jul 12, 2009 at 1:31 PM, Ulrich Spörlein wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> 8.0 BETA1 @ r195622 will panic reliably when running the clang static
>>>>>>> analyzer on a buildworld with something like the following panic:
>>>>>>>
>>>>>>> panic: vm_page_free_toq: freeing mapped page 0xffffff00c9715b30
>>>>>>> cpuid = 1
>>>>>>> KDB: stack backtrace:
>>>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>>>>>>> panic() at panic+0x182
>>>>>>> vm_page_free_toq() at vm_page_free_toq+0x1f6
>>>>>>> vm_object_terminate() at vm_object_terminate+0xb7
>>>>>>> vm_object_deallocate() at vm_object_deallocate+0x17a
>>>>>>> _vm_map_unlock() at _vm_map_unlock+0x70
>>>>>>> vm_map_remove() at vm_map_remove+0x6f
>>>>>>> vmspace_free() at vmspace_free+0x56
>>>>>>> vmspace_exec() at vmspace_exec+0x56
>>>>>>> exec_new_vmspace() at exec_new_vmspace+0x133
>>>>>>> exec_elf32_imgact() at exec_elf32_imgact+0x2ee
>>>>>>> kern_execve() at kern_execve+0x3b2
>>>>>>> execve() at execve+0x3d
>>>>>>> syscall() at syscall+0x1af
>>>>>>> Xfast_syscall() at Xfast_syscall+0xe1
>>>>>>> --- syscall (59, FreeBSD ELF64, execve), rip = 0x800c20d0c, rsp = 0x7fffffffd6f8, rbp = 0x7fffffffdbf0 ---
>>>>>>>
>>>>>> Can you try the following change:
>>>>>>
>>>>>> http://svn.freebsd.org/viewvc/base/user/kmacy/releng_7_2_fcs/sys/vm/vm_object.c?r1=192842&r2=195297
>>>>>>
>>>>> Applied this to HEAD by hand an ran with it, it died 20-30 minutes into
>>>>> the scan-build run. So no luck there. Next up is a test using the
>>>>> GENERIC kernel.
>>>>>
>>>> No improvement with a GENERIC kernel. Next up will be to run this with
>>>> clean sysctl, loader.conf, etc. Then I'll try disabling SMP.
>>>>
>>>> Does the backtrace above point to any specific subsystem? I'm using UFS,
>>>> ZFS and GELI on this machine and could try a few combinations...
>>>>
>>> The interesting thing about the backtrace is that it shows a 32-bit
>>> i386 executable being started on a 64-bit amd64 machine. I've seen
>>> this backtrace once before, and you'll find it in the PR database.
>>> In that case, the problem "went away" after the known-to-be-broken
>>> ZERO_COPY_SOCKETS option was removed from the reporter's kernel
>>> configuration. However, I don't see that as the culprit here.
>>>
>>
>> Hi Alan, first the bad news
>>
>> I ran this test with a GENERIC kernel, SMP disabled, hw.physmem set to 2
>> GB in single user mode, so no other processes or deamons running,
>> nothing special in loader.conf except for ZFS and GELI. It reliably
>> panics, so nothing new here.
>>
>> Now the good news, you may be able to crash your own amd64 box in 3
>> minutes by doing:
>>
>> mkdir /tmp/foo && cd /tmp/foo
>> fetch -o- https://www.spoerlein.net/pub/llvm-clang.tar.gz | tar xf -
>> while :; do for d in bin sbin usr.bin usr.sbin; do $PWD/scan-build -o /dev/null -k make -C /usr/src/$d clean obj depend all; done; done
>>
>> Please note that scan-build/ccc-analyzer wont actually do anything, as
>> they cannot create output in /dev/null.
>> So this is just running the
>> perl-script and forking make/sh/awk/ccc-analyzer like mad. It does not
>> survive 3 minutes on my Core2 Duo 3.3 GHz.
>>
>
> Hi Ulrich,
>
> I finally got a chance to try this workload. I'm afraid that I can't
> reproduce the assertion failure on my amd64 test machine. I left the
> test running overnight, and it was still going strong this morning.
>
> I am using neither ZFS nor GELI. Is it possible for you to repeat
> this test without ZFS and/or GELI?
>
> I would also be curious if anyone else reading this message can
> reproduce the assertion failure with the above test.

It ran fine for me for an hour as well, assuming the error messages
regarding /dev/null/2009-08-12-1/ are normal. No crashes or panics.
amd64 with ZFS root (UFS boot) and DTrace. No patch relating to this
applied.

dmesg:
FreeBSD 8.0-BETA2 #3 r196086M: Sun Aug 9 21:03:12 CEST 2009
    root@chaos.exscape.org:/usr/obj/usr/src/sys/DTRACE
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 Processor 3200+ (2009.27-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x10ff0  Stepping = 0
  Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
  AMD Features=0xe2500800
  AMD Features2=0x1
real memory  = 2147483648 (2048 MB)
avail memory = 2051895296 (1956 MB)
ACPI APIC Table:
This module (opensolaris) contains code covered by the
Common Development and Distribution License (CDDL)
see http://opensolaris.org/os/licensing/opensolaris_license/
ioapic0 irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, 7fef0000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
acpi_button0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pci0: at device 0.0 (no driver attached)
isab0: at device 1.0 on pci0
isa0: on isab0
pci0: at device 1.1 (no driver attached)
ohci0: mem 0xfe02f000-0xfe02ffff irq 21 at device 2.0 on pci0
ohci0: [ITHREAD]
usbus0: on ohci0
ehci0: mem 0xfe02e000-0xfe02e0ff irq 22 at device 2.1 on pci0
ehci0: [ITHREAD]
usbus1: EHCI version 1.0
usbus1: on ehci0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfb00-0xfb0f at device 6.0 on pci0
ata0: on atapci0
ata0: [ITHREAD]
ata1: on atapci0
ata1: [ITHREAD]
atapci1: port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xf600-0xf60f mem 0xfe02b000-0xfe02bfff irq 23 at device 7.0 on pci0
atapci1: [ITHREAD]
ata2: on atapci1
ata2: [ITHREAD]
ata3: on atapci1
ata3: [ITHREAD]
atapci2: port 0x9e0-0x9e7,0xbe0-0xbe3,0x960-0x967,0xb60-0xb63,0xf100-0xf10f mem 0xfe02a000-0xfe02afff irq 21 at device 8.0 on pci0
atapci2: [ITHREAD]
ata4: on atapci2
ata4: [ITHREAD]
ata5: on atapci2
ata5: [ITHREAD]
pcib1: at device 9.0 on pci0
pci1: on pcib1
vgapci0: mem 0xfcff8000-0xfcffbfff,0xfd000000-0xfd7fffff,0xfc000000-0xfc7fffff irq 17 at device 7.0 on pci1
xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xdf00-0xdf7f mem 0xfcfff000-0xfcfff07f irq 18 at device 9.0 on pci1
miibus0: on xl0
xlphy0: <3c905C 10/100 internal PHY> PHY 24 on miibus0
xlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl0: Ethernet address: 00:50:da:44:c0:4a
xl0: [ITHREAD]
nfe0: port 0xf000-0xf007 mem 0xfe029000-0xfe029fff irq 22 at device 10.0 on pci0
miibus1: on nfe0
e1000phy0: PHY 1 on miibus1
e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
nfe0: Ethernet address: 00:13:d3:a2:aa:0f
nfe0: [FILTER]
pcib2: at device 11.0 on pci0
pci2: on pcib2
pcib3: at device 12.0 on pci0
pci3: on pcib3
pcib4: at device 13.0 on pci0
pci4: on pcib4
pcib5: at device 14.0 on pci0
pci5: on pcib5
amdtemp0: on hostb3
acpi_tz0: on acpi0
atrtc0: port 0x70-0x73 irq 8 on acpi0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
cpu0: on acpi0
powernow0: on cpu0
device_attach: powernow0 attach returned 6
orm0: at iomem 0xc0000-0xc7fff,0xc8000-0xcbfff,0xcc000-0xcc7ff on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: cannot reserve I/O port range
WARNING: ZFS is considered to be an experimental feature in FreeBSD.
Timecounter "TSC" frequency 2009269338 Hz quality 800
Timecounters tick every 1.000 msec
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 480Mbps High Speed USB v2.0
ZFS NOTICE: system has less than 4GB and prefetch enable is not set... disabling.
ZFS filesystem version 13
ZFS storage pool version 13
ad0: 76318MB at ata0-master UDMA100
ad2: 9768MB at ata1-master UDMA100
ugen0.1: at usbus0
uhub0: on usbus0
GEOM: ad2s1: geometry does not match label (255h,63s != 16h,63s).
Root mount waiting for: usbus1 usbus0
uhub0: 10 ports with 10 removable, self powered
ugen1.1: at usbus1
uhub1: on usbus1
Root mount waiting for: usbus1
Root mount waiting for: usbus1
Root mount waiting for: usbus1
Root mount waiting for: usbus1
uhub1: 10 ports with 10 removable, self powered
Trying to mount root from zfs:tank/root

Regards,
Thomas