From: The BSD Dreamer
Date: Thu, 13 Jun 2013 19:52:17 -0500
To: freebsd-fs@freebsd.org
Subject: Re: ZFS triggered 9-STABLE r246646 panic "vdrop: holdcnt 0"
Message-ID: <51BA6941.7040909@tardisi.com>
In-Reply-To: <513E8E95.6010802@freebsd.org>

On 03/11/2013 21:10, Lawrence Stewart wrote:
> Hi all,
>
> I got this panic yesterday. I haven't seen it before (or since), but I
> have the crashdump and kernel here if there's additional information I
> can provide that would be useful in finding the cause.
>
> The machine runs ZFS exclusively and was under quite heavy CPU and IO
> load at the time of the crash as I was compiling in a VirtualBox VM and
> on the host itself, as well as running a full KDE desktop environment.
> I'm fairly certain the machine was not swapping at the time of the crash.
>
> lstewart@lstewart> uname -a
> FreeBSD lstewart 9.1-STABLE FreeBSD 9.1-STABLE #8 r246646M: Mon Feb 11
> 14:57:13 EST 2013
> root@lstewart:/usr/obj/usr/src/sys/LSTEWART-DESKTOP amd64
>
> lstewart@lstewart> sudo kgdb /boot/kernel/kernel /var/crash/vmcore.0
>
> [...]
>
> (kgdb) bt
> #0 doadump (textdump=) at pcpu.h:229
> #1 0xffffffff808e5824 in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:448
> #2 0xffffffff808e5d27 in panic (fmt=0x1
) at
> /usr/src/sys/kern/kern_shutdown.c:636
> #3 0xffffffff8097a71e in vdropl (vp=) at
> /usr/src/sys/kern/vfs_subr.c:2465
> #4 0xffffffff80b4da2b in vm_page_alloc (object=0xffffffff8132c000,
> pindex=143696, req=32) at /usr/src/sys/vm/vm_page.c:1569
> #5 0xffffffff80b3f312 in kmem_back (map=0xfffffe00020000e8,
> addr=18446743524542296064, size=131072, flags=705200752)
> at /usr/src/sys/vm/vm_kern.c:361

I just came home to find that my system had panicked (around 11:30am),
and this thread was the only FreeBSD 9 'panic: vdrop: holdcnt 0' report
that I found.

The machine runs ZFS exclusively as well. CPU would be busy, since I run
BOINC and distributed.net (go Team FreeBSD :)

And IO load would be high from BackupPC_nightly running. Out of the box
this job starts at 1am, but I had moved it to 11am so that it doesn't
collide with everything that gets scheduled in cron around 1am, along
with all the backups that I'm running, and so that it stays out of the
way when I'm checking email and such first thing in the morning over
coffee before heading into work. It takes a few hours to grind through
the 7.2TB zpool...

It's possible that this was happening when the job was set to 1am, but I
never had a crash dump when it happened and no indication that a panic
was why. Though I did later find out that recollindex cleans itself up
when something goes wrong by sending TERM to its pgid, and running
recollindex as root from cron during this time means it's sending TERM
to init. Not running it anymore seems to have solved that, and there
didn't seem to be any reason to move BackupPC_nightly back. Plus, that
other problem would have me wake up to find the machine with the console
in single-user mode. With this one, I came home to the GNOME login
screen.

So, my system is:

lchen@zen:~ 102> uname -a
FreeBSD zen.lhaven.homeip.net 9.1-RELEASE-p3 FreeBSD 9.1-RELEASE-p3 #0:
Mon Apr 29 18:27:25 UTC 2013
root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

but, when I try to look at the dump:

lchen@zen:~ 103> sudo kgdb /boot/kernel/kernel /var/crash/vmcore.0
Password:
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging
symbols found)...
Attempt to extract a component of a value that is not a structure pointer.
Attempt to extract a component of a value that is not a structure pointer.
#0 0xffffffff808e9ecb in doadump ()
(kgdb)

There's no kernel.symbols either. The only one present is in the backup
of my 9.0 kernel. Is that because I've been using freebsd-update to
update?

Here's the info.0 file:

lchen@zen:~ 104> sudo cat /var/crash/info.0
Dump header from device /dev/gpt/swap0
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 9172926464B (8747 MB)
  Blocksize: 512
  Dumptime: Thu Jun 13 11:31:10 2013
  Hostname: zen.lhaven.homeip.net
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 9.1-RELEASE-p3 #0: Mon Apr 29 18:27:25 UTC 2013
    root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
  Panic String: vdrop: holdcnt 0
  Dump Parity: 4285100545
  Bounds: 0
  Dump Status: good

So, just to see if anything meaningful might result, I moved my
/etc/make.conf aside, did a "make buildkernel", and tried:

kgdb /usr/obj/usr/src/sys/generic/kernel.debug /var/crash/vmcore.0

which gets me this:

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: vdrop: holdcnt 0
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff809208d6 at kdb_backtrace+0x66
#1 0xffffffff808ea8ee at panic+0x1ce
#2 0xffffffff8097fa86 at vdropl+0x366
#3 0xffffffff80b522ab at vm_page_alloc+0x28b
#4 0xffffffff80bd9096 at uma_small_alloc+0x66
#5 0xffffffff80b3b5fa at keg_alloc_slab+0x9a
#6 0xffffffff80b3bb72 at keg_fetch_slab+0xb2
#7 0xffffffff80b3bede at zone_fetch_slab+0x3e
#8 0xffffffff80b3b229 at zone_alloc_item+0x59
#9 0xffffffff80b3b431 at uma_large_malloc+0x31
#10 0xffffffff808d5a99 at malloc+0xd9
#11 0xffffffff815b28ee at zio_write_bp_init+0x1fe
#12 0xffffffff815b2063 at zio_execute+0xc3
#13 0xffffffff815b3fad at zio_ready+0x17d
#14 0xffffffff815b2063 at zio_execute+0xc3
#15 0xffffffff8092cf85 at taskqueue_run_locked+0x85
#16 0xffffffff8092df06 at taskqueue_thread_loop+0x46
#17 0xffffffff808bba1f at fork_exit+0x11f
Uptime: 15d13h35m36s
Dumping 8747 out of 16308 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/nullfs.ko...done.
Loaded symbols for /boot/kernel/nullfs.ko
Reading symbols from /boot/kernel/zfs.ko...done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/if_tap.ko...done.
Loaded symbols for /boot/kernel/if_tap.ko
Reading symbols from /boot/kernel/aio.ko...done.
Loaded symbols for /boot/kernel/aio.ko
Reading symbols from /boot/kernel/accf_data.ko...done.
Loaded symbols for /boot/kernel/accf_data.ko
Reading symbols from /boot/kernel/accf_http.ko...done.
Loaded symbols for /boot/kernel/accf_http.ko
Reading symbols from /boot/kernel/coretemp.ko...done.
Loaded symbols for /boot/kernel/coretemp.ko
Reading symbols from /boot/kernel/cpuctl.ko...done.
Loaded symbols for /boot/kernel/cpuctl.ko
Reading symbols from /boot/kernel/sem.ko...done.
Loaded symbols for /boot/kernel/sem.ko
Reading symbols from /boot/modules/cuse4bsd.ko...done.
Loaded symbols for /boot/modules/cuse4bsd.ko
Reading symbols from /boot/modules/vboxdrv.ko...done.
Loaded symbols for /boot/modules/vboxdrv.ko
Reading symbols from /boot/modules/nvidia.ko...done.
Loaded symbols for /boot/modules/nvidia.ko
Reading symbols from /boot/kernel/linux.ko...done.
Loaded symbols for /boot/kernel/linux.ko
Reading symbols from /boot/kernel/libiconv.ko...done.
Loaded symbols for /boot/kernel/libiconv.ko
Reading symbols from /boot/kernel/libmchain.ko...done.
Loaded symbols for /boot/kernel/libmchain.ko
Reading symbols from /boot/kernel/cd9660_iconv.ko...done.
Loaded symbols for /boot/kernel/cd9660_iconv.ko
Reading symbols from /boot/kernel/msdosfs_iconv.ko...done.
Loaded symbols for /boot/kernel/msdosfs_iconv.ko
Reading symbols from /boot/kernel/ichwd.ko...done.
Loaded symbols for /boot/kernel/ichwd.ko
Reading symbols from /boot/kernel/fdescfs.ko...done.
Loaded symbols for /boot/kernel/fdescfs.ko
Reading symbols from /boot/kernel/ipl.ko...done.
Loaded symbols for /boot/kernel/ipl.ko
Reading symbols from /boot/modules/vboxnetflt.ko...done.
Loaded symbols for /boot/modules/vboxnetflt.ko
Reading symbols from /boot/kernel/netgraph.ko...done.
Loaded symbols for /boot/kernel/netgraph.ko
Reading symbols from /boot/kernel/ng_ether.ko...done.
Loaded symbols for /boot/kernel/ng_ether.ko
Reading symbols from /boot/modules/vboxnetadp.ko...done.
Loaded symbols for /boot/modules/vboxnetadp.ko
Reading symbols from /usr/local/modules/fuse.ko...done.
Loaded symbols for /usr/local/modules/fuse.ko
Reading symbols from /boot/kernel/linprocfs.ko...done.
Loaded symbols for /boot/kernel/linprocfs.ko
Reading symbols from /boot/kernel/linsysfs.ko...done.
Loaded symbols for /boot/kernel/linsysfs.ko
Reading symbols from /usr/local/libexec/linux_adobe/linux_adobe.ko...done.
Loaded symbols for /usr/local/libexec/linux_adobe/linux_adobe.ko
Reading symbols from /usr/local/modules/rtc.ko...done.
Loaded symbols for /usr/local/modules/rtc.ko
#0 doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
224 __asm("movq %%gs:0,%0" : "=r" (td));
(kgdb) bt
#0 doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
#1 0xffffffff808ea3d1 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:448
#2 0xffffffff808ea8c7 in panic (fmt=0x1
) at /usr/src/sys/kern/kern_shutdown.c:636
#3 0xffffffff8097fa86 in vdropl (vp=Variable "vp" is not available.
) at /usr/src/sys/kern/vfs_subr.c:2400
#4 0xffffffff80b522ab in vm_page_alloc (object=0x0, pindex=0, req=32)
    at /usr/src/sys/vm/vm_page.c:1537
#5 0xffffffff80bd9096 in uma_small_alloc (zone=Variable "zone" is not available.
) at /usr/src/sys/amd64/amd64/uma_machdep.c:58
#6 0xffffffff80b3b5fa in keg_alloc_slab (keg=0xfffffe043ffef0e0,
    zone=0xfffffe043ffee000, wait=258) at /usr/src/sys/vm/uma_core.c:844
#7 0xffffffff80b3bb72 in keg_fetch_slab (keg=0xfffffe043ffef0e0,
    zone=0xfffffe043ffee000, flags=2) at /usr/src/sys/vm/uma_core.c:2173
#8 0xffffffff80b3bede in zone_fetch_slab (zone=0xfffffe043ffee000,
    keg=0xfffffe043ffef0e0, flags=2) at /usr/src/sys/vm/uma_core.c:2233
#9 0xffffffff80b3b229 in zone_alloc_item (zone=0xfffffe043ffee000,
    udata=0x0, flags=2) at /usr/src/sys/vm/uma_core.c:2490
#10 0xffffffff80b3b431 in uma_large_malloc (size=16384, wait=2)
    at /usr/src/sys/vm/uma_core.c:3064
#11 0xffffffff808d5a99 in malloc (size=16384, mtp=0xffffffff81734c20,
    flags=2) at /usr/src/sys/kern/kern_malloc.c:492
#12 0xffffffff815b28ee in zio_write_bp_init () from /boot/kernel/zfs.ko
---Type to continue, or q to quit---
#13 0x0000000000000010 in ?? ()
#14 0xfffffe022b9726e0 in ?? ()
#15 0xfffffe03c81a2a50 in ?? ()
#16 0xffffff801b78e880 in ?? ()
#17 0xfffffe000e99e000 in ?? ()
#18 0xffffff8471d93ae0 in ?? ()
#19 0xffffffff815b2063 in zio_execute () from /boot/kernel/zfs.ko
#20 0x0000000000000000 in ?? ()
#21 0x0000000000000000 in ?? ()
#22 0xfffffe03c81a2a50 in ?? ()
#23 0xffffff801b78e880 in ?? ()
#24 0xfffffe000e99e000 in ?? ()
#25 0xffffff8471d93b10 in ?? ()
#26 0xffffffff815b3fad in zio_ready () from /boot/kernel/zfs.ko
#27 0xfffffe03c81a2a50 in ?? ()
#28 0x0000000000000006 in ?? ()
#29 0x0000000000000006 in ?? ()
#30 0xffffff8471d93b50 in ?? ()
#31 0xffffffff815b2063 in zio_execute () from /boot/kernel/zfs.ko
#32 0xfffffe0013c79800 in ??
()
#33 0xfffffe03c81a2d90 in ?? ()
#34 0xfffffe0013c70000 in ?? ()
#35 0x0000000000000001 in ?? ()
---Type to continue, or q to quit---
#36 0xfffffe0013c70000 in ?? ()
#37 0xffffff8471d93bc0 in ?? ()
#38 0xffffffff8092cf85 in taskqueue_run_locked (queue=0xffffff800904e380)
    at /usr/src/sys/kern/subr_taskqueue.c:308
Previous frame inner to this frame (corrupt stack?)
(kgdb) l *0xffffffff8097fa86
0xffffffff8097fa86 is at /usr/src/sys/kern/vfs_subr.c:2400.
2395        int active;
2396
2397        ASSERT_VI_LOCKED(vp, "vdropl");
2398        CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
2399        if (vp->v_holdcnt <= 0)
2400                panic("vdrop: holdcnt %d", vp->v_holdcnt);
2401        vp->v_holdcnt--;
2402        if (vp->v_holdcnt > 0) {
2403                VI_UNLOCK(vp);
2404                return;

So the rebuilt kernel.debug seems to work, but beyond the fact that the
code panics if vp->v_holdcnt is <= 0, I don't know how to find out why
this count had come to be 0 when the kernel thinks it shouldn't have.

I have periodic (about twice a year) scrubs enabled on my system, and
the zpool for backuppc was last scrubbed on May 24th (it took 47h57m -
repaired 0 with 0 errors.)

-- 
Name: Lawrence "The Dreamer" Chen    Email: beastie@tardisi.com
Snail: 1530 College Ave, A5          Blog: http://lawrencechen.net
Manhattan, KS 66502-2768             Phone: 785-789-4132