From: Frode Nordahl
Date: Tue, 19 Sep 2006 05:54:23 +0200
To: John Baldwin
Cc: freebsd-stable@freebsd.org
Subject: Re: RELENG_6 Livelock

On 18. sep. 2006, at 22.14, John Baldwin wrote:

> On Sunday 17 September 2006 02:05, Frode Nordahl wrote:
>> On 17. sep. 2006, at 04.42, John Baldwin wrote:
>>
>>> On Saturday 16 September 2006 16:55, Frode Nordahl wrote:
>>>> On 16. sep. 2006, at 22.22, Frode Nordahl wrote:
>>>>
>>>>> On 16. sep. 2006, at 22.09, John Baldwin wrote:
>>>>>
>>>>>> On Saturday 16 September 2006 07:02, Frode Nordahl wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> FreeBSD localhost.localdomain 6.2-PRERELEASE FreeBSD
>>>>>>> 6.2-PRERELEASE #1: Wed Sep 13 00:10:04 CEST 2006
>>>>>>> frode@localhost.localdomain:/usr/obj/usr/src/sys/PT i386
>>>>>>>
>>>>>>> After running some stress tests for 3 days, I wanted to
>>>>>>> remove some large directories.
>>>>>>
>>>>>> Do you have a coredump?  I assume you do from your debug
>>>>>> output.  Can you download
>>>>>> http://www.FreeBSD.org/~jhb/gdb/gdb6, fire up kgdb, and once
>>>>>> in kgdb, do 'source /path/to/gdb6' and then run 'ps' and
>>>>>> reply with the output from that?
>>>>>
>>>>> I am sorry, I have not.  I tried to call doadump, but there was
>>>>> no dumpdevice configured :-(
>>>>>
>>>>> Somehow I have convinced myself that this was turned on by
>>>>> default now, so I have not enabled it explicitly in rc.conf.
>>>>> Is there any way to tell DDB what dumpdevice to use directly?
>>>>>
>>>>> I will configure a dumpdevice and try really hard to make it
>>>>> happen again.
>>>>
>>>> I was able to reproduce the livelock again, and this time I had
>>>> the system armed with dumpon :-)
>>>>
>>>> Here is the output you requested:
>>>> (kgdb) ps
>>>>   pid  ppid  pgrp  uid  state  wmesg   wchan       cmd
>>>>  2535  2499  2535    0  R+     CPU 0               rm
>>>>  2534  2499  2534    0  L+     *Giant  0xc6704580  rm
>>>>  2533  2499  2533    0  L+     *Giant  0xc6704580  rm
>>>>  2532  2499  2532    0  R+                         rm
>>>>  2531  2499  2531    0  L+     *Giant  0xc6704580  rm
>>>>  2499  2496  2499    0  Ss+    ttyin   0xc655d810  bash
>>>>  2496   784  2496    0  Rs                         sshd
>>>
>>> Ok, do 'lockchain 2534' in kgdb (with gdb6 sourced) and let me
>>> see the output from that.
>>
>> (kgdb) lockchain 2534
>> thread 100038 (pid 2534, rm) blocked on lock 0xc09e6800 "Giant"
>> thread 100091 (pid 2535, rm) running on CPU 0
>
> Ok, do 'proc 2535' followed by 'where'

(kgdb) proc 2535
(kgdb) where
#0  doadump () at pcpu.h:165
#1  0xc04733bb in db_fncall (dummy1=1016, dummy2=0, dummy3=-319658232,
    dummy4=0xecf2670c "@g??") at /usr/src/sys/ddb/db_command.c:492
#2  0xc04731c0 in db_command (last_cmdp=0xc09cb624, cmd_table=0x0,
    aux_cmd_tablep=0xc092a838, aux_cmd_tablep_end=0xc092a854)
    at /usr/src/sys/ddb/db_command.c:350
#3  0xc0473288 in db_command_loop () at /usr/src/sys/ddb/db_command.c:458
#4  0xc0474e95 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:221
#5  0xc0696203 in kdb_trap (type=3, code=0, tf=0xecf2684c)
    at /usr/src/sys/kern/subr_kdb.c:473
#6  0xc089140c in trap (frame=
      {tf_fs = -319684600, tf_es = -1066860504, tf_ds = -1064304600,
       tf_edi = 249, tf_esi = -967491584, tf_ebp = -319657844,
       tf_isp = -319657864, tf_ebx = -963122944, tf_edx = 0,
       tf_ecx = -1056878592, tf_eax = 34, tf_trapno = 3, tf_err = 0,
       tf_eip = -1066836089, tf_cs = 32, tf_eflags = 130,
       tf_esp = -319657816, tf_ss = -1064914410})
    at /usr/src/sys/i386/i386/trap.c:594
#7  0xc087f49a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#8  0xc0695f87 in kdb_enter (msg=0x22) at cpufunc.h:60
#9  0xc086b216 in siointr1 (com=0xc6554000)
    at /usr/src/sys/dev/sio/sio.c:1522
#10 0xc086aff4 in siointr (arg=0xc6554000)
    at /usr/src/sys/dev/sio/sio.c:1391
#11 0xc0883491 in intr_execute_handlers (isrc=0xc63854c4,
    iframe=0xecf268f8) at /usr/src/sys/i386/i386/intr_machdep.c:233
---Type to continue, or q to quit---
#12 0xc0885852 in lapic_handle_intr (frame=
      {if_vec = 56, if_fs = 8, if_es = 40, if_ds = 40, if_edi = 36,
       if_esi = -961899520, if_ebp = -319657648, if_ebx = -873832448,
       if_edx = 100875, if_ecx = 3111, if_eax = 40,
       if_eip = -1065518606, if_cs = 32, if_eflags = 643, if_esp = 40,
       if_ss = 40}) at /usr/src/sys/i386/i386/local_apic.c:606
#13 0xc087f853 in Xapic_isr1 () at apic_vector.s:110
#14 0xc07d79f2 in ufsdirhash_adjfree (dh=0xc6aa9400, offset=1593184,
    diff=-16) at /usr/src/sys/ufs/ufs/ufs_dirhash.c:917
#15 0xc07d6478 in ufsdirhash_build (ip=0xc8b3018c)
    at /usr/src/sys/ufs/ufs/ufs_dirhash.c:246
#16 0xc07d84b1 in ufs_lookup (ap=0xecf26a7c)
    at /usr/src/sys/ufs/ufs/ufs_lookup.c:192
#17 0xc08a28b8 in VOP_CACHEDLOOKUP_APV (vop=0x28, a=0x18a0b)
    at vnode_if.c:150
#18 0xc06c985e in vfs_cache_lookup (ap=0x28) at vnode_if.h:82
#19 0xc08a2847 in VOP_LOOKUP_APV (vop=0xc09b5840, a=0xecf26b18)
    at vnode_if.c:99
#20 0xc06cde31 in lookup (ndp=0xecf26ba0) at vnode_if.h:56
#21 0xc06cd6d2 in namei (ndp=0xecf26ba0)
    at /usr/src/sys/kern/vfs_lookup.c:211
#22 0xc06dbe93 in kern_lstat (td=0xc697e900, path=0x18a0b,
    pathseg=100875, sbp=0xecf26c74)
    at /usr/src/sys/kern/vfs_syscalls.c:2147
#23 0xc06dbe2f in lstat (td=0xc697e900, uap=0xecf26d04)
    at /usr/src/sys/kern/vfs_syscalls.c:2130
---Type to continue, or q to quit---
#24 0xc0891c83 in syscall (frame=
      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 134541896,
       tf_esi = 134541824, tf_ebp = -1077941288, tf_isp = -319656604,
       tf_ebx = 672435584, tf_edx = 134541824, tf_ecx = 0,
       tf_eax = 190, tf_trapno = 12, tf_err = 2, tf_eip = 672322675,
       tf_cs = 51, tf_eflags = 662, tf_esp = -1077941444, tf_ss = 59})
    at /usr/src/sys/i386/i386/trap.c:983
#25 0xc087f4ef in Xint0x80_syscall ()
    at /usr/src/sys/i386/i386/exception.s:200
#26 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

--
Frode Nordahl
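
For readers following along, the workflow used in this thread can be summarized roughly as below. This is only a sketch and was not posted by either participant. The swap device /dev/ad0s1b, the crash directory /var/crash, the vmcore.0 file name and the kernel.debug path are assumptions for illustration. The kgdb commands (source, ps, lockchain, proc, where), the doadump call and the PT kernel config name are taken from the messages above.

```
# /etc/rc.conf: configure a dump device at boot.
# The device name is an assumption; use your own swap partition.
dumpdev="/dev/ad0s1b"
dumpdir="/var/crash"

# Alternatively, arm a running system without rebooting:
#   dumpon /dev/ad0s1b
#
# When the livelock occurs, break into DDB and write the dump:
#   db> call doadump
#
# After the reboot, savecore(8) copies the dump into the crash
# directory (the rc scripts normally do this; it can be run by hand):
#   savecore /var/crash /dev/ad0s1b
#
# Open the dump in kgdb. kernel.debug is only present if the kernel
# was built with debug symbols; the path below is an assumption.
#   kgdb /usr/obj/usr/src/sys/PT/kernel.debug /var/crash/vmcore.0
#   (kgdb) source /path/to/gdb6
#   (kgdb) ps
#   (kgdb) lockchain 2534
#   (kgdb) proc 2535
#   (kgdb) where
```

The pids passed to lockchain and proc are the ones from this particular dump: 2534 is one of the rm processes blocked on Giant, and 2535 is the thread that lockchain reported as running on CPU 0.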