Date:      Tue, 19 Sep 2006 05:54:23 +0200
From:      Frode Nordahl <frode@nordahl.net>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: RELENG_6 Livelock
Message-ID:  <AA1A02F8-5E60-42BD-9114-2050AD59A7F3@nordahl.net>
In-Reply-To: <200609181614.52260.jhb@freebsd.org>
References:  <FC17EAA4-B5B1-48E4-BA69-1D8EF4E0F047@nordahl.net> <200609162242.56480.jhb@freebsd.org> <E1BEEA4F-FADF-4AA8-8F82-CC3DB5AA9AC3@nordahl.net> <200609181614.52260.jhb@freebsd.org>

On 18. sep. 2006, at 22.14, John Baldwin wrote:

> On Sunday 17 September 2006 02:05, Frode Nordahl wrote:
>> On 17. sep. 2006, at 04.42, John Baldwin wrote:
>>
>>> On Saturday 16 September 2006 16:55, Frode Nordahl wrote:
>>>> On 16. sep. 2006, at 22.22, Frode Nordahl wrote:
>>>>
>>>>> On 16. sep. 2006, at 22.09, John Baldwin wrote:
>>>>>
>>>>>> On Saturday 16 September 2006 07:02, Frode Nordahl wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> FreeBSD localhost.localdomain 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE
>>>>>>> #1: Wed Sep 13 00:10:04 CEST 2006
>>>>>>> frode@localhost.localdomain:/usr/obj/usr/src/sys/PT  i386
>>>>>>>
>>>>>>> After running some stress tests for 3 days, I wanted to remove some
>>>>>>> large directories.
>>>>>>
>>>>>> Do you have a coredump?  I assume you do from your debug output.  Can
>>>>>> you download http://www.FreeBSD.org/~jhb/gdb/gdb6, fire up kgdb, and
>>>>>> once in kgdb, do 'source /path/to/gdb6' and then run 'ps' and reply
>>>>>> with the output from that?
>>>>>
>>>>> I am sorry, I have not. I tried to call doadump, but there was no
>>>>> dumpdevice configured :-(
>>>>>
>>>>> Somehow I have convinced myself that this was turned on by default
>>>>> now, so I have not enabled it explicitly in rc.conf. Is there any
>>>>> way to tell DDB what dumpdevice to use directly?
>>>>>
>>>>> I will configure a dumpdevice and try really hard to make it happen
>>>>> again.
>>>>
>>>> I was able to reproduce the livelock again, and this time I had the
>>>> system armed with dumpon :-)
>>>>
>>>> Here is the output you requested:
>>>> (kgdb) ps
>>>>    pid  ppid  pgrp   uid   state   wmesg     wchan    cmd
>>>> 2535  2499  2535     0  R+      CPU  0              rm
>>>> 2534  2499  2534     0  L+     *Giant    0xc6704580 rm
>>>> 2533  2499  2533     0  L+     *Giant    0xc6704580 rm
>>>> 2532  2499  2532     0  R+                          rm
>>>> 2531  2499  2531     0  L+     *Giant    0xc6704580 rm
>>>> 2499  2496  2499     0  Ss+     ttyin    0xc655d810 bash
>>>> 2496   784  2496     0  Rs                          sshd
>>>
>>> Ok, do 'lockchain 2534' in kgdb (with gdb6 sourced) and let me see the
>>> output from that.
>>
>> (kgdb) lockchain 2534
>> thread 100038 (pid 2534, rm) blocked on lock 0xc09e6800 "Giant"
>> thread 100091 (pid 2535, rm) running on CPU 0
>
> Ok, do 'proc 2535' followed by 'where'

(kgdb) proc 2535
(kgdb) where
#0  doadump () at pcpu.h:165
#1  0xc04733bb in db_fncall (dummy1=1016, dummy2=0, dummy3=-319658232,
     dummy4=0xecf2670c "@g??") at /usr/src/sys/ddb/db_command.c:492
#2  0xc04731c0 in db_command (last_cmdp=0xc09cb624, cmd_table=0x0,
     aux_cmd_tablep=0xc092a838, aux_cmd_tablep_end=0xc092a854)
     at /usr/src/sys/ddb/db_command.c:350
#3  0xc0473288 in db_command_loop () at /usr/src/sys/ddb/db_command.c:458
#4  0xc0474e95 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:221
#5  0xc0696203 in kdb_trap (type=3, code=0, tf=0xecf2684c)
     at /usr/src/sys/kern/subr_kdb.c:473
#6  0xc089140c in trap (frame=
       {tf_fs = -319684600, tf_es = -1066860504, tf_ds = -1064304600, tf_edi = 249, tf_esi = -967491584, tf_ebp = -319657844, tf_isp = -319657864, tf_ebx = -963122944, tf_edx = 0, tf_ecx = -1056878592, tf_eax = 34, tf_trapno = 3, tf_err = 0, tf_eip = -1066836089, tf_cs = 32, tf_eflags = 130, tf_esp = -319657816, tf_ss = -1064914410})
     at /usr/src/sys/i386/i386/trap.c:594
#7  0xc087f49a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#8  0xc0695f87 in kdb_enter (msg=0x22 <Address 0x22 out of bounds>)
     at cpufunc.h:60
#9  0xc086b216 in siointr1 (com=0xc6554000) at /usr/src/sys/dev/sio/sio.c:1522
#10 0xc086aff4 in siointr (arg=0xc6554000) at /usr/src/sys/dev/sio/sio.c:1391
#11 0xc0883491 in intr_execute_handlers (isrc=0xc63854c4, iframe=0xecf268f8)
     at /usr/src/sys/i386/i386/intr_machdep.c:233
#12 0xc0885852 in lapic_handle_intr (frame=
       {if_vec = 56, if_fs = 8, if_es = 40, if_ds = 40, if_edi = 36, if_esi = -961899520, if_ebp = -319657648, if_ebx = -873832448, if_edx = 100875, if_ecx = 3111, if_eax = 40, if_eip = -1065518606, if_cs = 32, if_eflags = 643, if_esp = 40, if_ss = 40})
     at /usr/src/sys/i386/i386/local_apic.c:606
#13 0xc087f853 in Xapic_isr1 () at apic_vector.s:110
#14 0xc07d79f2 in ufsdirhash_adjfree (dh=0xc6aa9400, offset=1593184, diff=-16)
     at /usr/src/sys/ufs/ufs/ufs_dirhash.c:917
#15 0xc07d6478 in ufsdirhash_build (ip=0xc8b3018c)
     at /usr/src/sys/ufs/ufs/ufs_dirhash.c:246
#16 0xc07d84b1 in ufs_lookup (ap=0xecf26a7c)
     at /usr/src/sys/ufs/ufs/ufs_lookup.c:192
#17 0xc08a28b8 in VOP_CACHEDLOOKUP_APV (vop=0x28, a=0x18a0b) at vnode_if.c:150
#18 0xc06c985e in vfs_cache_lookup (ap=0x28) at vnode_if.h:82
#19 0xc08a2847 in VOP_LOOKUP_APV (vop=0xc09b5840, a=0xecf26b18)
     at vnode_if.c:99
#20 0xc06cde31 in lookup (ndp=0xecf26ba0) at vnode_if.h:56
#21 0xc06cd6d2 in namei (ndp=0xecf26ba0) at /usr/src/sys/kern/vfs_lookup.c:211
#22 0xc06dbe93 in kern_lstat (td=0xc697e900,
     path=0x18a0b <Address 0x18a0b out of bounds>, pathseg=100875,
     sbp=0xecf26c74) at /usr/src/sys/kern/vfs_syscalls.c:2147
#23 0xc06dbe2f in lstat (td=0xc697e900, uap=0xecf26d04)
     at /usr/src/sys/kern/vfs_syscalls.c:2130
#24 0xc0891c83 in syscall (frame=
       {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 134541896, tf_esi = 134541824, tf_ebp = -1077941288, tf_isp = -319656604, tf_ebx = 672435584, tf_edx = 134541824, tf_ecx = 0, tf_eax = 190, tf_trapno = 12, tf_err = 2, tf_eip = 672322675, tf_cs = 51, tf_eflags = 662, tf_esp = -1077941444, tf_ss = 59})
     at /usr/src/sys/i386/i386/trap.c:983
#25 0xc087f4ef in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
#26 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)
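
As an aside, here is a minimal Python sketch of how the 'ps' listing quoted
earlier in this thread can be summarised to show which processes are waiting
on which lock. It is not part of the gdb6 script. The function name
blocked_on_locks is hypothetical, and the parsing assumes that a state
beginning with "L" and a wmesg beginning with "*" mark a thread blocked on
the named lock, as they do in ps(1).

```python
# Rows copied verbatim from the kgdb 'ps' output quoted in this thread.
PS_OUTPUT = """\
   pid  ppid  pgrp   uid   state   wmesg     wchan    cmd
2535  2499  2535     0  R+      CPU  0              rm
2534  2499  2534     0  L+     *Giant    0xc6704580 rm
2533  2499  2533     0  L+     *Giant    0xc6704580 rm
2532  2499  2532     0  R+                          rm
2531  2499  2531     0  L+     *Giant    0xc6704580 rm
2499  2496  2499     0  Ss+     ttyin    0xc655d810 bash
2496   784  2496     0  Rs                          sshd
"""


def blocked_on_locks(ps_text):
    """Return {lock name: [(pid, cmd), ...]} for rows waiting on a lock.

    Assumption: state starting with 'L' plus a wmesg starting with '*'
    means "blocked on that lock"; every other row is ignored.
    """
    waiters = {}
    for line in ps_text.splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a process row.
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        pid, state, cmd = int(fields[0]), fields[4], fields[-1]
        middle = fields[5:-1]  # wmesg/wchan columns; may be empty
        if state.startswith("L") and middle and middle[0].startswith("*"):
            lock_name = middle[0][1:]  # drop the leading '*'
            waiters.setdefault(lock_name, []).append((pid, cmd))
    return waiters


if __name__ == "__main__":
    for lock, procs in blocked_on_locks(PS_OUTPUT).items():
        pids = ", ".join(str(pid) for pid, _ in procs)
        print(f"{lock}: pids {pids}")
```

For the listing above this groups pids 2534, 2533 and 2531 under Giant, which
matches the lockchain output showing pid 2535 as the running thread.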

--
Frode Nordahl





