Date:      Thu, 9 Sep 2004 17:38:33 -0400 (EDT)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Julian Elischer <julian@elischer.org>
Cc:        freebsd-threads@freebsd.org
Subject:   Re: Unkillable KSE threaded proc
Message-ID:  <16704.52569.375858.857614@grasshopper.cs.duke.edu>
In-Reply-To: <4140C04D.1060906@elischer.org>
References:  <16703.11479.679335.588170@grasshopper.cs.duke.edu> <16703.12410.319869.29996@grasshopper.cs.duke.edu> <413F55B8.50003@elischer.org> <16703.28031.454342.774229@grasshopper.cs.duke.edu> <413F8DBB.5040502@elischer.org> <16704.40876.708925.425911@grasshopper.cs.duke.edu> <4140AA2A.90605@elischer.org> <16704.45327.42494.922427@grasshopper.cs.duke.edu> <4140C04D.1060906@elischer.org>

Julian Elischer writes:
 > I think that this would possibly GO AWAY if you disabled preemption,
 > which would make it very hard to debug :-)

Nope, still happens w/o preempt.  And it's the "worse" problem of deadlocking
the system rather than just having the process fail to exit.

db> ps
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  579 c37e41c0 e8855000 1387   578   579 0004002 [SLPQ ttyin 0xc17df810][SLP] csh
  578 c1817540 e671a000 1387   576   576 0000100 [SLPQ select 0xc06cb704][SLP] sshd
  576 c37e4540 e8857000    0   451   576 0000100 [SLPQ sbwait 0xc1983e84][SLP] sshd
  566 c1a1fc40 e67ba000 1387     1   564 000c482 (threaded)  mx_pingpong
   thread 0xc37944b0 ksegrp 0xc1a20460 [CPU 0]
   thread 0xc3794640 ksegrp 0xc1a20460 [SUSP]
   thread 0xc187e320 ksegrp 0xc1a20460 [RUNQ]
   thread 0xc187e4b0 ksegrp 0xc187fee0 [CPU 1]

db> call db_trace_thread(0xc37944b0, -1)
kdb_enter(c0686ceb,c0645179,fc,c37944b0,c16bd000) at kdb_enter+0x30
siointr1(c16bd000,2,fc,e8842ba0,c0650df2) at siointr1+0xd1
siointr(c16bd000,c37944b0,c1a1fc40,4,c37944b0) at siointr+0x77
intr_execute_handlers(c1556e90,e8842be0,e8842c28,c0639f53,34) at intr_execute_handlers+0x8d
lapic_handle_intr(34) at lapic_handle_intr+0x3b
Xapic_isr1() at Xapic_isr1+0x33
--- interrupt, eip = 0xc04ea44a, esp = 0xe8842c24, ebp = 0xe8842c28 ---
thread_suspend_check(0,246,e8842c60,c0501b86,c37944b0) at thread_suspend_check+0x21f
exit1(c37944b0,9,0,0,c04e1e66) at exit1+0x109
expand_name(c37944b0,9,100,0,0) at expand_name
postsig(9,c37944b0,0,0,0) at postsig+0x204
ast(e8842d48) at ast+0x5e4
doreti_ast() at doreti_ast+0x17
0
db> call db_trace_thread(0xc3794640, -1)
sched_switch(c3794640,c37944b0,0,94bc2c2e,2227b660) at sched_switch+0xd8
mi_switch(1,c37944b0,0,0,0) at mi_switch+0x1c7
thread_single(1,c3794640,0,0,0) at thread_single+0x1d7
exit1(c3794640,9,e8845cbc,e8845ce4,c04e1e66) at exit1+0x115
expand_name(c3794640,9,100,0,0) at expand_name
postsig(9,c3794640,0,0,0) at postsig+0x204
ast(e8845d48) at ast+0x5e4
doreti_ast() at doreti_ast+0x17
0
db> call db_trace_thread(0xc187e320, -1)
sched_switch(c187e320,0,0,9a67657e,e2359ef4) at sched_switch+0xd8
mi_switch(2,0,0,0,0) at mi_switch+0x1c7
ast(e6749d48) at ast+0x4eb
doreti_ast() at doreti_ast+0x17
0
db> call db_trace_thread(0xc187e4b0, -1)
sched_switch(c187fee0,1e,0,1e,0) at sched_switch+0xd8
0
db> show pcpu
cpuid        = 0
curthread    = 0xc37944b0: pid 566 "mx_pingpong"
curpcb       = 0xe8842da0
fpcurthread  = none
idlethread   = 0xc1561640: pid 12 "idle: cpu0"
APIC ID      = 0
currentldt   = 0x30
db> show pcpu 1
cpuid        = 1
curthread    = 0xc187e4b0: pid 566 "mx_pingpong"
curpcb       = 0xe674cda0
fpcurthread  = none
idlethread   = 0xc15614b0: pid 11 "idle: cpu1"
APIC ID      = 1
currentldt   = 0x30


According to kgdb, the lock holder for the proc lock
is 0xc37944b0:

(kgdb) p/x td->td_proc->p_mtx->mtx_lock
$8 = 0xc37944b2
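
For anyone reading along, here is a rough sketch of how that lock word maps back to the owning thread. It assumes the flag layout from the 5.x sys/mutex.h (MTX_RECURSED = 0x1, MTX_CONTESTED = 0x2, owner pointer in the remaining bits); check the values against your own tree. The helper names (mtx_owner_of and friends) are made up for illustration and are not kernel functions.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed flag values, as in FreeBSD 5.x sys/mutex.h; verify in your tree. */
#define MTX_RECURSED   0x00000001u
#define MTX_CONTESTED  0x00000002u
#define MTX_FLAGMASK   (~(uintptr_t)(MTX_RECURSED | MTX_CONTESTED))

/* Hypothetical helper: strip the flag bits to get the owning thread pointer. */
static uintptr_t
mtx_owner_of(uintptr_t lock_word)
{
	return (lock_word & MTX_FLAGMASK);
}

/* Hypothetical helper: nonzero if the contested bit is set. */
static int
mtx_is_contested(uintptr_t lock_word)
{
	return ((lock_word & MTX_CONTESTED) != 0);
}

/* Hypothetical helper: nonzero if the recursed bit is set. */
static int
mtx_is_recursed(uintptr_t lock_word)
{
	return ((lock_word & MTX_RECURSED) != 0);
}

/* Print a one-line summary of a raw mtx_lock value. */
static void
describe_mtx_lock(uintptr_t lock_word)
{
	printf("owner=%#lx contested=%d recursed=%d\n",
	    (unsigned long)mtx_owner_of(lock_word),
	    mtx_is_contested(lock_word),
	    mtx_is_recursed(lock_word));
}
```

Calling describe_mtx_lock(0xc37944b2) masks off the low bits and gives 0xc37944b0 as the owner, with the contested bit set. If the assumed flag values hold, that would mean at least one other thread is waiting on the proc lock.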


Maybe it's some sort of spinlock deadlock.  I'm going to enable witness
and try again.

Drew