Date:      Thu, 16 Sep 2004 08:50:58 -0400 (EDT)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Julian Elischer <julian@elischer.org>
Cc:        freebsd-threads@freebsd.org
Subject:   Re: Unkillable KSE threaded proc
Message-ID:  <16713.35890.516192.596992@grasshopper.cs.duke.edu>
In-Reply-To: <414942B3.1060703@elischer.org>
References:  <16703.11479.679335.588170@grasshopper.cs.duke.edu> <16703.12410.319869.29996@grasshopper.cs.duke.edu> <413F55B8.50003@elischer.org> <16703.28031.454342.774229@grasshopper.cs.duke.edu> <413F8DBB.5040502@elischer.org> <16704.40876.708925.425911@grasshopper.cs.duke.edu> <4140AA2A.90605@elischer.org> <16704.45327.42494.922427@grasshopper.cs.duke.edu> <4140C04D.1060906@elischer.org> <16704.49447.290897.602540@grasshopper.cs.duke.edu> <4146AAC1.5020701@elischer.org> <16711.383.448500.578640@grasshopper.cs.duke.edu> <414942B3.1060703@elischer.org>


Julian Elischer writes:
 > Andrew, please try -current on its own now..
 > I have checked in some fixes that have helped others.

I just tried it three times and got two different results:
two system lockups and one lingering thread.
This is with PREEMPTION.  I'm going to try again in a second w/o PREEMPTION.

The last system lockup was kind of interesting; here are some details.
For all my test setups, there has been one mx_pingpong running as
root, and one mx_pingpong running as me.

After the skill, a vmstat (running as root) kept going, and showed
that the test was still running (like the signal bounced off of it).
Further confirmation is that the mx_pingpong running as root exited
normally, indicating that the other side had run to completion.

I then killed vmstat and did a 'ps ax'.  The ps got stuck on the
skill'ed mx_pingpong's proc lock (note the address passed to the
mtx_lock in the ps's frame).  At this point, it looked like this:

KDB: enter: Line break on console
[thread 100146]
Stopped at      kdb_enter+0x30: leave
db> sho pcpu
cpuid        = 0
curthread    = 0xc1a15960: pid 561 "ps"
curpcb       = 0xe67b2da0
fpcurthread  = none
idlethread   = 0xc1561640: pid 12 "idle: cpu0"
APIC ID      = 0
currentldt   = 0x30
db>
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  561 c1a14a80 e67de000    0   541   561 0004002 [CPU 0] ps
  551 c1647e00 e5321000 1387     1   549 000c482 (threaded)  mx_pingpong
   thread 0xc1646c80 ksegrp 0xc15ba690 [CPU 1]
   thread 0xc1646af0 ksegrp 0xc15ba690 [SUSP]
  541 c1a18c40 e67e8000    0   538   541 0004002 [SLPQ pause 0xc1a18c78][SLP] csh
 
<...>

db> tr
kdb_enter(c066f281,46,40,c16f3140,e67b2b14) at kdb_enter+0x30
siointr1(c1637800,0,c066f049,6ad,e67b2afc) at siointr1+0xd1
siointr(c1637800,0,c06a19a0,0,4) at siointr+0x35
intr_execute_handlers(c1556e90,e67b2b14,e67b2b74,c061bf53,34) at intr_execute_handlers+0xb8
lapic_handle_intr(34) at lapic_handle_intr+0x3b
Xapic_isr1() at Xapic_isr1+0x33
--- interrupt, eip = 0xc04cd32b, esp = 0xe67b2b58, ebp = 0xe67b2b74 ---
_mtx_lock_sleep(c1647e6c,c1a15960,0,c065b894,3c5) at _mtx_lock_sleep+0x12e
_mtx_lock_flags(c1647e6c,0,c065b894,3c5,0) at _mtx_lock_flags+0x9f
sysctl_kern_proc(c0687d00,e67b2c88,0,e67b2c10,e67b2c10) at sysctl_kern_proc+0x241
sysctl_root(0,e67b2c7c,3,e67b2c10,c1a15960) at sysctl_root+0x13b
userland_sysctl(c1a15960,e67b2c7c,3,0,bfbfe28c) at userland_sysctl+0x11c
__sysctl(c1a15960,e67b2d14,18,8053000,6) at __sysctl+0xb0
syscall(2f,2f,2f,bfbfe28c,bfbfe2c0) at syscall+0x271
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x280f3ee7, esp = 0xbfbfe22c, ebp = 0xbfbfe258 ---
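The address passed to _mtx_lock_sleep in the trace above can be tied back to pid 551 by subtracting the proc pointer shown for that pid in the process listing. The sketch below only does that arithmetic. The helper names are hypothetical, and reading the resulting offset as the mutex embedded in the proc structure is an assumption that fits the PROC_LOCK(p) call but has not been checked against the structure layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Byte offset of addr inside an object that starts at base. */
static uint32_t
offset_within(uint32_t base, uint32_t addr)
{
	assert(addr >= base);	/* addr must not precede the object */
	return (addr - base);
}

/* Hypothetical helper: print how the mutex address relates to pid 551. */
static void
show_lock_owner(void)
{
	const uint32_t proc_551 = 0xc1647e00;	/* "proc" column for pid 551 */
	const uint32_t mtx_arg = 0xc1647e6c;	/* first arg of _mtx_lock_sleep */

	printf("mutex is at proc + 0x%x\n",
	    (unsigned)offset_within(proc_551, mtx_arg));
}
```

Calling show_lock_owner() reports an offset of 0x6c, which places the mutex a short distance inside the pid 551 proc rather than inside the ps process's own proc at c1a14a80.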


According to gdb:

0xc04d085d is in sysctl_kern_proc (../../../kern/kern_proc.c:965).
960                             if (p->p_state == PRS_NEW) {
961                                     mtx_unlock_spin(&sched_lock);
962                                     continue;
963                             }
964                             mtx_unlock_spin(&sched_lock);
965                             PROC_LOCK(p);
966                             /*
967                              * Show a user only appropriate processes.
968                              */
969                             if (p_cansee(curthread, p)) {
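The listing shows the shape of the loop: skip processes that are still new, take the process lock, then apply the visibility check. A rough user-space model of that shape, assuming pthread mutexes in place of the kernel's proc lock, shows why one lock that is never released stalls the whole scan. All names here (model_proc, model_scan, and so on) are hypothetical and this is not the kernel code. It uses a trylock so that the example returns instead of hanging, where the real PROC_LOCK would block.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum model_state { MODEL_NEW, MODEL_NORMAL };

struct model_proc {
	enum model_state state;	/* stand-in for p_state */
	bool visible;		/* stand-in for the visibility check */
	pthread_mutex_t lock;	/* stand-in for the per-process lock */
};

/*
 * Count the entries the caller may see.  Returns -1 at the first
 * entry whose lock is already held, which is the point where a
 * blocking lock call would wait.
 */
static int
model_scan(struct model_proc *procs, size_t n)
{
	int shown = 0;

	assert(procs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct model_proc *p = &procs[i];

		if (p->state == MODEL_NEW)
			continue;		/* skipped without locking */
		if (pthread_mutex_trylock(&p->lock) != 0)
			return (-1);		/* lock held elsewhere */
		if (p->visible)
			shown++;
		pthread_mutex_unlock(&p->lock);
	}
	return (shown);
}
```

In this model a scan over unlocked entries completes and returns a count, while a single entry whose lock stays held makes every later scan stop at that entry, which matches ps getting stuck once it reaches the mx_pingpong process.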



db> call db_trace_thread(0xc1646c80, -1)
sched_switch(c1646c80,c159f190,2,117,6a5c13ea) at sched_switch+0x16e
mi_switch(2,c1646c80,c1646c80,c06ad340,4) at mi_switch+0x2ad
maybe_preempt(e52d1bec,e52d1b78,c04e7482,c06ad340,c1646c80) at maybe_preempt+0x192
(null)(0,c1646c88,0,c1646c90,0) at 0x240
end(c15ba690,c15ba694,c1646c80,c1646af8,c1646af0) at 0xc15ba690
end(c15e4460,c15e4464,c187a960,c187a968,0) at 0xc1a14a80
<...>
db> call db_trace_thread(0xc1646af0, -1)
sched_switch(c1646af0,0,1,11d,4b34ccaa) at sched_switch+0x16e
mi_switch(1,0,c065cd70,335,c1647e6c) at mi_switch+0x2ad
thread_single(1,0,c0659772,88,e52cec70) at thread_single+0x1d7
exit1(c1646af0,9,c065c386,996,1) at exit1+0xd5
expand_name(c1646af0,9,c065c386,928,0) at expand_name
postsig(9,0,c065f070,100,1020800) at postsig+0x1e0
ast(e52ced48) at ast+0x46e
doreti_ast() at doreti_ast+0x17
0


Drew


