Date: Wed, 29 Sep 2004 17:45:31 -0400 (EDT)
From: Andrew Gallatin <gallatin@cs.duke.edu>
To: Julian Elischer <julian@elischer.org>
Cc: freebsd-threads@freebsd.org
Subject: Re: easy to reproduce unkillable threads
Message-ID: <16731.11515.504636.53058@grasshopper.cs.duke.edu>
In-Reply-To: <415B1ED6.8010809@elischer.org>
References: <16728.37731.540143.307772@grasshopper.cs.duke.edu> <41589B4A.9080508@elischer.org> <415AB791.10809@freebsd.org> <16730.48642.4481.841374@grasshopper.cs.duke.edu> <415B13E8.2090205@elischer.org> <16731.6010.446877.347190@grasshopper.cs.duke.edu> <415B1ED6.8010809@elischer.org>

I tried a -current kernel (w/o your patch) from today (still RELENG_5
userland), and I still see the problem.

% ssh scream 'skill -9 -u gallatin'
Connection to scream closed by remote host.
% ssh scream 'ps axH | grep testc'
  580  ??  SLs    0:00.01 csh -c ps axH | grep testc
  586  ??  RL     0:00.00 grep testc
  535  p0- WL     0:06.21 ./testcdev

On scream's console, send break to debugger:

Stopped at      kdb_enter+0x30: leave
db> ps
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  548 c1a39c40 e67ee000 1387   547   548 0004002 [SLPQ ttyin 0xc1830c10][SLP] csh
  547 c1a39a80 e67ed000 1387   545   545 0000100 [SLPQ select 0xc071aaa4][SLP] sshd
  545 c1817000 e5556000    0   450   545 0000100 [SLPQ sbwait 0xc1991320][SLP] sshd
  535 c1a34e00 e67e6000 1387     1   535 020c482 (threaded)  testcdev
        thread 0xc164dc80 ksegrp 0xc15e57e0 [SUSP]
  511 c1a34a80 e67e4000    0     1   511 0004002 [SLPQ ttyin 0xc1705810][SLP] getty
db> trace 535
sched_switch(c164dc80,c164daf0,1,4ec51334,ed18649a) at sched_switch+0x137
mi_switch(1,c164daf0,0,0,0) at mi_switch+0x1d4
thread_single(1,c164dc80,0,0,0) at thread_single+0x1d7
exit1(c164dc80,9,0,0,c051996e) at exit1+0x115
expand_name(c164dc80,9,100,0,0) at expand_name
postsig(9,c164dc80,0,0,0) at postsig+0x204
ast(e52d1d48) at ast+0x5e4
doreti_ast() at doreti_ast+0x17
db> c

It seems to be just a problem with skill -9.  skill -2 works fine.

As I said before, libthr seems to behave differently.  Rather than a
lingering thread, the polling thread (doing the while(1)) is stuck on
the CPU (using 100% of one cpu in a dual system), and the thread which
was doing the cv_wait() is stuck with the exact same stack as above:

  629 c1a1da80 e67e7000 1387     1   629 0004482 (threaded)  testcdev
        thread 0xc164dc80 ksegrp 0xc15e54d0 [SUSP]
        thread 0xc1879af0 ksegrp 0xc15e54d0 [CPU 1]
db> trace 629
sched_switch(c164dc80,0,1,b5d71f28,b4e1d6c8) at sched_switch+0x137
mi_switch(1,0,c1870880,c164dc80,c164dc80) at mi_switch+0x1d4
thread_single(1,c164dc80,e52d1c54,c1b14100,c164dc80) at thread_single+0x1d7
exit1(c164dc80,9,0,e52d1ce4,c051996e) at exit1+0x115
expand_name(c164dc80,9,100,0,0) at expand_name
postsig(9,246,c06e7bd0,36,bfafefb4) at postsig+0x1a4
ast(e52d1d48) at ast+0x5e4
doreti_ast() at doreti_ast+0x17
db> c

Drew