Message-ID: <4147100C.8000005@elischer.org>
Date: Tue, 14 Sep 2004 08:36:44 -0700
From: Julian Elischer
To: Andrew Gallatin
cc: John Baldwin
cc: freebsd-threads@freebsd.org
In-Reply-To: <16711.383.448500.578640@grasshopper.cs.duke.edu>
Subject: Re: Unkillable KSE threaded proc

"bugger"

Andrew Gallatin wrote:

> Julian Elischer writes:
> > Andrew Gallatin wrote:
> > > Julian Elischer writes:
> > > > >
> > > > >Maybe this would be easier to debug if I disabled preemption?
> > > > >
> > > >
> > > > I think that this would possibly GO AWAY if you disabled preemption.
> > > > which would make it very hard to debug :-)
> > > >
> > >
> > > Yes and no. You initially asked me to try in -current because of
> > > some changes you'd made to the exit code. RELENG_5 (with the old
> > > exit code and no preemption) shows a different problem (proc is
> > > just not killable). If the proc was killable without preemption,
> > > that would at least show your new code is better..
> >
> > try the attached diff:
> >
>
> This is worse..
>
> It's worse in that the application never starts running fully, and that
> it seems to ignore signals entirely. I can't attach a debugger to it
> to see how far it got before hanging due to the signal problem. When
> it hangs, (both before and after a signal is sent) the CPU utilization
> is 0%.. Before it's sent a signal, it looks like this:
>
> 573 c1f3b8c0 e88ae000 1387 517 573 000c082 (threaded) mx_pingpong
> thread 0xc1f3e320 ksegrp 0xc19ead20 [RUNQ]
> thread 0xc1f3e4b0 ksegrp 0xc19ead20 [RUNQ]
> thread 0xc1f3e640 ksegrp 0xc19eaaf0 [SLPQ ksesigwait 0xc1f3b9c0][SLP]
>
>
> db> call db_trace_thread(0xc1f3e320, -1)
> sched_switch(c1f3e320,0,1,1862ccb2,994777d8) at sched_switch+0x137
> mi_switch(1,0,c05fdf59,804c000,c2b8c2ec) at mi_switch+0x1ce
> turnstile_wait(c1a518c0,c06c53e0,c1a4d7d0,0,1) at turnstile_wait+0x339
> _mtx_lock_sleep(c06c53e0,c1f3e320,0,0,0) at _mtx_lock_sleep+0x122
> vm_fault(c187a5dc,804c000,1,0,0) at vm_fault+0x214
> trap_pfault(e88b8d48,1,804c800,3,804c800) at trap_pfault+0x136
> trap(2f,2f,2f,805d13c,805d13c) at trap+0x201
> calltrap() at calltrap+0x5
> --- trap 0xc, eip = 0x804c800, esp = 0xbfbfe66c, ebp = 0xbfbfe678 ---
> 0
>
> db> call db_trace_thread(0xc1f3e4b0, -1)
> sched_switch(c1f3e4b0,0,1,f0007932,9935c3e9) at sched_switch+0x137
> mi_switch(1,0,c19ead60,e88bbc5c,c1f3e4b0) at mi_switch+0x1ce
> sleepq_switch(c19ead60,c1f3e4b0,0,e88bbc94,c04e5da6) at sleepq_switch+0x171
> sleepq_timedwait_sig(c19ead60,0,c1f3b92c,c0677640,100) at sleepq_timedwait_sig+0x13
> msleep(c19ead60,c1f3b92c,168,c0677640,1771) at msleep+0x37b
> kse_release(c1f3e4b0,e88bbd14,4,c04c47ab,0) at kse_release+0x29b
> syscall(2f,2f,2f,8054200,0) at syscall+0x2fc
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (383, FreeBSD ELF32, kse_release), eip = 0x280a3d4f, esp = 0x8194f80, ebp = 0x8194fbc ---
> 0
>
> db> call db_trace_thread(0xc1f3e640, -1)
> sched_switch(c1f3e640,0,1,bc7c14b2,97d6ec54) at sched_switch+0x137
> mi_switch(1,0,0,0,0) at mi_switch+0x1ce
> sleepq_switch(c1f3b9c0,c1f3e640,0,e88bec94,c04e5da6) at sleepq_switch+0x171
> sleepq_timedwait_sig(c1f3b9c0,0,0,0,0) at sleepq_timedwait_sig+0x13
> msleep(c1f3b9c0,c1f3b92c,168,c0677635,bb9) at msleep+0x37b
> kse_release(c1f3e640,e88bed14,4,c04c47ab,0) at kse_release+0x1a1
> syscall(2f,2f,2f,1,81) at syscall+0x2fc
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (383, FreeBSD ELF32, kse_release), eip = 0x280a3d4f, esp = 0xbfafef30, ebp = 0xbfafef8c ---
> 0
>
>
> A different run, but after sending it a ^C from the command line:
>
> 547 c1f3b1c0 e88aa000 0 1 547 000c482 (threaded) mx_pingpong
> thread 0xc1f3e960 ksegrp 0xc19eaee0 [RUNQ]
> thread 0xc1f3eaf0 ksegrp 0xc19eaee0 [RUNQ]
> thread 0xc1f3ec80 ksegrp 0xc19eab60 [SUSP]
>
> db> call db_trace_thread(0xc1f3e960, -1)
> sched_switch(c1f3e960,0,2,e7ff39b6,d6d80c8c) at sched_switch+0x137
> mi_switch(2,0,0,0,0) at mi_switch+0x1ce
> ast(e88c4d48) at ast+0x4eb
> doreti_ast() at doreti_ast+0x17
> 0
> db> call db_trace_thread(0xc1f3eaf0, -1)
> sched_switch(c1f3eaf0,0,1,6e2ca4e6,d6924d2f) at sched_switch+0x137
> mi_switch(1,0,c19eaf20,e88c7c5c,c1f3eaf0) at mi_switch+0x1ce
> sleepq_switch(c19eaf20,c1f3eaf0,0,e88c7c94,c04e5da6) at sleepq_switch+0x171
> sleepq_timedwait_sig(c19eaf20,0,c1f3b22c,c0677640,100) at sleepq_timedwait_sig+0x13
> msleep(c19eaf20,c1f3b22c,168,c0677640,1771) at msleep+0x37b
> kse_release(c1f3eaf0,e88c7d14,4,c04c47ab,0) at kse_release+0x29b
> syscall(2f,2f,2f,8054200,0) at syscall+0x2fc
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (383, FreeBSD ELF32, kse_release), eip = 0x280a3d4f, esp = 0x8194f80, ebp = 0x8194fbc ---
> 0
> db> call db_trace_thread(0xc1f3ec80, -1)
> sched_switch(c1f3ec80,0,1,26e24232,4249ca0b) at sched_switch+0x137
> mi_switch(1,0,0,0,0) at mi_switch+0x1ce
> thread_single(1,c1f3ec80,c1f3b1c0,e88cac5c,c0500581) at thread_single+0x1d7
> exit1(c1f3ec80,2,e88cacb8,c04f1736,0) at exit1+0x115
> expand_name(c1f3ec80,2,c1f3ec80,e88cad48,0) at expand_name
> kse_thr_interrupt(c1f3ec80,e88cad14,c,c1f3ec80,e88cad3c) at kse_thr_interrupt+0x329
> syscall(2f,2f,2f,8054100,805a800) at syscall+0x2fc
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (382, FreeBSD ELF32, kse_thr_interrupt), eip = 0x280a3d6f, esp = 0xbfafee60, ebp = 0xbfafeefc ---
> 0
>
>
> If you want line number translations, please let me know. I saved the
> kernel that this came from and also took a dump.
>
> Drew