From owner-freebsd-current Fri Mar 8 17:27:20 2002 Delivered-To: freebsd-current@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id 6A6EA37B41E for ; Fri, 8 Mar 2002 17:26:52 -0800 (PST) Received: from fledge.watson.org (fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.11.6/8.11.5) with SMTP id g291QfD96962; Fri, 8 Mar 2002 20:26:41 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Fri, 8 Mar 2002 20:26:41 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: David Wolfskill Cc: current@freebsd.org Subject: Re: Keyboard (quasi-)lockup running script In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-current@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG FYI, process 255 here is csh. In this case, getty isn't also blocked in trying to get pgrpsess: db> ps pid proc addr uid ppid pgrp flag stat wmesg wchan cmd 255 c9735300 c9755000 0 254 255 0006002 3 ttywai c1b2ea74 csh 254 c9734d00 c9771000 0 248 254 0004002 3 pgrpsess c03dc0fc script 248 c8cf2300 c972e000 0 247 248 2004002 3 pause c972e000 csh 247 c8cf2600 c9728000 0 156 156 0000000 3 select c03de548 sshd 188 c9735f00 c9749000 0 1 188 0004002 3 ttyin c18be410 getty 187 c9736200 c9744000 0 1 187 0004002 3 ttyin c1b06410 getty So basically, csh grabs the exclusive lock as it exits, then calls ttywait(). Then csh and other processes stack up on pgrpsess. Robert N M Watson FreeBSD Core Team, TrustedBSD Project robert@fledge.watson.org NAI Labs, Safeport Network Services On Fri, 8 Mar 2002, Robert Watson wrote: > Found it: > > db> show locks 255 > exclusive (sx) pgrpsess (0xc03dc0e0) locked @ > /cboss/p4/rwatson/trustedbsd/mac/sys/kern/kern_exit.c:234 > db> trace 255 > mi_switch(0,c1b2ea28,c9735300,c0232341,1) at mi_switch+0x17c > msleep(c1b2ea74,0,15a,c038743f,7530) at msleep+0x31a > ttysleep(c1b2ea28,c1b2ea74,15a,c038743f,7530) at ttysleep+0x21 > ttywait(c1b2ea28) at ttywait+0x45 > exit1(c9735400,0,c03f3260,0,c037df20) at exit1+0x43e > sys_exit(c9735400,c977fd20,1,0,0) at sys_exit+0x2a > syscall(2f,2f,2f,0,0) at syscall+0x21f > syscall_with_err_pushed() at syscall_with_err_pushed+0x1b > --- syscall (1, FreeBSD ELF, sys_exit), eip = 0x80a5c8f, esp = 0xbfbfda14, ebp = 0xbfbfda40 --- > > I didn't realize 'show locks' used curthread by default only. There > doesn't appear to be a 'show locks all'. In any case, it looks like the > exclusive lock is grabbed in kern_exit: > > PGRPSESS_XLOCK(); > if (SESS_LEADER(p)) { > ... > PGRP_UNLOCK(sp->s_ttyp->t_pgrp); > } > /* XXX tp should be locked. */ > (void) ttywait(tp); > /* > * The tty could have been revoked > * if we blocked. > > > Robert N M Watson FreeBSD Core Team, TrustedBSD Project > robert@fledge.watson.org NAI Labs, Safeport Network Services > > On Fri, 8 Mar 2002, Robert Watson wrote: > > > > > Some more details, now I have WITNESS compiled in... > > > > crash1# script ls > > Script started, output file is ls > > crash1# > > crash1# exitStopped at siointr1+0xf2: movl $0,brk_state1.702 > > db> trace > > siointr1(c185f400,c0446ec0,0,c03a0260,662) at siointr1+0xf2 > > siointr(c185f400,18,c86c0010,c0230010,0) at siointr+0x23 > > Xfastintr4() at Xfastintr4+0x32 > > --- interrupt, eip = 0xc0329e52, esp = 0xc86cad14, ebp = 0xc86cad14 --- > > cpu_idle(c86cad34,c0228798,0,c86cad48,0) at cpu_idle+0x1e > > idle_proc(0,c86cad48,0,c02288f4,0) at idle_proc+0x21 > > fork_exit(c02288f4,0,c86cad48) at fork_exit+0x98 > > fork_trampoline() at fork_trampoline+0x8 > > db> ps > > pid proc addr uid ppid pgrp flag stat wmesg wchan cmd > > 193 c9722c00 c975a000 0 192 193 0006002 3 ttywai c1b1c674 csh > > 192 c8cf2600 c9728000 0 189 192 0004002 3 pgrpsess c03dc0fc script > > 189 c8cf2300 c972e000 0 188 189 2004002 3 pause c972e000 csh > > 188 c9722f00 c9749000 0 1 188 0004102 3 wait c9722f00 login > > ... > > db> trace 192 > > mi_switch(c03dc0e0,c03dc0fc,78a,c0385a20,14f) at mi_switch+0x17c > > cv_wait(c03dc0fc,c0413564,c8cf2600,c18be400,0) at cv_wait+0x1fb > > _sx_slock(c03dc0e0,c03873e0,78a,0,c91b78a0) at _sx_slock+0x5c > > ttwrite(c18be400,c972bc90,6e0001,c0410360,c972bbf8) at ttwrite+0xa4 > > siowrite(c0410360,c972bc90,6e0001,c0410360,3) at siowrite+0x78 > > spec_write(c972bc24,c972bc38,c027d9a7,c972bc24,3) at spec_write+0x5d > > spec_vnoperate(c972bc24,3,c972bd20,0,1) at spec_vnoperate+0x15 > > vn_write(c1b1bd40,c972bc90,c1b1fb00,0,c8cf2700) at vn_write+0x1a3 > > dofilewrite(c8cf2700,c1b1bd40,1,bfbff790,3) at dofilewrite+0xb6 > > write(c8cf2700,c972bd20,0,4,11) at write+0x51 > > syscall(2f,2f,2f,11,4) at syscall+0x21f > > syscall_with_err_pushed() at syscall_with_err_pushed+0x1b > > --- syscall (4, FreeBSD ELF, write), eip = 0x280c3fd7, esp = 0xbfbff2cc, ebp = 0xbfbffc18 --- > > db> show locks > > exclusive (spin mutex) sio (0xc0446ec0) locked @ > > /cboss/p4/rwatson/trustedbsd/mac/sys/dev/sio/sio.c:1634 > > > > I suspect, unfortunately, that that is only my kernel debugger entry > > point. So, here's what happens if I do it using a pty for the script > > session, keeping the kernel debugger out of the picture: > > > > (pty0) > > crash1# script ls > > Script started, output file is ls > > crash1# exit > > (serial console) > > pid proc addr uid ppid pgrp flag stat wmesg wchan cmd > > 194 c973a900 c975f000 0 193 194 0006002 3 ttywai c1b28674 csh > > 193 c8cf2900 c971b000 0 190 193 0004002 3 pgrpsess c03dc0fc script > > 190 c8cf2600 c9728000 0 189 190 2004002 3 pause c9728000 csh > > 189 c8cf2300 c972e000 0 156 156 0000000 3 select c03de548 sshd > > 188 c973ac00 c9754000 0 1 188 0004002 3 pgrpsess c03dc0fc getty > > 187 c973af00 c974f000 0 1 187 0004002 3 ttyin c1b0b610 getty > > ... > > db> trace 193 # script > > mi_switch(c03dc0e0,c03dc0fc,78a,c0385a20,14f) at mi_switch+0x17c > > cv_wait(c03dc0fc,c0413564,c8cf2900,c1b0b228,0) at cv_wait+0x1fb > > _sx_slock(c03dc0e0,c03873e0,78a,c1b1c600,c96f3620) at _sx_slock+0x5c > > ttwrite(c1b0b228,c971ec90,620001,c971ebf8,c0212d79) at ttwrite+0xa4 > > ptswrite(c1b1c600,c971ec90,620001,c1b1c600,3) at ptswrite+0x28 > > spec_write(c971ec24,c971ec38,c027d9a7,c971ec24,3) at spec_write+0x5d > > spec_vnoperate(c971ec24,3,c971ed20,0,1) at spec_vnoperate+0x15 > > vn_write(c1af2100,c971ec90,c0b2db00,0,c8cf2a00) at vn_write+0x1a3 > > dofilewrite(c8cf2a00,c1af2100,1,bfbff73c,3) at dofilewrite+0xb6 > > write(c8cf2a00,c971ed20,0,4,11) at write+0x51 > > syscall(2f,2f,2f,11,4) at syscall+0x21f > > syscall_with_err_pushed() at syscall_with_err_pushed+0x1b > > --- syscall (4, FreeBSD ELF, write), eip = 0x280c3fd7, esp = 0xbfbff278, ebp = 0xbfbffbc4 --- > > db> trace 188 # getty > > mi_switch(c03dc0e0,c03dc0fc,78a,c0385a20,14f) at mi_switch+0x17c > > cv_wait(c03dc0fc,c0413564,c973ac00,c18be400,0) at cv_wait+0x1fb > > _sx_slock(c03dc0e0,c03873e0,78a,0,c91b78a0) at _sx_slock+0x5c > > ttwrite(c18be400,c9757c90,40001,c0410360,c9757bf8) at ttwrite+0xa4 > > siowrite(c0410360,c9757c90,40001,c0410360,1) at siowrite+0x78 > > spec_write(c9757c24,c9757c38,c027d9a7,c9757c24,1) at spec_write+0x5d > > spec_vnoperate(c9757c24,1,c9757d20,0,1) at spec_vnoperate+0x15 > > vn_write(c1b1bf80,c9757c90,c1b1cc00,0,c973ad00) at vn_write+0x1a3 > > dofilewrite(c973ad00,c1b1bf80,1,8050680,1) at dofilewrite+0xb6 > > write(c973ad00,c9757d20,bfbffde4,805072c,80507cc) at write+0x51 > > syscall(2f,2f,2f,80507cc,805072c) at syscall+0x21f > > syscall_with_err_pushed() at syscall_with_err_pushed+0x1b > > --- syscall (4, FreeBSD ELF, write), eip = 0x280c7fd7, esp = 0xbfbffcf0, ebp = 0xbfbffd0c --- > > > > So perhaps there's a deadlock. Or at least, it looks like both processes > > hang trying to get a shared lock for the session, and block waiting. > > Perhaps the lock is getting corrupted somehow, or has not properly been > > initialized? > > > > Unfortunately, it looks like 'show locks' only knows about the mutex in > > the sio code, presumably from the serial debugger: > > > > db> show locks > > exclusive (spin mutex) sio (0xc0446ec0) locked @ /cboss/p4/rwatson/trustedbsd/mac/sys/dev/sio/sio.c:1634 > > > > Does 'show locks' not show sxlocks? I was pretty sure it did... > > > > Robert N M Watson FreeBSD Core Team, TrustedBSD Project > > robert@fledge.watson.org NAI Labs, Safeport Network Services > > > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > > with "unsubscribe freebsd-current" in the body of the message > > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-current" in the body of the message > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-current" in the body of the message