Date: Fri, 8 Mar 2002 20:26:41 -0500 (EST) From: Robert Watson <rwatson@freebsd.org> To: David Wolfskill <david@catwhisker.org> Cc: current@freebsd.org Subject: Re: Keyboard (quasi-)lockup running script Message-ID: <Pine.NEB.3.96L.1020308202518.91006J-100000@fledge.watson.org> In-Reply-To: <Pine.NEB.3.96L.1020308200645.91006I-100000@fledge.watson.org>
next in thread | previous in thread | raw e-mail | index | archive | help
FYI, process 255 here is csh. In this case, getty isn't also blocked in
trying to get pgrpsess:
db> ps
pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
255 c9735300 c9755000 0 254 255 0006002 3 ttywai c1b2ea74 csh
254 c9734d00 c9771000 0 248 254 0004002 3 pgrpsess c03dc0fc script
248 c8cf2300 c972e000 0 247 248 2004002 3 pause c972e000 csh
247 c8cf2600 c9728000 0 156 156 0000000 3 select c03de548 sshd
188 c9735f00 c9749000 0 1 188 0004002 3 ttyin c18be410 getty
187 c9736200 c9744000 0 1 187 0004002 3 ttyin c1b06410 getty
So basically, csh grabs the exclusive lock as it exits, then calls
ttywait(). Then csh and other processes stack up on pgrpsess.
Robert N M Watson FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org NAI Labs, Safeport Network Services
On Fri, 8 Mar 2002, Robert Watson wrote:
> Found it:
>
> db> show locks 255
> exclusive (sx) pgrpsess (0xc03dc0e0) locked @
> /cboss/p4/rwatson/trustedbsd/mac/sys/kern/kern_exit.c:234
> db> trace 255
> mi_switch(0,c1b2ea28,c9735300,c0232341,1) at mi_switch+0x17c
> msleep(c1b2ea74,0,15a,c038743f,7530) at msleep+0x31a
> ttysleep(c1b2ea28,c1b2ea74,15a,c038743f,7530) at ttysleep+0x21
> ttywait(c1b2ea28) at ttywait+0x45
> exit1(c9735400,0,c03f3260,0,c037df20) at exit1+0x43e
> sys_exit(c9735400,c977fd20,1,0,0) at sys_exit+0x2a
> syscall(2f,2f,2f,0,0) at syscall+0x21f
> syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
> --- syscall (1, FreeBSD ELF, sys_exit), eip = 0x80a5c8f, esp = 0xbfbfda14, ebp = 0xbfbfda40 ---
>
> I didn't realize 'show locks' used curthread by default only. There
> doesn't appear to be a 'show locks all'. In any case, it looks like the
> exclusive lock is grabbed in kern_exit:
>
> PGRPSESS_XLOCK();
> if (SESS_LEADER(p)) {
> ...
> PGRP_UNLOCK(sp->s_ttyp->t_pgrp);
> }
> /* XXX tp should be locked. */
> (void) ttywait(tp);
> /*
> * The tty could have been revoked
> * if we blocked.
>
>
> Robert N M Watson FreeBSD Core Team, TrustedBSD Project
> robert@fledge.watson.org NAI Labs, Safeport Network Services
>
> On Fri, 8 Mar 2002, Robert Watson wrote:
>
> >
> > Some more details, now I have WITNESS compiled in...
> >
> > crash1# script ls
> > Script started, output file is ls
> > crash1#
> > crash1# exitStopped at siointr1+0xf2: movl $0,brk_state1.702
> > db> trace
> > siointr1(c185f400,c0446ec0,0,c03a0260,662) at siointr1+0xf2
> > siointr(c185f400,18,c86c0010,c0230010,0) at siointr+0x23
> > Xfastintr4() at Xfastintr4+0x32
> > --- interrupt, eip = 0xc0329e52, esp = 0xc86cad14, ebp = 0xc86cad14 ---
> > cpu_idle(c86cad34,c0228798,0,c86cad48,0) at cpu_idle+0x1e
> > idle_proc(0,c86cad48,0,c02288f4,0) at idle_proc+0x21
> > fork_exit(c02288f4,0,c86cad48) at fork_exit+0x98
> > fork_trampoline() at fork_trampoline+0x8
> > db> ps
> > pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
> > 193 c9722c00 c975a000 0 192 193 0006002 3 ttywai c1b1c674 csh
> > 192 c8cf2600 c9728000 0 189 192 0004002 3 pgrpsess c03dc0fc script
> > 189 c8cf2300 c972e000 0 188 189 2004002 3 pause c972e000 csh
> > 188 c9722f00 c9749000 0 1 188 0004102 3 wait c9722f00 login
> > ...
> > db> trace 192
> > mi_switch(c03dc0e0,c03dc0fc,78a,c0385a20,14f) at mi_switch+0x17c
> > cv_wait(c03dc0fc,c0413564,c8cf2600,c18be400,0) at cv_wait+0x1fb
> > _sx_slock(c03dc0e0,c03873e0,78a,0,c91b78a0) at _sx_slock+0x5c
> > ttwrite(c18be400,c972bc90,6e0001,c0410360,c972bbf8) at ttwrite+0xa4
> > siowrite(c0410360,c972bc90,6e0001,c0410360,3) at siowrite+0x78
> > spec_write(c972bc24,c972bc38,c027d9a7,c972bc24,3) at spec_write+0x5d
> > spec_vnoperate(c972bc24,3,c972bd20,0,1) at spec_vnoperate+0x15
> > vn_write(c1b1bd40,c972bc90,c1b1fb00,0,c8cf2700) at vn_write+0x1a3
> > dofilewrite(c8cf2700,c1b1bd40,1,bfbff790,3) at dofilewrite+0xb6
> > write(c8cf2700,c972bd20,0,4,11) at write+0x51
> > syscall(2f,2f,2f,11,4) at syscall+0x21f
> > syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
> > --- syscall (4, FreeBSD ELF, write), eip = 0x280c3fd7, esp = 0xbfbff2cc, ebp = 0xbfbffc18 ---
> > db> show locks
> > exclusive (spin mutex) sio (0xc0446ec0) locked @
> > /cboss/p4/rwatson/trustedbsd/mac/sys/dev/sio/sio.c:1634
> >
> > I suspect, unfortunately, that that is only my kernel debugger entry
> > point. So, here's what happens if I do it using a pty for the script
> > session, keeping the kernel debugger out of the picture:
> >
> > (pty0)
> > crash1# script ls
> > Script started, output file is ls
> > crash1# exit
> > (serial console)
> > pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
> > 194 c973a900 c975f000 0 193 194 0006002 3 ttywai c1b28674 csh
> > 193 c8cf2900 c971b000 0 190 193 0004002 3 pgrpsess c03dc0fc script
> > 190 c8cf2600 c9728000 0 189 190 2004002 3 pause c9728000 csh
> > 189 c8cf2300 c972e000 0 156 156 0000000 3 select c03de548 sshd
> > 188 c973ac00 c9754000 0 1 188 0004002 3 pgrpsess c03dc0fc getty
> > 187 c973af00 c974f000 0 1 187 0004002 3 ttyin c1b0b610 getty
> > ...
> > db> trace 193 # script
> > mi_switch(c03dc0e0,c03dc0fc,78a,c0385a20,14f) at mi_switch+0x17c
> > cv_wait(c03dc0fc,c0413564,c8cf2900,c1b0b228,0) at cv_wait+0x1fb
> > _sx_slock(c03dc0e0,c03873e0,78a,c1b1c600,c96f3620) at _sx_slock+0x5c
> > ttwrite(c1b0b228,c971ec90,620001,c971ebf8,c0212d79) at ttwrite+0xa4
> > ptswrite(c1b1c600,c971ec90,620001,c1b1c600,3) at ptswrite+0x28
> > spec_write(c971ec24,c971ec38,c027d9a7,c971ec24,3) at spec_write+0x5d
> > spec_vnoperate(c971ec24,3,c971ed20,0,1) at spec_vnoperate+0x15
> > vn_write(c1af2100,c971ec90,c0b2db00,0,c8cf2a00) at vn_write+0x1a3
> > dofilewrite(c8cf2a00,c1af2100,1,bfbff73c,3) at dofilewrite+0xb6
> > write(c8cf2a00,c971ed20,0,4,11) at write+0x51
> > syscall(2f,2f,2f,11,4) at syscall+0x21f
> > syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
> > --- syscall (4, FreeBSD ELF, write), eip = 0x280c3fd7, esp = 0xbfbff278, ebp = 0xbfbffbc4 ---
> > db> trace 188 # getty
> > mi_switch(c03dc0e0,c03dc0fc,78a,c0385a20,14f) at mi_switch+0x17c
> > cv_wait(c03dc0fc,c0413564,c973ac00,c18be400,0) at cv_wait+0x1fb
> > _sx_slock(c03dc0e0,c03873e0,78a,0,c91b78a0) at _sx_slock+0x5c
> > ttwrite(c18be400,c9757c90,40001,c0410360,c9757bf8) at ttwrite+0xa4
> > siowrite(c0410360,c9757c90,40001,c0410360,1) at siowrite+0x78
> > spec_write(c9757c24,c9757c38,c027d9a7,c9757c24,1) at spec_write+0x5d
> > spec_vnoperate(c9757c24,1,c9757d20,0,1) at spec_vnoperate+0x15
> > vn_write(c1b1bf80,c9757c90,c1b1cc00,0,c973ad00) at vn_write+0x1a3
> > dofilewrite(c973ad00,c1b1bf80,1,8050680,1) at dofilewrite+0xb6
> > write(c973ad00,c9757d20,bfbffde4,805072c,80507cc) at write+0x51
> > syscall(2f,2f,2f,80507cc,805072c) at syscall+0x21f
> > syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
> > --- syscall (4, FreeBSD ELF, write), eip = 0x280c7fd7, esp = 0xbfbffcf0, ebp = 0xbfbffd0c ---
> >
> > So perhaps there's a deadlock. Or at least, it looks like both processes
> > hang trying to get a shared lock for the session, and block waiting.
> > Perhaps the lock is getting corrupted somehow, or has not properly been
> > initialized?
> >
> > Unfortunately, it looks like 'show locks' only knows about the mutex in
> > the sio code, presumably from the serial debugger:
> >
> > db> show locks
> > exclusive (spin mutex) sio (0xc0446ec0) locked @ /cboss/p4/rwatson/trustedbsd/mac/sys/dev/sio/sio.c:1634
> >
> > Does 'show locks' not show sxlocks? I was pretty sure it did...
> >
> > Robert N M Watson FreeBSD Core Team, TrustedBSD Project
> > robert@fledge.watson.org NAI Labs, Safeport Network Services
> >
> >
> > To Unsubscribe: send mail to majordomo@FreeBSD.org
> > with "unsubscribe freebsd-current" in the body of the message
> >
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-current" in the body of the message
>
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.NEB.3.96L.1020308202518.91006J-100000>
