Date: Tue, 31 Oct 2000 08:09:30 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: jandrese@mitre.org (Andresen, Jason R.)
Cc: jgreco@ns.sol.net (Joe Greco), gjb@gbch.net (Greg Black), hackers@FreeBSD.ORG, ryan@sasknow.com, andrew@ugh.net.au
Subject: Re: Logging users out
Message-ID: <200010310809.BAA27414@usr02.primenet.com>
In-Reply-To: <39FDE4D7.1020C4B2@mitre.org> from "Andresen,Jason R." at Oct 30, 2000 04:15:03 PM

> > Uh, well, "foolproof" != "calling ps and awk and grep and looking for
> > processes".  For ANY definition of foolproof.
> >
> > And it is certainly foolproof from the point of view that there's no way
> > in hell for the session not to be terminated, unlike some ps garbage I've
> > seen.
>
> Unfortunately, sometimes when processes suddenly lose stdin/stdout,
> they jump into infinite loops and start eating cpu cycles like
> crazy.  I'd hate to see what happens when you kill off a
> significant number of people running these poorly behaved programs.
> FVWM95 Taskbars used to be notorious for this, I remember seeing
> upwards of a dozen of them vying for CPU time on some lab
> machines.

This is because the FreeBSD tty revocation code is broken, though
in technical compliance with POSIX.

The way it's supposed to happen is that "hupcl" (hangup on close)
is set on the tty, and the signal is sent to the process group
leader, so it can be trampolined.  Being a group signal, not a
process signal, it will be delivered to all children of the leader
as well -- just as group signal delivery has always been supposed
to work.

Only by the time it gets there, there aren't any children
technically in the group any more.

What happens in the revoke code is that effectively, everyone is
made into a process group leader, so the SIGHUP to the process
group leader is not properly propagated to the other processes in
the group.  The correct order of operation is to revoke, promote,
then signal, which would result in the SIGHUP being delivered to
all processes which have not explicitly blocked it.  FreeBSD does
revoke, signal, then promote, which means the newly promoted
processes aren't in the signalled process group by the time the
SIGHUP is delivered.

Traditional BSD (and UNIX) behaviour actually iteratively did the
revocation of the controlling tty on a per process basis, after
signal delivery, but the global "revoke" changed things.
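
The group-membership point can be tried from user space.  What follows
is only a rough sketch under stated assumptions, not the kernel revoke
path: it forks a few children into one process group, optionally moves
the non-leaders into groups of their own, and then sends SIGHUP to the
original group with kill(-pgid, SIGHUP).  The names count_sighup_deaths,
child_main, detach_first and MAX_CHILDREN are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16   /* hypothetical limit for this sketch */

/* Child body: restore default SIGHUP handling, report ready, then block
 * until SIGHUP kills us or the parent closes the release pipe. */
static void child_main(int ready_wr, int release_rd)
{
    sigset_t set;
    char c;

    signal(SIGHUP, SIG_DFL);
    sigemptyset(&set);
    sigaddset(&set, SIGHUP);
    sigprocmask(SIG_UNBLOCK, &set, NULL);

    if (write(ready_wr, "r", 1) != 1)
        _exit(1);
    while (read(release_rd, &c, 1) > 0)
        ;
    _exit(0);   /* reached only if no SIGHUP was delivered */
}

/* Fork n children into one process group and signal that group.
 * If detach_first is nonzero, every non-leader is first made the
 * leader of its own group.  Returns how many children were killed
 * by SIGHUP, or -1 on setup failure. */
int count_sighup_deaths(int n, int detach_first)
{
    pid_t pids[MAX_CHILDREN];
    pid_t pgid = 0;
    int ready[2], release[2];
    int started, hits = 0;
    char c;

    assert(n > 0 && n <= MAX_CHILDREN);
    if (pipe(ready) == -1)
        return -1;
    if (pipe(release) == -1) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }

    for (started = 0; started < n; started++) {
        pid_t pid = fork();
        if (pid == -1)
            break;
        if (pid == 0) {
            close(ready[0]);
            close(release[1]);
            child_main(ready[1], release[0]);
        }
        if (pgid == 0)
            pgid = pid;          /* first child leads the group */
        setpgid(pid, pgid);
        pids[started] = pid;
    }
    close(ready[1]);
    close(release[0]);

    /* Wait until every child has reset its SIGHUP disposition. */
    for (int i = 0; i < started; i++)
        if (read(ready[0], &c, 1) != 1)
            break;
    close(ready[0]);

    if (detach_first)
        for (int i = 1; i < started; i++)
            setpgid(pids[i], pids[i]);   /* now its own group leader */

    if (pgid != 0)
        kill(-pgid, SIGHUP);     /* group signal, not a process signal */
    close(release[1]);           /* unsignalled children now see EOF */

    for (int i = 0; i < started; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFSIGNALED(status) && WTERMSIG(status) == SIGHUP)
            hits++;
    }
    return started == n ? hits : -1;
}
```

Calling count_sighup_deaths(3, 0) should count all three children as
killed by SIGHUP, while count_sighup_deaths(3, 1) should count only the
original leader, since the other two had already left the group when
the signal was generated.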

This change went in during the POSIX-me-harder tournament, early
in FreeBSD's infancy.

Before it was POSIX-ized, SIGHUP was correctly delivered on hangup
to _all_ processes for which the tty was the controlling tty.
After it went in, we started having runaway processes, which were
then labelled as being "broken" for not noticing that read was
returning 0 (which is returned on EOF, but is also returned on
perfectly good non-blocking fds, and in the case that vmin is set
to zero to effect a timed poll via vtime).

Yeah, I basically replay this broken record any time someone tries
to blame the application for not getting out the Ouija board and
trying to contact the dear departed tty, since there are actually
people who really do use non-blocking fds and vmin/vtime to do
things like user space threads and background computation while
waiting for user input.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message