Date: Tue, 13 Aug 2002 04:58:17 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: David Xu <bsddiy@yahoo.com>
Cc: "Andrey A. Chernov" <ache@nagual.pp.ru>, Julian Elischer <julian@elischer.org>, FreeBSD CURRENT <freebsd-current@FreeBSD.ORG>
Subject: Re: cvs commit: src/sys/kern kern_sig.c (fwd)
Message-ID: <20020813031722.T25520-100000@gamplex.bde.org>
In-Reply-To: <20020812055315.7682.qmail@web20907.mail.yahoo.com>
On Sun, 11 Aug 2002, David Xu wrote:
> following is patch for su, I can type "suspend" and stop $$ without the
> problem you described, I have tested it under tcsh and bash, all works
> for me.
>
> --- su.c Mon Aug 12 13:08:01 2002
> +++ su.c.new Mon Aug 12 13:16:14 2002
> @@ -329,10 +329,13 @@
>      default:
>          while ((ret_pid = waitpid(child_pid, &statusp, WUNTRACED)) != -1) {
>              if (WIFSTOPPED(statusp)) {
> -                child_pgrp = tcgetpgrp(1);
>                  kill(getpid(), SIGSTOP);
> -                tcsetpgrp(1, child_pgrp);
> -                kill(child_pid, SIGCONT);
> +                child_pgrp = getpgid(child_pid);
> +                if (tcgetpgrp(1) == getpgrp())
> +                {
> +                    tcsetpgrp(1, child_pgrp);
> +                    kill(child_pid, SIGCONT);
> +                }
>                  statusp = 1;
>                  continue;
>              }
Explanation of this patch:
(1) su has shot itself in the foot using PAM. Normally the parent shell
waits for children and handles them when they stop. The extra process
for PAM is now in between the parent shell and the su shell, so the
parent shell can't do this directly. The above code attempts to
relay some aspects of job control back to the parent shell. It is
not clear that it can do this properly without duplicating lots of
shell-specific job control, but I think it can do this in principle.
There are related problems for propagation of SIGHUP to indirect
descendants of login shells when the shell exits. Here at least
there is an intermediate process that can relay the signals
if necessary. I think propagation of SIGHUP is automatic if the
intermediate process doesn't exit first and it doesn't change its
job control stuff too much, so the SIGHUP problem doesn't affect
PAMmed applications.
(2) To relay SIGSTOP, the intermediate su just needs to stop itself. To
relay SIGCONT, the intermediate su needs to switch to enough of
its child's job control environment before restarting the child.
Switching only fd 1's process group seems to be sufficient, but
it is not easy to determine even that and the broken version got
it wrong.
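
As an illustration of the stop/continue relay in (2), here is a minimal
sketch, not the su code itself. The function name observe_stop_and_resume
is made up for this example, and the child stops itself with
raise(SIGSTOP) to stand in for a shell running "kill -STOP $$". To stay
runnable without a controlling terminal it leaves out the two steps su
also needs: stopping itself, and switching fd 1's process group before
continuing the child.

```c
#define _XOPEN_SOURCE 700	/* ask for the POSIX declarations used below */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that stops itself, observe the stop
 * from the parent, then continue the child and collect its exit status.
 * Returns 1 if the stop was seen and the child then exited with status 0,
 * 0 if not, and -1 if fork() or waitpid() failed.
 */
int
observe_stop_and_resume(void)
{
	pid_t child_pid, ret_pid;
	int status, saw_stop = 0;

	child_pid = fork();
	if (child_pid == -1)
		return (-1);
	if (child_pid == 0) {
		/* Child: stand-in for a shell that stops itself. */
		raise(SIGSTOP);
		_exit(0);
	}

	/* WUNTRACED makes waitpid() report stopped children as well. */
	while ((ret_pid = waitpid(child_pid, &status, WUNTRACED)) != -1) {
		if (WIFSTOPPED(status)) {
			saw_stop = 1;
			/*
			 * An intermediate su would stop itself here and,
			 * once continued, fix up the terminal's pgrp.
			 * This sketch only continues the child.
			 */
			kill(child_pid, SIGCONT);
			continue;
		}
		/* The child exited or was killed by a signal. */
		return (saw_stop && WIFEXITED(status) &&
		    WEXITSTATUS(status) == 0);
	}
	return (-1);
}
```

Calling observe_stop_and_resume() should return 1 when the stop is
reported and the child then exits normally after SIGCONT.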
The child's environment is very shell-dependent. Some of the following
may depend on the initial shell being bash:
(a) sh, csh and bash start a new process group (equal to their pid).
zsh stays in the process group of the intermediate su process.
(b) "kill -STOP $$ ... fg" worked in most (all?) cases because
fd 1's pgrp is still the child's pgid when the child is killed
in that way. For zsh the child's pgid is the same as the
intermediate shell's so the pgrps can't be different, and for
the other shells I think the pgrp hasn't been changed because
the child can't control it (SIGSTOP is uncatchable) and the
kernel doesn't. Later, switching fd 1's pgrp back to the
child's pgid works, except possibly for zsh, because the saved
value is correct and differs from the current one.
(c) "suspend ... fg" failed for several reasons. First, something
(presumably the child) sets fd 1's pgrp to the intermediate
su's pgid, so tcgetpgrp(1) gives a wrong pgrp for restoring
later. The patch fixes this by not getting the pgrp in this
way. It uses getpgid(child_pid) instead. I think this works
for at least normal shells. Second, when the pgrp is restored,
something (presumably the shell above the intermediate su, or
the kernel) has already switched fd 1's pgrp to the child's pgid
instead of to the intermediate su's pgid (despite the intermediate
su's being correct at SIGSTOP time for suspend but not for
kill -STOP). Setting fd 1's pgrp to the value that it already
has is then fatal for reasons that I don't completely understand
yet. The patch avoids the problem by not doing apparently-null
tcsetpgrp()'s. Sending the SIGCONT seems to have no effect in
this case, so I think the shell above the su's has already started
both the child su and the intermediate one and this isn't a
problem until the su's get in each other's way. Putting printfs
in the above code seems to make the problem easier to debug by
ensuring that they get in each other's way :-).
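
Points (a) and (c) can be sketched in C as well. This is an illustration
under stated assumptions, not the committed fix: the names
child_leads_own_group, terminal_is_ours and resume_child are
hypothetical, and resume_child takes the terminal fd as a parameter
where the patch hard-codes fd 1. The first function shows a child ending
up in a process group equal to its pid; the other two show the patched
order of operations, taking the child's pgrp from getpgid() and skipping
tcsetpgrp() and SIGCONT unless the terminal currently belongs to the
caller's own process group.

```c
#define _XOPEN_SOURCE 700	/* ask for the POSIX declarations used below */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Point (a): put a forked child in a new process group whose id equals
 * its pid.  Returns 1 if getpgid(child) == child, 0 if not, -1 on error.
 */
int
child_leads_own_group(void)
{
	pid_t child_pid, child_pgrp;
	int status;

	child_pid = fork();
	if (child_pid == -1)
		return (-1);
	if (child_pid == 0) {
		setpgid(0, 0);
		pause();	/* Sleep until the parent kills us. */
		_exit(0);
	}
	/* Set it from the parent too, so the result doesn't depend on timing. */
	setpgid(child_pid, child_pid);
	child_pgrp = getpgid(child_pid);
	kill(child_pid, SIGKILL);
	waitpid(child_pid, &status, 0);
	if (child_pgrp == -1)
		return (-1);
	return (child_pgrp == child_pid);
}

/* Point (c): only touch the terminal if it is in our own process group. */
int
terminal_is_ours(pid_t tty_pgrp, pid_t own_pgrp)
{
	return (tty_pgrp == own_pgrp);
}

/*
 * Returns 1 if the terminal was handed to the child and SIGCONT was sent,
 * 0 if the terminal was not ours and nothing was done, -1 on error.
 */
int
resume_child(int fd, pid_t child_pid)
{
	/* Ask the kernel for the child's pgrp instead of trusting the tty. */
	pid_t child_pgrp = getpgid(child_pid);

	if (child_pgrp == -1)
		return (-1);
	if (!terminal_is_ours(tcgetpgrp(fd), getpgrp()))
		return (0);	/* Avoid an apparently-null tcsetpgrp(). */
	if (tcsetpgrp(fd, child_pgrp) == -1)
		return (-1);
	if (kill(child_pid, SIGCONT) == -1)
		return (-1);
	return (1);
}
```

With an fd that is not a terminal, tcgetpgrp() fails and returns -1,
which never equals getpgrp(), so resume_child() returns 0 without
sending anything.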
Bruce
