Date: Wed, 11 Jul 2007 20:56:26 -0700
From: Mark Peek <mp@FreeBSD.org>
To: Doug White <dwhite@gumbysoft.com>
Cc: tcsh-bugs@mx.gw.com, current@freebsd.org
Subject: Re: tcsh backtick hang info
Message-ID: <4695A66A.8030903@FreeBSD.org>
In-Reply-To: <20070711191310.M90716@carver.gumbysoft.com>

On 7/11/07 7:28 PM, Doug White wrote:
> (note: freebsd-current@freebsd.org and tcsh-bugs@mx.gw.com are in the
> To: on this message. Restrict replies accordingly.)
>
> Hey folks,
>
> I spent several hours today pawing through the tcsh source in an effort
> to figure out what's going on with tcsh hangs with backticked commands in
> tcsh 6.15.00.
>
> The canonical example is something like:
>
> kill `ps ax | grep foo | awk '{print $1}'`
>
> where a builtin gets its arguments from a backticked expression composed
> of non-builtins.
>
> tcsh 6.15.00 introduced a new reference-counted signal management
> facility where, instead of manipulating the signal mask directly,
> functions increment a variable that is polled to see whether to perform
> the action associated with SIGINT, SIGCHLD, SIGALRM, or SIGHUP. The
> signal handler function itself sets a pending flag for each named signal
> and returns, so only a few instructions are executed in signal context.
> At some future point the pending flags are polled by a call to
> handle_pending_signals(), usually in a loop where the shell goes to
> sleep waiting for an external action to occur. When the function no
> longer needs the signal to be blocked it decrements the count via a
> stack of cleanup handlers. When a count reaches zero then a poll is
> immediately triggered.
>
> If the disabled count is >0 for a signal when a handle_pending_signals()
> poll occurs, then the signal is not "handled".
>
> In the case above, the disabled count for SIGCHLD is 1 when SIGCHLD
> fires from the completion of the backticked commands. The sigsuspend()
> in pjwait() is correctly woken up by the kernel but, because the
> disabled count is 1, the shell goes back into sigsuspend() and appears
> to hang.
>
> In this case it appears to be an improperly placed bump to the SIGCHLD
> disable count that is held over a call to pjwait(). I haven't yet
> determined the call stack (and gdb cannot debug tcsh at the moment) so I
> need to continue instrumenting the code to figure out what higher level
> function is disabling SIGCHLD and then calling something that eventually
> calls pjwait().
>
There appear to be two different issues. One is with the builtin kill and the
other is the gdb issue. I sent off a tentative patch to the reporter of the
builtin kill issue and am awaiting confirmation. The patch is here:
http://people.freebsd.org/~mp/tcsh_kill.patch
The gdb issue, much to my dismay, is still eluding my debugging skills, given
the interaction with gdb and the difficulty of actually debugging what is
happening.
Mark
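
For anyone following the thread, the deferred-signal scheme Doug describes
can be sketched roughly as below. This is a minimal illustration and not
tcsh's code: every identifier in it (sigchld_pending, sigchld_disabled,
sigchld_disable(), sigchld_enable(), handle_pending_sigchld(),
install_handler(), children_reaped) is hypothetical, only SIGCHLD is
modeled, and the cleanup-handler stack is reduced to a plain counter.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* All names here are hypothetical; tcsh's real identifiers differ. */
static volatile sig_atomic_t sigchld_pending = 0; /* set in signal context */
static int sigchld_disabled = 0;  /* reference count of callers deferring SIGCHLD */
static int children_reaped = 0;   /* stand-in for the real SIGCHLD work */

/* Signal handler: only records that the signal arrived, then returns. */
static void on_sigchld(int signo)
{
    (void)signo;
    sigchld_pending = 1;
}

/* Poll: act on a pending SIGCHLD unless some caller has it disabled.
 * Returns 1 if the signal was handled, 0 otherwise. */
static int handle_pending_sigchld(void)
{
    if (sigchld_disabled > 0 || !sigchld_pending)
        return 0;
    sigchld_pending = 0;
    children_reaped++;
    return 1;
}

/* Defer SIGCHLD processing by bumping the count (no signal mask change). */
static void sigchld_disable(void)
{
    sigchld_disabled++;
}

/* Drop one reference; when the count reaches zero, poll immediately. */
static void sigchld_enable(void)
{
    assert(sigchld_disabled > 0); /* catch unbalanced enable calls */
    if (--sigchld_disabled == 0)
        handle_pending_sigchld();
}

/* Install the flag-setting handler for SIGCHLD. Returns 0 on success. */
static int install_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGCHLD, &sa, NULL);
}
```

In this sketch, if a caller does sigchld_disable() and then enters a wait
loop that sleeps in sigsuspend() and polls with handle_pending_sigchld()
on wakeup, the poll does nothing while the count is still 1. The loop
sees no state change and goes back to sleep, which matches the hang
described above. The pending signal is only acted on once the matching
sigchld_enable() brings the count back to zero.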
