Date:      Wed, 11 Jul 2007 20:56:26 -0700
From:      Mark Peek <mp@FreeBSD.org>
To:        Doug White <dwhite@gumbysoft.com>
Cc:        tcsh-bugs@mx.gw.com, current@freebsd.org
Subject:   Re: tcsh backtick hang info
Message-ID:  <4695A66A.8030903@FreeBSD.org>
In-Reply-To: <20070711191310.M90716@carver.gumbysoft.com>

On 7/11/07 7:28 PM, Doug White wrote:
> (note: freebsd-current@freebsd.org and tcsh-bugs@mx.gw.com are in the 
> To: on this message. Restrict replies accordingly.)
> 
> Hey folks,
> 
> I spent several hours today pawing through the tcsh source in an effort 
> to figure out what's going on with tcsh hangs with backticked commands in 
> tcsh 6.15.00.
> 
> The canonical example is something like:
> 
> kill `ps ax | grep foo | awk '{print $1}'`
> 
> where a builtin gets its arguments from a backticked expression composed 
> of non-builtins.
> 
> tcsh 6.15.00 introduced a new reference-counted signal management 
> facility where, instead of manipulating the signal mask directly, 
> functions increment a variable that is polled to see whether to perform 
> the action associated with SIGINT, SIGCHLD, SIGALRM, or SIGHUP.  The 
> signal handler function itself sets a pending flag for each named signal 
> and returns, so only a few instructions are executed in signal context.  
> At some future point the pending flags are polled by a call to 
> handle_pending_signals(), usually in a loop where the shell goes to 
> sleep waiting for an external action to occur. When the function no 
> longer needs the signal to be blocked it decrements the count via a 
> stack of cleanup handlers. When a count reaches zero then a poll is 
> immediately triggered.
> 
> If the disabled count is >0 for a signal when a handle_pending_signals() 
> poll occurs, then the signal is not "handled".
> 
> In the case above, the disabled count for SIGCHLD is 1 when SIGCHLD 
> fires from the completion of the backticked commands. The sigsuspend() 
> in pjwait() is correctly woken up by the kernel but, because the 
> disabled count is 1, the shell goes back into sigsuspend() and appears 
> to hang.
> 
> In this case it appears to be an improperly placed bump to the SIGCHLD 
> disable count that is held over a call to pjwait(). I haven't yet 
> determined the call stack (and gdb cannot debug tcsh at the moment) so I 
> need to continue instrumenting the code to figure out what higher level 
> function is disabling SIGCHLD and then calling something that eventually 
> calls pjwait().
> 
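
For anyone following along, the deferred-signal scheme described in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, not the tcsh source: the names on_sigchld, disable_sigchld, release_sigchld, poll_pending and handled_count are hypothetical, only SIGCHLD is modelled, and the stack of cleanup handlers is reduced to a plain release call.

```c
#include <signal.h>

/* Hypothetical names for illustration; these are not the tcsh identifiers. */
static volatile sig_atomic_t pending_sigchld = 0; /* set in signal context */
static int disabled_sigchld = 0;  /* >0 means handling is deferred */
static int handled_count = 0;     /* times the deferred action has run */

/* Signal handler: only record that the signal arrived, then return. */
static void on_sigchld(int signo)
{
    (void)signo;
    pending_sigchld = 1;
}

/* Poll: run the deferred action only if the signal is pending and not
 * disabled. Returns 1 if the signal was handled, 0 otherwise. */
static int poll_pending(void)
{
    if (disabled_sigchld > 0 || !pending_sigchld)
        return 0;
    pending_sigchld = 0;
    handled_count++;              /* stand-in for the real SIGCHLD work */
    return 1;
}

/* Bump the disable count instead of touching the signal mask. */
static void disable_sigchld(void)
{
    disabled_sigchld++;
}

/* Drop the disable count; reaching zero triggers an immediate poll. */
static void release_sigchld(void)
{
    if (disabled_sigchld > 0 && --disabled_sigchld == 0)
        poll_pending();
}
```

With these definitions, a wait loop that polls while the count is still held never sees the pending signal handled: calling disable_sigchld(), then simulating delivery with on_sigchld(SIGCHLD), leaves poll_pending() returning 0 until release_sigchld() runs. That matches the apparent hang described above.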

There appear to be two different issues. One is with the builtin kill and the 
other is the gdb issue. I sent off a tentative patch to the reporter of the 
builtin kill issue and am awaiting confirmation. The patch is here:

http://people.freebsd.org/~mp/tcsh_kill.patch

The gdb issue, much to my dismay, is still eluding my debugging skills given 
the interaction with gdb and issues with actually debugging what is happening.

Mark

