Date: Wed, 11 Jul 2007 19:28:55 -0700 (PDT)
From: Doug White <dwhite@gumbysoft.com>
To: current@freebsd.org, tcsh-bugs@mx.gw.com
Subject: tcsh backtick hang info
Message-ID: <20070711191310.M90716@carver.gumbysoft.com>

(note: freebsd-current@freebsd.org and tcsh-bugs@mx.gw.com are in the To: on this message. Restrict replies accordingly.)

Hey folks,

I spent several hours today pawing through the tcsh source in an effort to figure out what's going on with the hangs on backticked commands in tcsh 6.15.00. The canonical example is something like:

    kill `ps ax | grep foo | awk '{print $1}'`

where a builtin gets its arguments from a backticked expression composed of non-builtins.

tcsh 6.15.00 introduced a new reference-counted signal management facility where, instead of manipulating the signal mask directly, functions increment a per-signal "disabled" count that is checked to decide whether to perform the action associated with SIGINT, SIGCHLD, SIGALRM, or SIGHUP. The signal handler function itself only sets a pending flag for the signal and returns, so only a few instructions are executed in signal context. At some future point the pending flags are polled by a call to handle_pending_signals(), usually in a loop where the shell goes to sleep waiting for an external action to occur.

When a function no longer needs the signal to be blocked, it decrements the count via a stack of cleanup handlers. When a count reaches zero, a poll is triggered immediately. If the disabled count for a signal is greater than zero when a handle_pending_signals() poll occurs, then the signal is not "handled".

In the case above, the disabled count for SIGCHLD is 1 when SIGCHLD fires from the completion of the backticked commands. The sigsuspend() in pjwait() is correctly woken up by the kernel but, because the disabled count is 1, the pending SIGCHLD is never processed, so the shell goes back into sigsuspend() and appears to hang. The cause appears to be an improperly placed bump to the SIGCHLD disable count that is held over a call to pjwait().

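To make the failure mode concrete, here is a minimal sketch of the counting scheme described above. It is a simulation only: no real signals are delivered, every name in it (child_disable(), poll_pending(), wait_for_child(), and so on) is made up for illustration and is not one of tcsh's actual identifiers, and the bounded loop stands in for the sigsuspend() loop in pjwait() so that the "hang" shows up as a -1 return instead of blocking forever.

```c
#include <signal.h>

/* Hypothetical model of the reference-counted scheme; not tcsh source. */
volatile sig_atomic_t child_pending;  /* set in (simulated) signal context */
int child_disabled;                   /* per-signal "disabled" count */
int children_reaped;                  /* how many SIGCHLDs were handled */

/* What the signal handler does: flag the signal as pending and return. */
void deliver_sigchld(void)
{
    child_pending = 1;
}

/* Stand-in for handle_pending_signals(): act only if nothing holds the
 * signal disabled. Returns 1 if a pending SIGCHLD was handled. */
int poll_pending(void)
{
    if (child_disabled == 0 && child_pending) {
        child_pending = 0;
        children_reaped++;
        return 1;
    }
    return 0;
}

/* Bump the disabled count instead of touching the signal mask. */
void child_disable(void)
{
    child_disabled++;
}

/* Drop the count; reaching zero triggers an immediate poll. */
void child_enable(void)
{
    if (--child_disabled == 0)
        poll_pending();
}

/* Stand-in for the wait loop in pjwait(): each iteration models one
 * wakeup from sigsuspend(). Returns the number of wakeups needed, or
 * -1 if the signal was never handled (the real shell would just keep
 * sleeping). */
int wait_for_child(int max_wakeups)
{
    int i;

    for (i = 1; i <= max_wakeups; i++) {
        if (poll_pending())
            return i;
    }
    return -1;
}
```

With this model, calling child_disable(), then deliver_sigchld(), then wait_for_child() never makes progress no matter how many wakeups are allowed, which is the same shape as the hang: the waiter is woken, the poll declines to handle the signal because the count is nonzero, and the waiter goes back to sleep. A later child_enable() handles the pending signal at once.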
I haven't yet determined the call stack (and gdb cannot debug tcsh at the moment) so I need to continue instrumenting the code to figure out what higher level function is disabling SIGCHLD and then calling something that eventually calls pjwait().

--
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org