From owner-freebsd-bugs Thu Apr 1 15:20:20 1999
Delivered-To: freebsd-bugs@freebsd.org
Received: from freefall.freebsd.org (freefall.FreeBSD.ORG [204.216.27.21])
	by hub.freebsd.org (Postfix) with ESMTP id 48E9714D58
	for ; Thu, 1 Apr 1999 15:20:18 -0800 (PST)
	(envelope-from gnats@FreeBSD.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.9.2/8.9.2) id PAA11548;
	Thu, 1 Apr 1999 15:20:02 -0800 (PST)
	(envelope-from gnats@FreeBSD.org)
Received: from finch-post-11.mail.demon.net (finch-post-11.mail.demon.net [194.217.242.39])
	by hub.freebsd.org (Postfix) with ESMTP id 639DB155F7
	for ; Thu, 1 Apr 1999 15:12:28 -0800 (PST)
	(envelope-from dmlb@ragnet.demon.co.uk)
Received: from [158.152.46.40] (helo=ragnet.demon.co.uk)
	by finch-post-11.mail.demon.net with smtp (Exim 2.12 #1)
	id 10Sqd6-000Eus-0B
	for FreeBSD-gnats-submit@freebsd.org; Thu, 1 Apr 1999 23:12:08 +0000
Received: from dmlb by ragnet.demon.co.uk with local (Exim 1.82 #1)
	id 10Spfw-0000Je-00; Thu, 1 Apr 1999 23:11:00 +0100
Message-Id:
Date: Thu, 1 Apr 1999 23:11:00 +0100
From: dmlb@ragnet.demon.co.uk
Reply-To: dmlb@ragnet.demon.co.uk
To: FreeBSD-gnats-submit@freebsd.org
Cc: dmlb@ragnet.demon.co.uk
X-Send-Pr-Version: 3.2
Subject: bin/10912: Problems with job control in bin/sh and fix.
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

>Number: 10912
>Category: bin
>Synopsis: Fix to prevent infinite loops on missing children
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Thu Apr 1 15:20:01 PST 1999
>Closed-Date:
>Last-Modified:
>Originator: Duncan Barclay
>Release: FreeBSD 2.2.6-RELEASE i386
>Organization:
>Environment:

	3.1 release used for testing. Full CVS repository available locally.

>Description:

Code in jobs.c dowait() fails when a child is reaped and breakwaitcmd is set.
The shell thinks there is a job waiting to be reaped when that job has
already exited. It then loops forever calling dowait(), trying to find its
lost child.

This occurs when a sub-shell is invoked by a trap: the sub-shell has
breakwaitcmd set on entry to dowait(), but the wait also reaps the child.
dowait() then returns because of breakwaitcmd and does not update the job
status.

/bin/sh before 1998/09/07 does not exhibit this behaviour; that is when
background execution of traps was added.

Example:

In .env I have

	_winch(){
		foo=$(stty -a | sed ....)
		....
	}
	trap _winch 28

Turning on DEBUG and observing the trace output (I've added a getpid() to
trace() so I can follow this). I have also changed onsig() in trap.c to set
breakwaitcmd to the signal number so I can see it happen.

PID 8679 is the backquote sub-shell.
PID 8680 is the stty process.

	[8679] In parent shell: child = 8680
	[8679] searchexec "sed" returns "/usr/bin/sed"
	[8679] forkshell(%0, 0x80aa4dc, 0) called
	[8679] In parent shell: child = 8681
	[8679] waitforjob(%1) called                        *** 8679 waits for stty
	                                                    *** to finish
	[8679] dowait(1) called, breakwaitcmd = 128         *** trouble brewing
	[8680] Child shell 8680
	[8680] evaltree(0x80aa4a4: 1) called
	[8680] evalcommand(0x80aa4a4, 1) called
	[8680] evalcommand arg: stty
	[8680] evalcommand arg: -a
	                                                    *** stty completes
	[8679] wait returns 8680, status=0                  *** stty is reaped
	[8679] dowait returning because breakwaitcmd = 128  *** oh s**t!!!!

I've "fixed" this by checking that in_waitcmd is set in the signal handler,
but this may not be "right". I'm not entirely sure that the breakwaitcmd
code in dowait() is right, as it doesn't check that the wait returned
because of a signal delivered to this process.

Patches below to add the pid to trace logging and to add the in_waitcmd
check.
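For anyone trying to follow the failure without a DEBUG build, here is a
minimal toy model of it in C. It is only a sketch of the sequence described
above, not the code from jobs.c or trap.c: no real processes or signals are
involved, and struct model, model_new(), model_wait(), onsig_unguarded(),
onsig_guarded() and the iteration cap are hypothetical names made up for
illustration. Only dowait, waitforjob, breakwaitcmd and in_waitcmd are
borrowed from the report, and the model assumes a trap is set for the signal.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: no real processes or signals are involved. */
struct model {
	int child_pid;      /* the single child the shell is tracking */
	bool child_exited;  /* "kernel" view: the child has terminated */
	bool child_reaped;  /* "kernel" view: a wait already collected it */
	bool job_done;      /* the shell's own job-table view */
	int in_waitcmd;     /* non-zero only inside the wait builtin */
	int breakwaitcmd;   /* set by the (modelled) signal handler */
};

/* Start with a child that has exited but has not been reaped yet. */
static struct model
model_new(int pid)
{
	struct model m = { pid, true, false, false, 0, 0 };
	return m;
}

/* Stand-in for wait(2): hands out the dead child exactly once. */
static int
model_wait(struct model *m)
{
	if (m->child_exited && !m->child_reaped) {
		m->child_reaped = true;
		return m->child_pid;
	}
	return -1;	/* nothing left to reap */
}

/* Handler without the check: any trapped signal sets the flag. */
static void
onsig_unguarded(struct model *m)
{
	m->breakwaitcmd = 1;
}

/* Handler with the in_waitcmd check proposed in this report. */
static void
onsig_guarded(struct model *m)
{
	if (m->in_waitcmd != 0)
		m->breakwaitcmd = 1;
}

/* Modelled dowait(): bails out on breakwaitcmd before the bookkeeping. */
static int
dowait(struct model *m)
{
	int pid = model_wait(m);

	if (m->breakwaitcmd != 0) {
		m->breakwaitcmd = 0;
		return -1;	/* child was reaped, job status NOT updated */
	}
	if (pid == m->child_pid)
		m->job_done = true;
	return pid;
}

/*
 * Modelled waitforjob(): returns the number of dowait() calls needed, or
 * -1 if the job never completes within max_iters (the cap stands in for
 * the infinite loop).
 */
static int
waitforjob(struct model *m, int max_iters)
{
	int iters = 0;

	assert(max_iters > 0);
	while (!m->job_done && iters < max_iters) {
		(void) dowait(m);
		iters++;
	}
	return m->job_done ? iters : -1;
}
```

In this model, calling onsig_unguarded() before waitforjob() makes the first
dowait() reap the child and return early, so the job is never marked done and
the loop runs until the cap. With onsig_guarded() and in_waitcmd at zero, the
flag is never set and the job completes on the first dowait() call.
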
Index: sh/show.c
===================================================================
RCS file: /ide0.e/ncvs/src/bin/sh/show.c,v
retrieving revision 1.9
diff -u -r1.9 show.c
--- show.c	1998/05/18 06:44:19	1.9
+++ show.c	1999/04/01 22:25:05
@@ -316,6 +316,7 @@
 	fmt = va_arg(va, char *);
 #endif
 	if (tracefile != NULL) {
+		(void) fprintf(tracefile, "[%d] ", getpid());
 		(void) vfprintf(tracefile, fmt, va);
 		if (strchr(fmt, '\n'))
 			(void) fflush(tracefile);
Index: sh/trap.c
===================================================================
RCS file: /ide0.e/ncvs/src/bin/sh/trap.c,v
retrieving revision 1.17
diff -u -r1.17 trap.c
--- trap.c	1998/09/10 22:09:11	1.17
+++ trap.c	1999/04/01 22:37:20
@@ -362,15 +362,15 @@

 	/* If we are currently in a wait builtin, prepare to break it */
 	if ((signo == SIGINT || signo == SIGQUIT) && in_waitcmd != 0)
-		breakwaitcmd = 1;
+		breakwaitcmd = signo;

 	/*
 	 * If a trap is set, not ignored and not the null command, we need
 	 * to make sure traps are executed even when a child blocks signals.
 	 */
-	if (trap[signo] != NULL &&
+	if (in_waitcmd != 0 && trap[signo] != NULL &&
 	    ! trap[signo][0] == '\0' &&
 	    ! (trap[signo][0] == ':' && trap[signo][1] == '\0'))
-		breakwaitcmd = 1;
+		breakwaitcmd = 1000*signo+in_waitcmd;
 }

Duncan

>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: