Date:      Sat, 29 Aug 2015 13:43:36 +0200
From:      Michiel Boland <boland37@xs4all.nl>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Mark Martinec <Mark.Martinec+freebsd@ijs.si>, freebsd-stable@freebsd.org
Subject:   Re: Latest stable (r287104) bash leaves zombies on exit
Message-ID:  <55E19AE8.9090000@xs4all.nl>
In-Reply-To: <20150828161847.GX2072@kib.kiev.ua>
References:  <E1ZUucG-000C5n-0C@dilbert.ingresso.co.uk> <63a84f64baf8768a551fc6464e8e9526@mailbox.ijs.si> <20150827162602.GJ2072@kib.kiev.ua> <55DF5C95.90502@xs4all.nl> <20150827201644.GO2072@kib.kiev.ua> <55DFFADB.2080003@xs4all.nl> <20150828100118.GR2072@kib.kiev.ua> <55E083CA.2050705@xs4all.nl> <20150828161847.GX2072@kib.kiev.ua>

On 08/28/2015 18:18, Konstantin Belousov wrote:
> On Fri, Aug 28, 2015 at 05:52:42PM +0200, Michiel Boland wrote:
>> set -e
>> for a in `seq 1000`
>> do
>> echo -n "$a "
>> xterm -e ssh nonexisting
>> done
>> echo ""
>>
>> (The idea here is that 'ssh nonexisting' should do some work and then exit;
>> "xterm -e false", etc. don't appear to trigger the bug.)
>>
>> Prior to the patch, one of the xterms would hang after the counter reaches a
>> random (reasonably small) number.
>>
>> After the patch the script runs till completion.
>
> Thank you for testing.  Funny detail is that your loop does not hang for
> me, I see flapping xterms until the completion.  How many CPUs does your
> machine have?

I have a Q8300 (4 CPUs), so I guess the timing matters.

Do I understand correctly that the problem is this: if a program installs a
signal handler with signal() (which is what xterm does) and libthr.so gets
pulled in somehow, then no thr_sighandler is inserted?

I condensed the xterm problem into a small C program. Compile it so that the
delay loop does not get optimized out, and link with -lpthread. When executed
often enough, it will eventually hang in the same fashion as xterm does.

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* SIGCHLD handler: reap the exited child. */
static void reapchild(int sig __unused)
{
         wait(NULL);
}

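/* Spin for a random number of iterations to vary the timing between the
 * child's exit and the parent's exit(). */
static void delay(void)
{
         long i, n;

         n = random() % 1000000;
         if (n < 0) {
                 n = -n;
         }
         for (i = 0; i < n; i++)
                 ;
}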

int main(void)
{
         int p[2];
         char dummy;

         srandomdev();
         if (signal(SIGCHLD, reapchild) == SIG_ERR) {
                 perror("signal");
                 exit(1);
         }
         if (pipe(p) == -1) {
                 perror("pipe");
                 exit(1);
         }
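         switch (fork()) {
         case -1:
                 perror("fork");
                 exit(1);
         case 0:
                 /* Child: read() returns EOF once the parent has also
                  * closed the write end, then exit immediately. */
                 close(p[1]);
                 read(p[0], &dummy, 1);
                 _exit(0);
         }
         /* Parent: read() returns EOF once the child has closed its write
          * end, so both sides proceed at about the same time. */
         close(p[1]);
         read(p[0], &dummy, 1);
         /* SIGCHLD arrives somewhere during the delay or in exit(). */
         delay();
         exit(0);
}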
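
Since the hang only shows up when the program is "executed often enough", a
small driver helps. The following is a rough sketch, not part of the original
test: run_once and count_hangs are hypothetical names, and treating "still
running after timeout_s seconds" as a hang is an assumption. It polls with
waitpid(WNOHANG) so that the driver itself needs no signal handler.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Run argv once. Returns the exit status (0-255), -1 on fork/wait failure
 * or abnormal termination, and -2 if the command was still running after
 * timeout_s seconds and had to be killed (counted as a hang).
 */
static int run_once(char *const argv[], int timeout_s)
{
        pid_t pid = fork();

        if (pid == -1)
                return -1;
        if (pid == 0) {
                execvp(argv[0], argv);
                _exit(127);             /* exec failed */
        }
        /* Poll every 10 ms until the child exits or the timeout expires. */
        for (int waited_ms = 0; waited_ms < timeout_s * 1000; waited_ms += 10) {
                int status;
                pid_t r = waitpid(pid, &status, WNOHANG);

                if (r == pid)
                        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
                if (r == -1)
                        return -1;
                struct timespec ts = { 0, 10 * 1000 * 1000 };
                nanosleep(&ts, NULL);
        }
        kill(pid, SIGKILL);             /* presumed hung: clean it up */
        waitpid(pid, NULL, 0);
        return -2;
}

/* Run the command `runs` times; return how many runs hit the timeout. */
static int count_hangs(char *const argv[], int runs, int timeout_s)
{
        int hangs = 0;

        for (int i = 0; i < runs; i++)
                if (run_once(argv, timeout_s) == -2)
                        hangs++;
        return hangs;
}
```

With argv pointing at the compiled test program, a nonzero result from
count_hangs() would indicate that at least one run got stuck.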

>
> Below is a slightly improved version of the change, to avoid unnecessary
> relocations.  Would be good to rebuild the world and confirm that you
> see no regression (the patch also affects rtld in some way).

Ok, I will try this patch later today.

Cheers,
Michiel



