Date: Tue, 06 Aug 2002 09:25:21 -0700 (PDT)
From: John Polstra
Organization: Polstra & Co., Inc.
To: Martin Blapp
Cc: hackers@FreeBSD.ORG, openoffice@FreeBSD.ORG, Alexander Kabaev
Subject: Re: Help needed. Deadlock in rtld makes openoffice build hang ag
In-Reply-To: <20020806095745.M58571-100000@levais.imp.ch>

Martin Blapp wrote:
>
> From 10 builds, about 6 are hanging, and I need to restart them.
> This is not a usable solution for a package building cluster.
>
> I end with a process consuming all CPU resources and hanging for
> waiting for a lock to get released what never happens.
>
> Problem is exit(). Replaceing exit() with _exit() did not help.
>
> [Switching to Process 4968, Thread 1]
> 0x28050784 in sigprocmask () from /usr/libexec/ld-elf.so.1
> (gdb) bt
>#0 0x28050784 in sigprocmask () from /usr/libexec/ld-elf.so.1
>#1 0x2804f2d1 in xprintf () from /usr/libexec/ld-elf.so.1
>#2 0x2804df78 in find_symdef () from /usr/libexec/ld-elf.so.1
>#3 0x2838dbd8 in exit () from /usr/lib/libc_r.so.4
>#4 0x08048c77 in _start ()
>
> I tried to add the following lines as proposed by Alexander Kabaev
> to libexec/rtld-elf/i386/lockdflt.c

[...]

The lock algorithms are taken from the paper referenced in the comment
at the top of lockdflt.c, and I believe they are correct. Whatever is
happening must be caused by changes in libc_r that cause the profiling
timer to stop advancing, or cause SIGPROF to be blocked. There were
some major changes made to libc_r on 13 Oct 2000 that could be
connected with this.

Just as an experiment, please try this change to lockdflt_init() in
rtld-elf/i386/lockdflt.c:

Index: lockdflt.c
===================================================================
RCS file: /home/ncvs/src/libexec/rtld-elf/i386/lockdflt.c,v
retrieving revision 1.5.2.4
diff -U5 -r1.5.2.4 lockdflt.c
--- lockdflt.c	11 Jul 2002 23:52:32 -0000	1.5.2.4
+++ lockdflt.c	6 Aug 2002 16:23:33 -0000
@@ -263,10 +263,11 @@
 	/*
 	 * Construct a mask to block all signals except traps which might
 	 * conceivably be generated within the dynamic linker itself.
 	 */
 	sigfillset(&fullsigmask);
+	sigdelset(&fullsigmask, SIGPROF);
 	sigdelset(&fullsigmask, SIGILL);
 	sigdelset(&fullsigmask, SIGTRAP);
 	sigdelset(&fullsigmask, SIGABRT);
 	sigdelset(&fullsigmask, SIGEMT);
 	sigdelset(&fullsigmask, SIGFPE);

John

PS - Are you working with -stable or with -current?