From owner-freebsd-java@FreeBSD.ORG Fri Jan 18 02:56:35 2008
Date: Thu, 17 Jan 2008 21:36:43 -0500 (EST)
From: Daniel Eischen
Reply-To: Daniel Eischen
To: Julian Elischer
Cc: ivo@scito.com, Alfred Perlstein, nate@yogotech.com, Landon Fuller,
    davidxu@freebsd.org, java@freebsd.org, julian@freebsd.org
Subject: Re: cvs commit: src/lib/libkse/thread thr_kern.c
In-Reply-To: <478FFC91.4050508@elischer.org>
References: <200711301716.lAUHGEV1064334@repoman.freebsd.org>
    <90584F61-91FE-446E-978E-FD234553E8FC@threerings.net>
    <478FFC91.4050508@elischer.org>
List-Id: Porting Java to FreeBSD

On Thu, 17 Jan 2008, Julian Elischer wrote:

> Landon Fuller wrote:
>>
>> On Dec 2, 2007, at 09:31, Arno J. Klaassen wrote:
>>
>>> For info, the attached patch, which partially reverts mfc of rev 1.286
>>> of kern_fork.c, seems to work as well (without the above patch to be
>>> clear),
>>>
>>
>> I just upgraded our 8-core build server from pre-november 6-STABLE to
>> 6.3-RELEASE, and ran into this issue, causing our fork-heavy builder
>> processes to lock up regularly.
>>
>> Your suggested patch (reverting the 1.286 MFC to sys/kern/kern_fork.c)
>> allows our builds to run to completion; I'll try digging into this further.
>> Given how easy this is to reproduce, I'm hoping this is possible to fix
>> before 6.3 is officially released?
>
>
> This is a problem.. the reason it was changed was that the
> previous code results in heavily loaded threaded processes that
> fork, hanging in indefinite lockups IN THE KERNEL. Eventually
> the whole machine would become unuseable. In particular when
> there is NFS being used but in other situations too. SO I'm
> damned if I do and damned if I don't on this.
>
> We were able to prove to ourselves that if a program got into this
> state it was a definite programming error. As was stated in the
> discussion to this change:
> "The change is trying to protect the user from doing something that they
> shouldn't be doing anyhow."
> The previous kernel tried to stop all other threads from running
> and thus, stopping them from changing anything, while the
> kernel copies the memory into the child process. The fact is that
> the kernel can't really protect the process from doing this and
> the other threads in the parent can still leave things in a state
> that will screw up the child.
>
> I gather it is the PARENT that hangs here?

It must be the child that hangs.

> It's possible that the answer is that the library needs to
> be changed as well. Dan, what is the library doing here?

I suppose it is malloc() that is getting into an inconsistent
state in the child. Creating a thread causes malloc() usage,
so threads in the parent can cause the malloc lock to look
like it's been locked just as the process is forked from
a different thread.

You might want to check out any differences between libkse
in -current and libpthread in 6.x. I don't think there is
an issue with -current.

-- 
DE
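
To make the malloc-lock scenario above concrete, here is a minimal
sketch in C. It is not what libkse or libpthread actually does; it only
illustrates one common way to keep a lock from being inherited in a
locked state across fork(), using the POSIX pthread_atfork() call.
The names pool_lock, worker, before_fork, after_fork_parent,
after_fork_child and fork_with_lock_guard are hypothetical, and
pool_lock merely stands in for the allocator's internal lock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the allocator's internal lock. */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int keep_running = 1;

/* Fork handlers: the forking thread holds the lock across fork(), so
 * the child never inherits it as "locked by a thread that is gone". */
static void before_fork(void)       { pthread_mutex_lock(&pool_lock); }
static void after_fork_parent(void) { pthread_mutex_unlock(&pool_lock); }
static void after_fork_child(void)  { pthread_mutex_unlock(&pool_lock); }

/* Takes and releases the lock in a tight loop, roughly the way a
 * thread that allocates heavily would hit the malloc lock. */
static void *worker(void *arg)
{
    (void)arg;
    while (atomic_load(&keep_running)) {
        pthread_mutex_lock(&pool_lock);
        pthread_mutex_unlock(&pool_lock);
    }
    return NULL;
}

/* Forks while the worker is busy. Returns the child's exit status:
 * 0 if the child found the lock free, 1 if it was inherited locked,
 * -1 on any error. */
int fork_with_lock_guard(void)
{
    static int handlers_installed = 0;
    pthread_t tid;
    pid_t pid;
    int status = -1;

    if (!handlers_installed) {
        if (pthread_atfork(before_fork, after_fork_parent,
                           after_fork_child) != 0)
            return -1;
        handlers_installed = 1;
    }

    atomic_store(&keep_running, 1);
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    pid = fork();
    if (pid == 0) {
        /* Child: only the forking thread exists here. trylock is used
         * so a stale lock shows up as exit code 1, not as a hang. */
        _exit(pthread_mutex_trylock(&pool_lock) == 0 ? 0 : 1);
    }

    /* Parent: stop the worker, then collect the child. */
    atomic_store(&keep_running, 0);
    pthread_join(tid, NULL);
    if (pid < 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Without the three handlers, the worker may hold pool_lock at the instant
of the fork; the child would then see a lock that no thread in it can
ever release, which is the kind of hang in the child suggested above.
Build with -pthread.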