From owner-freebsd-java@FreeBSD.ORG Sun Dec 2 17:31:19 2007
Return-Path:
Delivered-To: java@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id AB84016A417;
    Sun, 2 Dec 2007 17:31:19 +0000 (UTC)
    (envelope-from arno@heho.snv.jussieu.fr)
Received: from shiva.jussieu.fr (shiva.jussieu.fr [134.157.0.129])
    by mx1.freebsd.org (Postfix) with ESMTP id 34FD313C442;
    Sun, 2 Dec 2007 17:31:18 +0000 (UTC)
    (envelope-from arno@heho.snv.jussieu.fr)
Received: from heho.snv.jussieu.fr (heho.snv.jussieu.fr [134.157.184.22])
    by shiva.jussieu.fr (8.13.8/jtpda-5.4) with ESMTP id lB2HVE52098157 ;
    Sun, 2 Dec 2007 18:31:14 +0100 (CET)
X-Ids: 165
Received: from heho.snv.jussieu.fr (localhost [127.0.0.1])
    by heho.snv.jussieu.fr (8.13.3/jtpda-5.2) with ESMTP id lB2HVC8b016048 ;
    Sun, 2 Dec 2007 18:31:12 +0100 (MET)
Received: (from arno@localhost)
    by heho.snv.jussieu.fr (8.13.3/8.13.1/Submit) id lB2HVCJr016045;
    Sun, 2 Dec 2007 18:31:12 +0100 (MET)
    (envelope-from arno)
To: Daniel Eischen
References: <200711301716.lAUHGEV1064334@repoman.freebsd.org>
From: "Arno J. Klaassen"
Date: 02 Dec 2007 18:31:12 +0100
In-Reply-To:
Message-ID:
Lines: 90
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0
    (shiva.jussieu.fr [134.157.0.165]); Sun, 02 Dec 2007 18:31:14 +0100 (CET)
X-Virus-Scanned: ClamAV 0.88.7/4975/Sun Dec 2 16:32:42 2007 on shiva.jussieu.fr
X-Virus-Status: Clean
X-Miltered: at shiva.jussieu.fr with ID 4752EBE2.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)!
Cc: nate@yogotech.com, java@freebsd.org, ivo@scito.com, julian@freebsd.org, davidxu@freebsd.org
Subject: Re: cvs commit: src/lib/libkse/thread thr_kern.c
X-BeenThere: freebsd-java@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting Java to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 02 Dec 2007 17:31:19 -0000

--=-=-=

Hello,

Daniel Eischen writes:

> On Sat, 1 Dec 2007, Arno J. Klaassen wrote:
>
> > Daniel Eischen writes:
> >
> >>> Arno J. Klaassen wrote:
> >>>
> >>> [ ... ]
> >>> That gives :
> >>>
> >>> #0  0x000000080075d151 in _pthread_sigmask (how=3, set=0x813cc6e10, oset=0x0)
> >>>     at /files/bsd/src7/lib/libkse/thread/thr_sigmask.c:52
> >>> #1  0x000000080075d103 in _sigprocmask (how=3, set=0x813cc6e10, oset=0x0)
> >>>     at /files/bsd/src7/lib/libkse/thread/thr_sigprocmask.c:49
> >>> #2  0x000000080076c423 in _kse_single_thread (curthread=0x813cc6c00)
> >>>     at /files/bsd/src7/lib/libkse/thread/thr_kern.c:361
> >>> #3  0x0000000800758f29 in _fork ()
> >>>     at /files/bsd/src7/lib/libkse/thread/thr_fork.c:101
> >>> #4  0x0000000801e43158 in jdk_fork_wrapper ()
> >>>     at ../../../src/solaris/native/java/lang/UNIXProcess_md.c:437
> >>>
> >>> Hope this is better
> >>
> >> Yes, this would seem to be a kernel problem, as _get_curthread()
> >> seems to be returning garbage.
> >
> > (gdb) p curthread
> > $1 = (struct pthread *) 0x0
> >
> >
> >> This is a libkse MD function,
> >> that relies on %gs (for i386/amd64) to point to something
> >> that was initialized in the parent.
> >>
> >> Julian, David, got any ideas?
> >
> > I can publish the full java_g.core if helpful.
>
> You could of course try this hack to work-around the problem:
>
> Index: thr_kern.c
> ===================================================================
> RCS file: /home/ncvs/src/lib/libkse/thread/thr_kern.c,v
> retrieving revision 1.127
> diff -u -r1.127 thr_kern.c
> --- thr_kern.c	30 Nov 2007 17:16:14 -0000	1.127
> +++ thr_kern.c	1 Dec 2007 23:23:42 -0000
> @@ -361,6 +361,13 @@
>  	curthread->kse->k_kcb->kcb_kmbx.km_curthread = NULL;
>  	curthread->attr.flags |= PTHREAD_SCOPE_SYSTEM;
>  
> +	/*
> +	 * This shouldn't be necessary.  It sometimes gets corrupted
> +	 * after a fork() in SMP.
> +	 */
> +	_kcb_set(curthread->kse->k_kcb);
> +	_tcb_set(curthread->kse->k_kcb, curthread->tcb);
> +
>  	/* After a fork(), there child should have no pending signals. */
>  	sigemptyset(&curthread->sigpend);
>  
>

Yes, this works. Thanks! Is this safe to apply to releng_6 as well?

For info, the attached patch, which partially reverts the MFC of
rev 1.286 of kern_fork.c, seems to work as well (without the above
patch, to be clear), or at least makes the problem much harder to
trigger. Judging only from the comments, it seems to give just one
extra second to copy user space before it is accessed, which appears
to be enough in my setup.

I hope this helps to track down the real culprit. I also have problems
with libthr and Java on 2x2 SMP that I do not have elsewhere, but they
are much harder to trigger, and I have not yet been able to find a
simple test setup that reproduces them easily and reliably.

Thank you very much for your help.

Best,

Arno

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=thread_single_rev.patch

Index: sys/kern/kern_fork.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_fork.c,v
retrieving revision 1.282.2.1
diff -u -r1.282.2.1 kern_fork.c
--- sys/kern/kern_fork.c	6 Nov 2007 02:59:40 -0000	1.282.2.1
+++ sys/kern/kern_fork.c	1 Dec 2007 14:17:03 -0000
@@ -246,6 +246,34 @@
 		return (0);
 	}
 
+	/*
+	 * Note 1:1 allows for forking with one thread coming out on the
+	 * other side with the expectation that the process is about to
+	 * exec.
+	 */
+	if (p1->p_flag & P_HADTHREADS) {
+		/*
+		 * Idle the other threads for a second.
+		 * Since the user space is copied, it must remain stable.
+		 * In addition, all threads (from the user perspective)
+		 * need to either be suspended or in the kernel,
+		 * where they will try restart in the parent and will
+		 * be aborted in the child.
+		 */
+		PROC_LOCK(p1);
+		if (thread_single(SINGLE_NO_EXIT)) {
+			/* Abort. Someone else is single threading before us. */
+			PROC_UNLOCK(p1);
+			return (ERESTART);
+		}
+		PROC_UNLOCK(p1);
+		/*
+		 * All other activity in this process
+		 * is now suspended at the user boundary,
+		 * (or other safe places if we think of any).
+		 */
+	}
+
 	/* Allocate new proc. */
 	newproc = uma_zalloc(proc_zone, M_WAITOK);
 #ifdef MAC
@@ -694,6 +722,15 @@
 	PROC_UNLOCK(p2);
 
 	/*
+	 * If other threads are waiting, let them continue now.
+	 */
+	if (p1->p_flag & P_HADTHREADS) {
+		PROC_LOCK(p1);
+		thread_single_end();
+		PROC_UNLOCK(p1);
+	}
+
+	/*
 	 * Return child proc pointer to parent.
 	 */
 	*procp = p2;
@@ -708,6 +745,11 @@
 	mac_destroy_proc(newproc);
 #endif
 	uma_zfree(proc_zone, newproc);
+	if (p1->p_flag & P_HADTHREADS) {
+		PROC_LOCK(p1);
+		thread_single_end();
+		PROC_UNLOCK(p1);
+	}
 	pause("fork", hz / 2);
 	return (error);
 }
--=-=-=--