From owner-freebsd-threads@FreeBSD.ORG Sun Oct 19 07:58:11 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 24E6116A4B3 for ; Sun, 19 Oct 2003 07:58:11 -0700 (PDT) Received: from fri.itea.ntnu.no (fri.itea.ntnu.no [129.241.7.60]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3337B43F75 for ; Sun, 19 Oct 2003 07:58:09 -0700 (PDT) (envelope-from morten@rodal.no) Received: from localhost (localhost [127.0.0.1]) by fri.itea.ntnu.no (Postfix) with ESMTP id C8213C7189 for ; Sun, 19 Oct 2003 16:58:07 +0200 (CEST) Received: from slurp.rodal.no (m200h.studby.ntnu.no [129.241.135.200]) by fri.itea.ntnu.no (Postfix) with ESMTP id 6EB07C71D8 for ; Sun, 19 Oct 2003 16:58:07 +0200 (CEST) Received: (from morten@localhost) by slurp.rodal.no (8.12.10/8.12.10/Submit) id h9JEw6q7064303 for threads@freebsd.org; Sun, 19 Oct 2003 16:58:06 +0200 (CEST) (envelope-from morten) Date: Sun, 19 Oct 2003 16:58:06 +0200 From: Morten Rodal To: threads@freebsd.org Message-ID: <20031019145805.GA63314@slurp.rodal.no> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Content-Scanned: with sophos and spamassassin at mailgw.ntnu.no. Subject: libkse and bus error X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 19 Oct 2003 14:58:11 -0000 I seem to be able to crash almost every pthread program that uses pthread_mutex'es. First I thought it was a problem with pthread_testcancel(), until I compiled libkse with DEBUG_FLAGS=-g on one of machines. 
Backtrace from a machine with DEBUG_FLAGS=-g and libkse:

#0  0x28e6ed1b in kse_thr_interrupt () at {standard input}:15
#1  0x28e5f990 in _thr_sig_add (pthread=0x81fab00, sig=136293172,
    info=0x0)
    at /usr/src/lib/libpthread/thread/thr_sig.c:885
#2  0x28e687cb in kse_check_completed (kse=0x81fab00)
    at /usr/src/lib/libpthread/thread/thr_kern.c:1558
#3  0x28e6721c in kse_sched_multi (kmbx=0x17e)
    at /usr/src/lib/libpthread/thread/thr_kern.c:1021
#4  0x08207000 in ?? ()
#5  0x080bc6c6 in DCTransferView::DC_DownloadManagerCallBack(CObject*) ()

On a machine without DEBUG_FLAGS=-g this backtrace:

(gdb) bt
#0  0x288e96bb in pthread_testcancel () from /usr/lib/libkse.so.1
#1  0x288e371b in pthread_mutexattr_init () from /usr/lib/libkse.so.1
#2  0x288e219c in pthread_mutexattr_init () from /usr/lib/libkse.so.1
#3  0x082d6000 in ?? ()
#4  0x291e2807 in nsWindow::OnContainerFocusInEvent(_GtkWidget*, _GdkEventFocus*) () from /usr/X11R6/lib/firebird/lib/mozilla-1.5/components/libwidget_gtk2.so

One machine is running a kernel from

FreeBSD slurp.rodal.no 5.1-CURRENT FreeBSD 5.1-CURRENT #3: Tue Oct 14 20:47:45 CEST 2003 root@slurp.rodal.no:/usr/obj/usr/src/sys/slurp i386

the other from

FreeBSD hauk10.idi.ntnu.no 5.1-CURRENT FreeBSD 5.1-CURRENT #2: Fri Sep 26 09:12:55 CEST 2003 root@hauk10.idi.ntnu.no:/usr/obj/usr/src/sys/hauk10 i386

For those of you who wonder, yes it is two different applications but
the first one crashed with the exact same back trace as the second one
before I installed a libkse with debugging information on
slurp.rodal.no.

-- Morten Rodal From owner-freebsd-threads@FreeBSD.ORG Sun Oct 19 11:16:29 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E3C3016A4B3 for ; Sun, 19 Oct 2003 11:16:29 -0700 (PDT) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0E61E43FA3 for ; Sun, 19 Oct 2003 11:16:29 -0700 (PDT) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id h9JIGSBR013514; Sun, 19 Oct 2003 14:16:28 -0400 (EDT) Date: Sun, 19 Oct 2003 14:16:27 -0400 (EDT) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Morten Rodal In-Reply-To: <20031019145805.GA63314@slurp.rodal.no> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: threads@freebsd.org Subject: Re: libkse and bus error X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 19 Oct 2003 18:16:30 -0000 On Sun, 19 Oct 2003, Morten Rodal wrote: > I seem to be able to crash almost every pthread program that uses > pthread_mutex'es. First I thought it was a problem with > pthread_testcancel(), until I compiled libkse with DEBUG_FLAGS=-g on > one of machines. > > Backtrace from a machine with DEBUG_FLAGS=-g and libkse: > > #0 0x28e6ed1b in kse_thr_interrupt () at {standard input}:15 > #1 0x28e5f990 in _thr_sig_add (pthread=0x81fab00, sig=136293172, > info=0x0) > at /usr/src/lib/libpthread/thread/thr_sig.c:885 > #2 0x28e687cb in kse_check_completed (kse=0x81fab00) > at /usr/src/lib/libpthread/thread/thr_kern.c:1558 > #3 0x28e6721c in kse_sched_multi (kmbx=0x17e) > at /usr/src/lib/libpthread/thread/thr_kern.c:1021 This is a problem. The mailbox pointer is invalid. > #4 0x08207000 in ?? 
() > #5 0x080bc6c6 in DCTransferView::DC_DownloadManagerCallBack(CObject*) () > > > On a machine without DEBUG_FLAGS=-g this backtrace: > > (gdb) bt > #0 0x288e96bb in pthread_testcancel () from /usr/lib/libkse.so.1 > #1 0x288e371b in pthread_mutexattr_init () from /usr/lib/libkse.so.1 > #2 0x288e219c in pthread_mutexattr_init () from /usr/lib/libkse.so.1 > #3 0x082d6000 in ?? () > #4 0x291e2807 in nsWindow::OnContainerFocusInEvent(_GtkWidget*, _GdkEventFocus*) () from /usr/X11R6/lib/firebird/lib/mozilla-1.5/components/libwidget_gtk2.so > > > One machine is running a kernel from > > FreeBSD slurp.rodal.no 5.1-CURRENT FreeBSD 5.1-CURRENT #3: Tue Oct 14 20:47:45 CEST 2003 root@slurp.rodal.no:/usr/obj/usr/src/sys/slurp i386 > > the other from > > FreeBSD hauk10.idi.ntnu.no 5.1-CURRENT FreeBSD 5.1-CURRENT #2: Fri Sep 26 09:12:55 CEST 2003 root@hauk10.idi.ntnu.no:/usr/obj/usr/src/sys/hauk10 i386 > > > For those of you who wonder, yes it is two different applications but > the first one crashed with the exact same back trace as the second one > before I installed a libkse with debugging information on > slurp.rodal.no. I'm not having any of these problems with a -current from Oct 12th on both SMP and UP systems. I'm using KDE and mozilla. How are you specifying libkse (/etc/libmap.conf or PTHREAD_LIBS)? Are you using nvidia at all? 
-- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Sun Oct 19 11:31:33 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D6E5D16A4B3 for ; Sun, 19 Oct 2003 11:31:33 -0700 (PDT) Received: from royk.itea.ntnu.no (royk.itea.ntnu.no [129.241.190.230]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9042443FCB for ; Sun, 19 Oct 2003 11:31:32 -0700 (PDT) (envelope-from morten@rodal.no) Received: from localhost (localhost [127.0.0.1]) by royk.itea.ntnu.no (Postfix) with ESMTP id 32D9866EF2; Sun, 19 Oct 2003 20:31:31 +0200 (CEST) Received: from slurp.rodal.no (m200h.studby.ntnu.no [129.241.135.200]) by royk.itea.ntnu.no (Postfix) with ESMTP id CAA6B66ED6; Sun, 19 Oct 2003 20:31:30 +0200 (CEST) Received: (from morten@localhost) by slurp.rodal.no (8.12.10/8.12.10/Submit) id h9JIVUFR006214; Sun, 19 Oct 2003 20:31:30 +0200 (CEST) (envelope-from morten) Date: Sun, 19 Oct 2003 20:31:29 +0200 From: Morten Rodal To: Daniel Eischen Message-ID: <20031019183129.GA94145@slurp.rodal.no> References: <20031019145805.GA63314@slurp.rodal.no> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Content-Scanned: with sophos and spamassassin at mailgw.ntnu.no. cc: threads@freebsd.org Subject: Re: libkse and bus error X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 19 Oct 2003 18:31:34 -0000 On Sun, Oct 19, 2003 at 02:16:27PM -0400, Daniel Eischen wrote: > On Sun, 19 Oct 2003, Morten Rodal wrote: > > > I seem to be able to crash almost every pthread program that uses > > pthread_mutex'es. First I thought it was a problem with > > pthread_testcancel(), until I compiled libkse with DEBUG_FLAGS=-g on > > one of machines. 
> >
> > Backtrace from a machine with DEBUG_FLAGS=-g and libkse:
> >
> > #0  0x28e6ed1b in kse_thr_interrupt () at {standard input}:15
> > #1  0x28e5f990 in _thr_sig_add (pthread=0x81fab00, sig=136293172,
> >     info=0x0)
> >     at /usr/src/lib/libpthread/thread/thr_sig.c:885
> > #2  0x28e687cb in kse_check_completed (kse=0x81fab00)
> >     at /usr/src/lib/libpthread/thread/thr_kern.c:1558
> > #3  0x28e6721c in kse_sched_multi (kmbx=0x17e)
> >     at /usr/src/lib/libpthread/thread/thr_kern.c:1021
>
> This is a problem. The mailbox pointer is invalid.
>

I thought it looked a bit strange. Any clues to what might have caused
this?

> > One machine is running a kernel from
> >
> > FreeBSD slurp.rodal.no 5.1-CURRENT FreeBSD 5.1-CURRENT #3: Tue Oct 14 20:47:45 CEST 2003 root@slurp.rodal.no:/usr/obj/usr/src/sys/slurp i386
> >
> > the other from
> >
> > FreeBSD hauk10.idi.ntnu.no 5.1-CURRENT FreeBSD 5.1-CURRENT #2: Fri Sep 26 09:12:55 CEST 2003 root@hauk10.idi.ntnu.no:/usr/obj/usr/src/sys/hauk10 i386
> >
> >
>
> I'm not having any of these problems with a -current from Oct 12th
> on both SMP and UP systems. I'm using KDE and mozilla.
>

The backtrace with debugging symbols is the dcgui-qt port (net/dc-gui)
which seems to use pthread mutexes quite heavily. It only starts 1 out
of 10 times.

>
> How are you specifying libkse (/etc/libmap.conf or PTHREAD_LIBS)?
>

/etc/libmap.conf only contains (on both machines):

libc_r.so.5     libkse.so.1
libc_r.so       libkse.so

>
> Are you using nvidia at all?
>

slurp.rodal.no uses nvidia, but hauk10.idi.ntnu.no uses standard X11 ATI
mach64 (I think) drivers.

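Since the failing port is reported to lean heavily on pthread mutexes, a small standalone program that does nothing but contend on one mutex might help separate a threads-library problem from an application problem. What follows is only a sketch under that assumption: `run_mutex_stress`, `stress_worker`, `struct stress_arg` and `MAX_THREADS` are hypothetical names, nothing in it is taken from dclib or libkse, and a clean run would not prove the library is healthy.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names throughout; not taken from dclib or libkse. */
#define MAX_THREADS 64

struct stress_arg {
    pthread_mutex_t *lock;  /* shared mutex under test */
    long *counter;          /* shared counter protected by lock */
    long iterations;        /* lock/unlock rounds per thread */
};

/* Each worker increments the shared counter under the mutex. */
static void *stress_worker(void *p)
{
    struct stress_arg *a = p;

    assert(a != NULL);
    for (long i = 0; i < a->iterations; i++) {
        pthread_mutex_lock(a->lock);
        (*a->counter)++;
        pthread_mutex_unlock(a->lock);
    }
    return NULL;
}

/*
 * Start nthreads workers, wait for all of them, and return the final
 * counter value. Returns -1 on bad arguments or if setup fails.
 */
long run_mutex_stress(int nthreads, long iterations)
{
    pthread_t tid[MAX_THREADS];
    pthread_mutex_t lock;
    struct stress_arg arg;
    long counter = 0;
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS || iterations < 0)
        return -1;
    if (pthread_mutex_init(&lock, NULL) != 0)
        return -1;

    arg.lock = &lock;
    arg.counter = &counter;
    arg.iterations = iterations;

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, stress_worker, &arg) != 0)
            break;
        started++;
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    pthread_mutex_destroy(&lock);
    return (started == nthreads) ? counter : -1;
}
```

Called from a `main` as `run_mutex_stress(8, 100000)`, the expected return value is 800000. A crash, or any other value, would point at the threading layer rather than at the application.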
-- Morten Rodal From owner-freebsd-threads@FreeBSD.ORG Sun Oct 19 14:50:55 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D04C616A4B3 for ; Sun, 19 Oct 2003 14:50:55 -0700 (PDT) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1BCEE43F75 for ; Sun, 19 Oct 2003 14:50:55 -0700 (PDT) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id h9JLosBR027134; Sun, 19 Oct 2003 17:50:54 -0400 (EDT) Date: Sun, 19 Oct 2003 17:50:54 -0400 (EDT) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Morten Rodal In-Reply-To: <20031019183129.GA94145@slurp.rodal.no> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: threads@freebsd.org Subject: Re: libkse and bus error X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 19 Oct 2003 21:50:56 -0000 On Sun, 19 Oct 2003, Morten Rodal wrote: > On Sun, Oct 19, 2003 at 02:16:27PM -0400, Daniel Eischen wrote: > > On Sun, 19 Oct 2003, Morten Rodal wrote: > > > > > I seem to be able to crash almost every pthread program that uses > > > pthread_mutex'es. First I thought it was a problem with > > > pthread_testcancel(), until I compiled libkse with DEBUG_FLAGS=-g on > > > one of machines. 
> > >
> > > Backtrace from a machine with DEBUG_FLAGS=-g and libkse:
> > >
> > > #0  0x28e6ed1b in kse_thr_interrupt () at {standard input}:15
> > > #1  0x28e5f990 in _thr_sig_add (pthread=0x81fab00, sig=136293172,
> > >     info=0x0)
> > >     at /usr/src/lib/libpthread/thread/thr_sig.c:885
> > > #2  0x28e687cb in kse_check_completed (kse=0x81fab00)
> > >     at /usr/src/lib/libpthread/thread/thr_kern.c:1558
> > > #3  0x28e6721c in kse_sched_multi (kmbx=0x17e)
> > >     at /usr/src/lib/libpthread/thread/thr_kern.c:1021
> >
> > This is a problem. The mailbox pointer is invalid.
> >
>
> I thought it looked a bit strange. Any clues to what might have
> caused this?

When I've seen it before, it's when %gs becomes corrupted. Nvidia
uses static ldt allocation and this can screw things up. If you
are getting any static ldt allocations out of the kernel, that is
the problem.

>
> > > One machine is running a kernel from
> > >
> > > FreeBSD slurp.rodal.no 5.1-CURRENT FreeBSD 5.1-CURRENT #3: Tue Oct 14 20:47:45 CEST 2003 root@slurp.rodal.no:/usr/obj/usr/src/sys/slurp i386
> > >
> > > the other from
> > >
> > > FreeBSD hauk10.idi.ntnu.no 5.1-CURRENT FreeBSD 5.1-CURRENT #2: Fri Sep 26 09:12:55 CEST 2003 root@hauk10.idi.ntnu.no:/usr/obj/usr/src/sys/hauk10 i386
> > >
> > >
> >
> > I'm not having any of these problems with a -current from Oct 12th
> > on both SMP and UP systems. I'm using KDE and mozilla.
> >
>
> The backtrace with debugging symbols is the dcgui-qt port (net/dc-gui)
> which seems to use pthread mutexes quite heavily. It only starts 1
> out of 10 times.

I haven't tried this. If it doesn't depend on everything in the
world, I'll try it.

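To make the %gs explanation above a little more concrete: an x86 segment selector can be read and split into its descriptor index, its table indicator (GDT or LDT) and its requested privilege level. The following is a minimal sketch, not libkse code. `decode_selector`, `read_gs`, `print_gs` and `struct selector_fields` are hypothetical names, the inline assembly assumes GCC or Clang on i386/amd64, and the whole thing assumes, as the message above implies, that the threads library depends on %gs staying intact.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of an x86 segment selector (hypothetical struct name). */
struct selector_fields {
    unsigned index;   /* descriptor table index, bits 15..3 */
    unsigned in_ldt;  /* table indicator, bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;     /* requested privilege level, bits 1..0 */
};

/* Split a 16-bit selector value into its three fields. */
struct selector_fields decode_selector(unsigned short sel)
{
    struct selector_fields f;

    f.index = (unsigned)sel >> 3;
    f.in_ldt = ((unsigned)sel >> 2) & 1u;
    f.rpl = (unsigned)sel & 3u;
    return f;
}

/* Read the current %gs selector; returns 0 on builds that are not x86. */
unsigned short read_gs(void)
{
#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
    unsigned short sel;

    __asm__ volatile ("movw %%gs, %0" : "=r" (sel));
    return sel;
#else
    return 0;
#endif
}

/* Print the calling thread's %gs selector in decoded form. */
void print_gs(void)
{
    unsigned short sel = read_gs();
    struct selector_fields f = decode_selector(sel);

    assert(f.index <= 0x1fffu);  /* a 13-bit field cannot exceed this */
    printf("%%gs=0x%04x index=%u table=%s rpl=%u\n",
           (unsigned)sel, f.index, f.in_ldt ? "LDT" : "GDT", f.rpl);
}
```

Calling `print_gs()` at a few points in a misbehaving program and comparing the output is one cheap check. A selector whose table bit or index changes unexpectedly would be consistent with the corruption described above, though it would not by itself identify the cause.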
-- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Sun Oct 19 14:59:08 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8D51516A4EE for ; Sun, 19 Oct 2003 14:59:08 -0700 (PDT) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id BF1FE43FBF for ; Sun, 19 Oct 2003 14:59:07 -0700 (PDT) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id h9JLx7BR028868; Sun, 19 Oct 2003 17:59:07 -0400 (EDT) Date: Sun, 19 Oct 2003 17:59:07 -0400 (EDT) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Morten Rodal In-Reply-To: <20031019183129.GA94145@slurp.rodal.no> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: threads@freebsd.org Subject: Re: libkse and bus error X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 19 Oct 2003 21:59:08 -0000 On Sun, 19 Oct 2003, Morten Rodal wrote: > The backtrace with debugging symbols is the dcgui-qt port (net/dc-gui) > which seems to use pthread mutexes quite heavily. It only starts 1 > out of 10 times. There is no dcgui-qt; is that net/dcgui, net/dctc-gui-qt, or something else? 
-- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Mon Aug 11 18:34:44 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9340737B401; Mon, 11 Aug 2003 18:34:44 -0700 (PDT) Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id CB39143F85; Mon, 11 Aug 2003 18:34:43 -0700 (PDT) (envelope-from marcel@xcllnt.net) Received: from athlon.pn.xcllnt.net (athlon.pn.xcllnt.net [192.168.4.3]) by ns1.xcllnt.net (8.12.9/8.12.9) with ESMTP id h7C1YhwO077050; Mon, 11 Aug 2003 18:34:43 -0700 (PDT) (envelope-from marcel@piii.pn.xcllnt.net) Received: from athlon.pn.xcllnt.net (localhost [127.0.0.1]) by athlon.pn.xcllnt.net (8.12.9/8.12.9) with ESMTP id h7C1YhF1001818; Mon, 11 Aug 2003 18:34:43 -0700 (PDT) (envelope-from marcel@athlon.pn.xcllnt.net) Received: (from marcel@localhost) by athlon.pn.xcllnt.net (8.12.9/8.12.9/Submit) id h7C1Yhvo001817; Mon, 11 Aug 2003 18:34:43 -0700 (PDT) (envelope-from marcel) From: Marcel Moolenaar To: David Xu Message-ID: <20030812013443.GA1409@athlon.pn.xcllnt.net> References: <20030811001030.GA27859@dhcp42.pn.xcllnt.net> <00a801c35fd2$9139a1b0$f001a8c0@davidwnt> <20030811234058.GA944@athlon.pn.xcllnt.net> <200308120756.36583.davidxu@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200308120756.36583.davidxu@FreeBSD.org> User-Agent: Mutt/1.5.4i cc: threads@FreeBSD.org Subject: Re: KSE/ia64: NULL thread pointer in _thr_sig_add() X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Tue, 12 Aug 2003 01:34:45 -0000 X-Original-Date: Mon, 11 Aug 2003 18:34:43 -0700 X-List-Received-Date: Tue, 12 Aug 2003 01:34:45 -0000 On Tue, Aug 12, 2003 at 07:56:36AM +0800, David Xu 
wrote: > > > > I think this is it. I now get sig 11, but it looks like a faulty > > use of random(). It appears random() is not thread safe and this > > particular test program uses random. > > > > I'll continue to run tests, but so far it looks like the patch is > > fixing KSE/ia64. > > It would be nice if you can run tests under directory libpthread/test, > mutex_d is useful test when I am modifying libkse. mutex_d fails at the moment. join_leak_d also coredumps. I have to look into those. -- Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net From owner-freebsd-threads@FreeBSD.ORG Mon Sep 15 23:34:52 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A803916A4B3; Mon, 15 Sep 2003 23:34:52 -0700 (PDT) Received: from grus.itea.ntnu.no (grus.itea.ntnu.no [129.241.190.232]) by mx1.FreeBSD.org (Postfix) with ESMTP id 70B7143FE9; Mon, 15 Sep 2003 23:34:51 -0700 (PDT) (envelope-from morten@rodal.no) Received: from localhost (localhost [127.0.0.1]) by grus.itea.ntnu.no (Postfix) with ESMTP id 4825DC300F; Tue, 16 Sep 2003 08:34:50 +0200 (CEST) Received: from slurp.rodal.no (m200h.studby.ntnu.no [129.241.135.200]) by grus.itea.ntnu.no (Postfix) with ESMTP id E30FEC300E; Tue, 16 Sep 2003 08:34:49 +0200 (CEST) Received: (from morten@localhost) by slurp.rodal.no (8.12.9/8.12.9/Submit) id h8G6YnNp001044; Tue, 16 Sep 2003 08:34:49 +0200 (CEST) (envelope-from morten) From: Morten Rodal To: deischen@FreeBSD.org Message-ID: <20030916063449.GB813@slurp.rodal.no> References: <20030915200204.F2297@news1.macomnet.ru> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="VywGB/WGlW4DM4P8" Content-Disposition: inline In-Reply-To: X-Content-Scanned: with sophos and spamassassin at mailgw.ntnu.no. 
cc: threads@FreeBSD.org
cc: David Xu
Subject: Re: libthr/libkse and Mozilla Firebird
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Date: Tue, 16 Sep 2003 06:34:52 -0000
X-Original-Date: Tue, 16 Sep 2003 08:34:49 +0200
X-List-Received-Date: Tue, 16 Sep 2003 06:34:52 -0000

--VywGB/WGlW4DM4P8
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Sep 15, 2003 at 04:09:03PM -0400, Daniel Eischen wrote:
> On Mon, 15 Sep 2003, Maxim Konovalov wrote:
>
> > On Mon, 15 Sep 2003, 19:56+0400, Maxim Konovalov wrote:
> >
> > > [...]
> > > > Do you have any malloc options set, or any kernel options set
> > >
> > > no malloc options.
> > >
> > > > that are different from GENERIC?
> > >
> > > here you are: http://maxim.int.ru/stuff/GOLF5
> > >
> > > The only option I am worry about is 'options HZ 1000'. Could it
> > > interfere with libkse?
> >
> > From ktrace.out the last things firebird done were:
> >
> > # kdump -f ktrace.out | tail -20
> >  37436 MozillaFirebird-bin CALL  socket(0x2,0x1,0)
> >  37436 MozillaFirebird-bin RET   socket 31/0x1f
> >  37436 MozillaFirebird-bin CALL  fcntl(0x1f,0x3,0x10)
> >  37436 MozillaFirebird-bin RET   fcntl 2
> >  37436 MozillaFirebird-bin CALL  fcntl(0x1f,0x4,0x6)
> >  37436 MozillaFirebird-bin RET   fcntl 0
> >  37436 MozillaFirebird-bin CALL  connect(0x1f,0xbfaedc48,0x10)
> >  37436 MozillaFirebird-bin RET   connect -1 errno 36 Operation now in progress
> >  37436 MozillaFirebird-bin CALL  poll(0xbfaedce8,0x6,0xffffffff)
> >  37436 MozillaFirebird-bin RET   fork 0
> >  37436 MozillaFirebird-bin CALL  gettimeofday(0xbfabaf10,0)
> >  37436 MozillaFirebird-bin RET   gettimeofday 0
> >  37436 MozillaFirebird-bin CALL  gettimeofday(0xbfabaee0,0)
> >  37436 MozillaFirebird-bin RET   gettimeofday 0
> >  37436 MozillaFirebird-bin PSIG  SIGSEGV caught handler=0x484f62e0 mask=0xfffefaff code=0xc
> >  37436 MozillaFirebird-bin CALL  unlink(0x81c4440)
> >  37436 MozillaFirebird-bin NAMI  "/home/maxim/.phoenix/default/17ma97wm.slt/lock"
> >  37436 MozillaFirebird-bin RET   unlink 0
> >  37436 MozillaFirebird-bin CALL  exit(0xb)

I get the same(?) ktrace when my firebird disappears.

 66628 MozillaFirebird-bin CALL  break(0x90a1000)
 66628 MozillaFirebird-bin RET   break 0
 66628 MozillaFirebird-bin PSIG  SIGSEGV caught handler=0x288b72e0 mask=0xfffefaff code=0xc
 66628 MozillaFirebird-bin CALL  unlink(0x81c9040)
 66628 MozillaFirebird-bin NAMI  "/usr/home/morten/.phoenix/default/gd75op6b.slt/lock"
 66628 MozillaFirebird-bin RET   unlink 0
 66628 MozillaFirebird-bin CALL  exit(0xb)

>
> Can you run it under the debugger? I was able to get
> mozilla to run under the debugger, but had to be root
> for it to work.
>

How are you able to run it under the debugger? Whenever I try
(haven't tried as root) my computer panics!
For more info on the panic (if it has anything to do with debugging
libkse programs) see this thread on current@

Message-ID: <20030912065458.GA604@atlantis.rodal.no>

or

http://lists.freebsd.org/pipermail/freebsd-current/2003-September/010356.html

-- 
Morten Rodal

--VywGB/WGlW4DM4P8
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQE/Zq8JbWe1Cy11WVsRAs/CAJ4yfDcBbKSNuOi+xtP1GWFP1WXO/gCgreuO
rIcp5sB4WlsoJOq7W8578C0=
=ViYA
-----END PGP SIGNATURE-----

--VywGB/WGlW4DM4P8--

From owner-freebsd-threads@FreeBSD.ORG Sun Oct 19 21:16:57 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 782D916A4B3;
	Sun, 19 Oct 2003 21:16:57 -0700 (PDT)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 5E5E243F3F;
	Sun, 19 Oct 2003 21:16:54 -0700 (PDT)
	(envelope-from eischen@vigrid.com)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id h9K4GrBR011938;
	Mon, 20 Oct 2003 00:16:53 -0400 (EDT)
Date: Mon, 20 Oct 2003 00:16:53 -0400 (EDT)
From: Daniel Eischen
X-Sender: eischen@pcnet5.pcnet.com
To: Marcel Moolenaar
In-Reply-To: <20030812013443.GA1409@athlon.pn.xcllnt.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: threads@freebsd.org
cc: David Xu
Subject: Re: KSE/ia64: NULL thread pointer in _thr_sig_add()
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: deischen@freebsd.org
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 20 Oct 2003 04:16:57 -0000

On Mon, 11 Aug 2003, Marcel Moolenaar wrote:
> On Tue, Aug 12, 2003 at 07:56:36AM +0800, David Xu wrote:
> > >
> > > I think this is it.
I now get sig 11, but it looks like a faulty > > > use of random(). It appears random() is not thread safe and this > > > particular test program uses random. > > > > > > I'll continue to run tests, but so far it looks like the patch is > > > fixing KSE/ia64. > > > > It would be nice if you can run tests under directory libpthread/test, > > mutex_d is useful test when I am modifying libkse. > > mutex_d fails at the moment. join_leak_d also coredumps. I have to > look into those. Note that mutex_d won't work on an SMP system; it expects the concurrency level to be 1. I'll fix it shortly. -- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Sun Oct 19 21:30:15 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 45A8716A4B3 for ; Sun, 19 Oct 2003 21:30:15 -0700 (PDT) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7A5C443FD7 for ; Sun, 19 Oct 2003 21:30:12 -0700 (PDT) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id h9K4UBBR014628; Mon, 20 Oct 2003 00:30:11 -0400 (EDT) Date: Mon, 20 Oct 2003 00:30:11 -0400 (EDT) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Morten Rodal In-Reply-To: <20031019183129.GA94145@slurp.rodal.no> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: threads@freebsd.org Subject: Re: libkse and bus error X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: deischen@freebsd.org List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Oct 2003 04:30:15 -0000 On Sun, 19 Oct 2003, Morten Rodal wrote: > On Sun, Oct 19, 2003 at 02:16:27PM -0400, Daniel Eischen wrote: > > On Sun, 19 Oct 2003, Morten Rodal wrote: > > > > > I seem 
to be able to crash almost every pthread program that uses > > > pthread_mutex'es. First I thought it was a problem with > > > pthread_testcancel(), until I compiled libkse with DEBUG_FLAGS=-g on > > > one of machines. > > > > > > Backtrace from a machine with DEBUG_FLAGS=-g and libkse: > > > > > > #0 0x28e6ed1b in kse_thr_interrupt () at {standard input}:15 > > > #1 0x28e5f990 in _thr_sig_add (pthread=0x81fab00, sig=136293172, > > > info=0x0) > > > at /usr/src/lib/libpthread/thread/thr_sig.c:885 > > > #2 0x28e687cb in kse_check_completed (kse=0x81fab00) > > > at /usr/src/lib/libpthread/thread/thr_kern.c:1558 > > > #3 0x28e6721c in kse_sched_multi (kmbx=0x17e) > > > at /usr/src/lib/libpthread/thread/thr_kern.c:1021 > > > > This is a problem. The mailbox pointer is invalid. > > > > I thought it looked a bit strange. Any clues to what might have > caused this? > > > > One machine is running a kernel from > > > > > > FreeBSD slurp.rodal.no 5.1-CURRENT FreeBSD 5.1-CURRENT #3: Tue Oct 14 20:47:45 CEST 2003 root@slurp.rodal.no:/usr/obj/usr/src/sys/slurp i386 > > > > > > the other from > > > > > > FreeBSD hauk10.idi.ntnu.no 5.1-CURRENT FreeBSD 5.1-CURRENT #2: Fri Sep 26 09:12:55 CEST 2003 root@hauk10.idi.ntnu.no:/usr/obj/usr/src/sys/hauk10 i386 > > > > > > > > > > I'm not having any of these problems with a -current from Oct 12th > > on both SMP and UP systems. I'm using KDE and mozilla. > > > > The backtrace with debugging symbols is the dcgui-qt port (net/dc-gui) > which seems to use pthread mutexes quite heavily. It only starts 1 > out of 10 times. It seems to work OK here. I can start it; I'm not sure what to do with it once it's started, but it does start consistently. $ pkg_which dc-qt dctc-gui-qt-0.0.6 $ uname -a FreeBSD sirius 5.1-CURRENT FreeBSD 5.1-CURRENT #10: Sun Oct 12 12:48:45 EDT 2003 root@sirius:/usr/obj/opt/FreeBSD/src/src/sys/sirius i386 This is on an SMP system. 
-- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Sun Oct 19 22:44:55 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8F44916A4B3 for ; Sun, 19 Oct 2003 22:44:55 -0700 (PDT) Received: from fri.itea.ntnu.no (fri.itea.ntnu.no [129.241.7.60]) by mx1.FreeBSD.org (Postfix) with ESMTP id 827E843F85 for ; Sun, 19 Oct 2003 22:44:54 -0700 (PDT) (envelope-from morten@rodal.no) Received: from localhost (localhost [127.0.0.1]) by fri.itea.ntnu.no (Postfix) with ESMTP id 29460C7378; Mon, 20 Oct 2003 07:44:53 +0200 (CEST) Received: from slurp.rodal.no (m200h.studby.ntnu.no [129.241.135.200]) by fri.itea.ntnu.no (Postfix) with ESMTP id E4224C72E1; Mon, 20 Oct 2003 07:44:52 +0200 (CEST) Received: (from morten@localhost) by slurp.rodal.no (8.12.10/8.12.10/Submit) id h9K5ip2U039918; Mon, 20 Oct 2003 07:44:51 +0200 (CEST) (envelope-from morten) Date: Mon, 20 Oct 2003 07:44:51 +0200 From: Morten Rodal To: Daniel Eischen Message-ID: <20031020054450.GA39716@slurp.rodal.no> References: <20031019183129.GA94145@slurp.rodal.no> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Content-Scanned: with sophos and spamassassin at mailgw.ntnu.no. cc: threads@freebsd.org Subject: Re: libkse and bus error X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Oct 2003 05:44:55 -0000 On Sun, Oct 19, 2003 at 05:50:54PM -0400, Daniel Eischen wrote: > On Sun, 19 Oct 2003, Morten Rodal wrote: > > > On Sun, Oct 19, 2003 at 02:16:27PM -0400, Daniel Eischen wrote: > > > On Sun, 19 Oct 2003, Morten Rodal wrote: > > > > > > > I seem to be able to crash almost every pthread program that uses > > > > pthread_mutex'es. 
First I thought it was a problem with > > > > pthread_testcancel(), until I compiled libkse with DEBUG_FLAGS=-g on > > > > one of machines. > > > > > > > > Backtrace from a machine with DEBUG_FLAGS=-g and libkse: > > > > > > > > #0 0x28e6ed1b in kse_thr_interrupt () at {standard input}:15 > > > > #1 0x28e5f990 in _thr_sig_add (pthread=0x81fab00, sig=136293172, > > > > info=0x0) > > > > at /usr/src/lib/libpthread/thread/thr_sig.c:885 > > > > #2 0x28e687cb in kse_check_completed (kse=0x81fab00) > > > > at /usr/src/lib/libpthread/thread/thr_kern.c:1558 > > > > #3 0x28e6721c in kse_sched_multi (kmbx=0x17e) > > > > at /usr/src/lib/libpthread/thread/thr_kern.c:1021 > > > > > > This is a problem. The mailbox pointer is invalid. > > > > > > > I thought it looked a bit strange. Any clues to what might have > > caused this? > > When I've seen it before, it's when %gs becomes corrupted. Nvidia > uses static ldt allocation and this can screw things up. If you > are getting any static ldt allocations out of the kernel, that is > the problem. > Ok. Then that's probably my problem. I did see some warnings about pids using static ldt allocation, but they haven't appeared for a while now. So my threading support is in the hands of nvidia now. That kind of sucks, but I guess I can live with it.. 
-- Morten Rodal From owner-freebsd-threads@FreeBSD.ORG Sun Oct 19 22:46:24 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 77C6C16A4B3 for ; Sun, 19 Oct 2003 22:46:24 -0700 (PDT) Received: from signal.itea.ntnu.no (signal.itea.ntnu.no [129.241.190.231]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6C89F43FAF for ; Sun, 19 Oct 2003 22:46:23 -0700 (PDT) (envelope-from morten@rodal.no) Received: from localhost (localhost [127.0.0.1]) by signal.itea.ntnu.no (Postfix) with ESMTP id 3D0A933950; Mon, 20 Oct 2003 07:46:22 +0200 (CEST) Received: from slurp.rodal.no (m200h.studby.ntnu.no [129.241.135.200]) by signal.itea.ntnu.no (Postfix) with ESMTP id 0E4F933941; Mon, 20 Oct 2003 07:46:22 +0200 (CEST) Received: (from morten@localhost) by slurp.rodal.no (8.12.10/8.12.10/Submit) id h9K5kLwD039940; Mon, 20 Oct 2003 07:46:21 +0200 (CEST) (envelope-from morten) Date: Mon, 20 Oct 2003 07:46:21 +0200 From: Morten Rodal To: Daniel Eischen Message-ID: <20031020054621.GB39716@slurp.rodal.no> References: <20031019183129.GA94145@slurp.rodal.no> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Content-Scanned: with sophos and spamassassin at mailgw.ntnu.no. cc: threads@freebsd.org Subject: Re: libkse and bus error X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Oct 2003 05:46:24 -0000 On Sun, Oct 19, 2003 at 05:59:07PM -0400, Daniel Eischen wrote: > On Sun, 19 Oct 2003, Morten Rodal wrote: > > > The backtrace with debugging symbols is the dcgui-qt port (net/dc-gui) > > which seems to use pthread mutexes quite heavily. It only starts 1 > > out of 10 times. > > There is no dcgui-qt; is that net/dcgui, net/dctc-gui-qt, or something > else? 
>

I seem to have confused the actual binary name (dcgui-qt) with the port
name which is net/dcgui. Sorry about that.

--
Morten Rodal

From owner-freebsd-threads@FreeBSD.ORG Sun Oct 19 22:52:08 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E2EFF16A4B3; Sun, 19 Oct 2003 22:52:08 -0700 (PDT) Received: from signal.itea.ntnu.no (signal.itea.ntnu.no [129.241.190.231]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0B5CF43FBF; Sun, 19 Oct 2003 22:52:08 -0700 (PDT) (envelope-from morten@rodal.no) Received: from localhost (localhost [127.0.0.1]) by signal.itea.ntnu.no (Postfix) with ESMTP id 4673133656; Mon, 20 Oct 2003 07:52:07 +0200 (CEST) Received: from slurp.rodal.no (m200h.studby.ntnu.no [129.241.135.200]) by signal.itea.ntnu.no (Postfix) with ESMTP id 17CC3337E4; Mon, 20 Oct 2003 07:52:07 +0200 (CEST) Received: (from morten@localhost) by slurp.rodal.no (8.12.10/8.12.10/Submit) id h9K5q66t039993; Mon, 20 Oct 2003 07:52:06 +0200 (CEST) (envelope-from morten) Date: Mon, 20 Oct 2003 07:52:06 +0200 From: Morten Rodal To: deischen@freebsd.org Message-ID: <20031020055206.GC39716@slurp.rodal.no> References: <20031019183129.GA94145@slurp.rodal.no> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Content-Scanned: with sophos and spamassassin at mailgw.ntnu.no. cc: threads@freebsd.org Subject: Re: libkse and bus error X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Oct 2003 05:52:09 -0000

On Mon, Oct 20, 2003 at 12:30:11AM -0400, Daniel Eischen wrote:
> It seems to work OK here. I can start it; I'm not sure what to do with
> it once it's started, but it does start consistently.
>
> $ pkg_which dc-qt
> dctc-gui-qt-0.0.6
> $ uname -a
> FreeBSD sirius 5.1-CURRENT FreeBSD 5.1-CURRENT #10: Sun Oct 12 12:48:45 EDT 2003
> root@sirius:/usr/obj/opt/FreeBSD/src/src/sys/sirius i386
>

You probably flipped a coin as to which of the two dc*gui* ports to
install. At least from the name it seems as if this is the dctc-gui,
which I do not have. I picked the net/dcgui (which brings in net/dclib
where the actual threading class is).

>
> This is on an SMP system.
>

I too am running a SMP system. But since we have established a sort of
conclusion that I won't have KSE support fully working until:

a) nvidia fixes their driver
b) I get the X11 driver to work with my monitor

I think we can just forget about this problem and focus on other
things :)

--
Morten Rodal

From owner-freebsd-threads@FreeBSD.ORG Mon Oct 20 11:01:37 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 42FD816A4E0 for ; Mon, 20 Oct 2003 11:01:37 -0700 (PDT) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 44DC743FDD for ; Mon, 20 Oct 2003 11:01:24 -0700 (PDT) (envelope-from owner-bugmaster@freebsd.org) Received: from freefall.freebsd.org (peter@localhost [127.0.0.1]) by freefall.freebsd.org (8.12.9/8.12.9) with ESMTP id h9KI1OFY099025 for ; Mon, 20 Oct 2003 11:01:24 -0700 (PDT) (envelope-from owner-bugmaster@freebsd.org) Received: (from peter@localhost) by freefall.freebsd.org (8.12.9/8.12.9/Submit) id h9KI1Oj9099019 for freebsd-threads@freebsd.org; Mon, 20 Oct 2003 11:01:24 -0700 (PDT) (envelope-from owner-bugmaster@freebsd.org) Date: Mon, 20 Oct 2003 11:01:24 -0700 (PDT) Message-Id: <200310201801.h9KI1Oj9099019@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: peter set sender to owner-bugmaster@freebsd.org using -f From: FreeBSD bugmaster To: freebsd-threads@FreeBSD.org Subject: 
Current problem reports assigned to you X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Oct 2003 18:01:37 -0000

Current FreeBSD problem reports

Critical problems

S  Submitted   Tracker    Resp.   Description
-------------------------------------------------------------------------------
o [2000/06/13] kern/19247 threads uthread_sigaction.c does not do anything
o [2002/01/16] kern/33951 threads pthread_cancel is ignored

2 problems total.

Serious problems

S  Submitted   Tracker    Resp.   Description
-------------------------------------------------------------------------------
o [2000/07/18] kern/20016 threads pthreads: Cannot set scheduling timer/Can
o [2000/08/26] misc/20861 threads libc_r does not honor socket timeouts
o [2001/01/19] bin/24472  threads libc_r does not honor SO_SNDTIMEO/SO_RCVT
o [2001/01/25] bin/24632  threads libc_r delicate deviation from libc in ha
o [2001/01/25] misc/24641 threads pthread_rwlock_rdlock can deadlock
o [2001/04/02] bin/26307  threads libc_r aborts when using the KDE media pl
o [2001/10/31] bin/31661  threads pthread_kill signal handler doesn't get s
o [2001/11/26] bin/32295  threads pthread dont dequeue signals
o [2002/02/01] i386/34536 threads accept() blocks other threads
o [2002/03/07] bin/35622  threads sigaltstack is missing in libc_r
o [2002/05/25] kern/38549 threads the procces compiled whith pthread stoppe
o [2002/06/27] bin/39922  threads [PATCH?] Threaded applications executed w
o [2002/08/04] misc/41331 threads Pthread library open sets O_NONBLOCK flag
o [2002/10/10] kern/43887 threads abnormal CPU useage when use pthread_mute
o [2003/03/02] bin/48856  threads Setting SIGCHLD to SIG_IGN still leaves z
o [2003/03/10] bin/49087  threads Signals lost in programs linked with libc
a [2003/04/08] bin/50733  threads buildworld won't build, because of linkin
o [2003/05/07] bin/51949  threads thread in accept cannot be cancelled
o [2003/05/30] kern/52817 threads top(1) shows garbage for threaded process

19 problems total.

Non-critical problems

S  Submitted   Tracker    Resp.   Description
-------------------------------------------------------------------------------
o [2000/05/25] misc/18824 threads gethostbyname is not thread safe
o [2000/10/21] misc/22190 threads A threaded read(2) from a socketpair(2) f
o [2001/09/09] bin/30464  threads pthread mutex attributes -- pshared
o [2002/05/02] bin/37676  threads libc_r: msgsnd(), msgrcv(), pread(), pwri
o [2002/07/16] misc/40671 threads pthread_cancel doesn't remove thread from

5 problems total.
From owner-freebsd-threads@FreeBSD.ORG Wed Oct 22 14:51:31 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 85F0B16A4B3 for ; Wed, 22 Oct 2003 14:51:31 -0700 (PDT) Received: from phantom.cris.net (phantom.cris.net [212.110.130.74]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0232243FB1 for ; Wed, 22 Oct 2003 14:51:27 -0700 (PDT) (envelope-from phantom@FreeBSD.org.ua) Received: (from phantom@localhost) by phantom.cris.net (8.12.6/8.12.6) id h9MM0csb071555; Thu, 23 Oct 2003 01:00:38 +0300 (EEST) (envelope-from phantom) Date: Thu, 23 Oct 2003 01:00:38 +0300 From: Alexey Zelkin To: threads@freebsd.org Message-ID: <20031023010038.A71141@phantom.cris.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i X-Operating-System: FreeBSD 4.7-STABLE i386 Subject: libc_r & direct usage of syscalls X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Oct 2003 21:51:31 -0000

hi,

Some of you may remember a story about strange problems I had with
native jdk14 and fork() calls.

In a few words: sometimes, at seemingly random moments, the JVM becomes
unusable right after a call to fork() because of a SIGBUS signal storm
(the JVM signal handler decides that the signal is not fatal and does
not stop the application).

Today I tracked it down completely. Or, more correctly, I got a 100%
reproducible .java testcase and wrote a few more .c testcases in order
to prove my point of view.

The JVM internally uses the usual stack protection logic. Two pages on
the borders of every stack are protected with mmap(). When something
accesses them, SIGBUS is generated and the signal handler forces the
overflowing thread to roll back some operation until it may safely
continue its job.
fork() is a special case here. When fork() is called, the child process
needs to reinitialize libc_r's internal state (this job is done by the
fork() wrapper located in libc_r/uthread/uthread_fork.c). One of the
steps of the reinitialization process is free()'ing the pthread stacks.
The caveat here is that the protections on the stack pages are left
unchanged. Right after some stacks are free()'ed, malloc's internal
info (struct pginfo *) gets allocated in the still-protected region,
and as soon as this info is modified we get a big *KABOOM* (i.e.
SIGBUS).

The original code looked like:

[..]
pid = fork();
if (pid == 0) {
        make_pipes();
        close_descriptors();
        execvp();
}
[..]

The signal was raised exactly during fork() in all cases. I changed it
into:

[..]
pthread_suspend_all_np();
pid = __sys_fork();
if (pid == 0) {
        make_pipes();
        close_descriptors();
        execvp();
}
pthread_resume_all_np();
[..]

As far as I can see, I should not expect problems with libc_r on
-STABLE. But I am worried about -CURRENT (especially KSE): may such a
hack have side effects?

Comments and any input on potential problems are welcome!

PS: This description may be useful to somebody else who is affected by
the same scenario (stack protections + libc_r's fork()), so I provide
an example of a backtrace that signals the problem:

: #0 0xbfbfffa8 in ?? ()
: #1 0x280f7b1d in free (ptr=0x828d000)
: at /home/phantom/src/lib/libc_r/../libc/stdlib/malloc.c:1096
: #2 0x280b4b57 in _fork ()
: at /home/phantom/src/lib/libc_r/uthread/uthread_fork.c:154
[..]
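The failure mode described above (guard pages left protected when the memory goes back to malloc) can be illustrated with a minimal sketch. This is not the libc_r or JVM code. It assumes the guard pages are set up with mprotect() on memory obtained from posix_memalign(), which works on common systems although POSIX only specifies mprotect() for mmap()'ed memory. The names guarded_stack, guarded_stack_create and guarded_stack_destroy are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical descriptor: a heap block whose low pages act as guard pages. */
struct guarded_stack {
    void  *base;        /* page-aligned start of the whole block */
    size_t size;        /* total bytes, guard pages included */
    size_t guard_size;  /* bytes made inaccessible at the low end */
};

/* Allocate usable_pages + guard_pages and protect the guard pages. */
int guarded_stack_create(struct guarded_stack *gs,
                         size_t usable_pages, size_t guard_pages)
{
    assert(gs != NULL);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || usable_pages == 0)
        return -1;

    gs->guard_size = guard_pages * (size_t)page;
    gs->size = (usable_pages + guard_pages) * (size_t)page;
    gs->base = NULL;
    if (posix_memalign(&gs->base, (size_t)page, gs->size) != 0)
        return -1;

    /* Any access to the guard pages now raises SIGSEGV/SIGBUS. */
    if (gs->guard_size != 0 &&
        mprotect(gs->base, gs->guard_size, PROT_NONE) != 0) {
        free(gs->base);
        gs->base = NULL;
        return -1;
    }
    return 0;
}

/*
 * Release the block. The protection is restored BEFORE free(), because
 * the allocator may write its own bookkeeping into the freed pages.
 * Skipping this step reproduces the kind of crash described above.
 */
int guarded_stack_destroy(struct guarded_stack *gs)
{
    assert(gs != NULL);
    if (gs->guard_size != 0 &&
        mprotect(gs->base, gs->guard_size, PROT_READ | PROT_WRITE) != 0)
        return -1;
    free(gs->base);
    gs->base = NULL;
    return 0;
}
```

A caller would pair guarded_stack_create(&gs, 4, 2) with guarded_stack_destroy(&gs). The point of the sketch is only the ordering in the destroy path: whoever changed the page protections has to undo them before the allocator sees the memory again.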
From owner-freebsd-threads@FreeBSD.ORG Wed Oct 22 19:06:12 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E5CAB16A4B3; Wed, 22 Oct 2003 19:06:12 -0700 (PDT) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7257B43FB1; Wed, 22 Oct 2003 19:06:11 -0700 (PDT) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id h9N26ABR001393; Wed, 22 Oct 2003 22:06:10 -0400 (EDT) Date: Wed, 22 Oct 2003 22:06:10 -0400 (EDT) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Alexey Zelkin In-Reply-To: <20031023010038.A71141@phantom.cris.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: threads@freebsd.org Subject: Re: libc_r & direct usage of syscalls X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Oct 2003 02:06:13 -0000

On Thu, 23 Oct 2003, Alexey Zelkin wrote:

> hi,
>
> Some of you may remember a story about strange problems I had
> with native jdk14 and fork() calls.
>
> In few words -- sometimes, in absolutely random order JVM just after
> call to fork() function become unusable due to SIGBUS signal storm
> (JVM signal handler decided that this signal is not fatal and did not
> stop an application).
>
> Today I have completely tracked it down. Or correctly to say
> got a 100% reproducible .java testcase and wrote few more .c testcases in
> order to prove my point of view.
>
> JVM is using internally usual stack protection logic. Every two pages on
> borders of stack are protected with mmap(). When something accesses
> it SIGBUS is generated and signal handler forces overflowing thread
> to rollback some operation until it may safely continue its job.
>
> fork() is special case here. When fork() is called, child process
> is need to reinitialize a libc_r internal state (this job is done by
> fork() wrapper located in libc_r/uthread/uthread_fork.c). One of steps
> of reinitialization process is free()'ing of pthreads stacks. Caveat here
> is unchanged protections on stack pages. Right after some stacks are
> free()'ed, malloc internal (struct pginfo *) info got allocated into
> protected region and this info being changed we get a big *KABOOM* (i.e.
> SIGBUS).

Here's what POSIX has to say about fork() and threaded processes:

  A process shall be created with a single thread. If a multi-threaded
  process calls fork(), the new process shall contain a replica of the
  calling thread and its entire address space, possibly including the
  states of mutexes and other resources. Consequently, to avoid errors,
  the child process may only execute async-signal-safe operations until
  such time as one of the exec functions is called. [THR] Fork handlers
  may be established by means of the pthread_atfork() function in order
  to maintain application invariants across fork() calls.

  When the application calls fork() from a signal handler and any of
  the fork handlers registered by pthread_atfork() calls a function
  that is not asynch-signal-safe, the behavior is undefined.

Libkse currently doesn't do any reinitialization of internal library
state (libc _or_ libkse) on a fork(). You cannot rely on libc state
(malloc state, e.g.) or libkse state after a fork().

For what purpose is fork() being used by the JVM?

--
Dan Eischen
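The rule quoted from POSIX above can be illustrated with a minimal sketch of a fork-then-exec helper whose child performs only async-signal-safe calls. It is not taken from the JVM or from any thread library. The helper name spawn_and_wait is hypothetical, and execve() is used because it is on the POSIX async-signal-safe list.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper: run argv[0] in a child process and return its
 * exit status, or -1 if fork()/waitpid() fails or the child was killed
 * by a signal.
 */
int spawn_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /*
         * Child of a possibly multi-threaded parent: no malloc(), no
         * stdio, no mutexes here. Only async-signal-safe calls are
         * made until exec replaces the process image.
         */
        execve(argv[0], argv, environ);
        _exit(127);  /* exec failed; _exit() is async-signal-safe */
    }

    /* Parent: reap the child, retrying if a signal interrupts the wait. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Any setup the child needs between fork() and exec (dup2(), close(), and so on) should stay within the async-signal-safe set as well. Work that needs library state is better done before fork() or in the exec'ed program.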