From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 00:12:46 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP
    id 4EE2916A4CE; Mon, 21 Jun 2004 00:12:46 +0000 (GMT)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
    by mx1.FreeBSD.org (Postfix) with ESMTP
    id B898443D5A; Mon, 21 Jun 2004 00:12:41 +0000 (GMT)
    (envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (localhost [127.0.0.1])
    by fledge.watson.org (8.12.11/8.12.11) with ESMTP id i5L0ANRM018858;
    Sun, 20 Jun 2004 20:10:23 -0400 (EDT)
    (envelope-from robert@fledge.watson.org)
Received: from localhost (robert@localhost)i5L0ANDY018855;
    Sun, 20 Jun 2004 20:10:23 -0400 (EDT)
    (envelope-from robert@fledge.watson.org)
Date: Sun, 20 Jun 2004 20:10:23 -0400 (EDT)
From: Robert Watson
X-Sender: robert@fledge.watson.org
To: threads@FreeBSD.org
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: current@FreeBSD.org
Subject: calcru: negative time ... followed by freeze
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2004 00:12:46 -0000

I've now seen the following scenario happen several times in the last few
days while doing testing and benchmarking: I run a multi-threaded test,
such as super-smack, that causes a moderately high system load. I then
hit Ctrl-T or run top, or some other activity that forces calcru() to
execute. I will not infrequently get an extremely hard hang -- can't get
to DDB using serial break, etc. I don't remember it happening when using
non-threaded apps, so I'm wondering if there's a poor interaction with
KSE/scheduler/who knows what.

 7:55PM  up 6 mins, 2 users, load averages: 1.37, 0.91, 0.43
USER     TTY   FROM    LOGIN@  IDLE  WHAT
root     d0    -       7:55PM     -  w
rwatson  p0    cboss   7:50PM     2  super-smack select-key
hippy# top
calcru: negative time of 1834075 usec for pid 654 (super-smack)
ca~~

In this case, I ran super-smack with the following parameters:

hippy:/usr/tmp/super-smack> super-smack select-key.smack 15 1000

This generates 15 workers, which should cause mysql to spawn off threads
as well. I'm running with stock libpthread on this system (slightly old)
but an up-to-date kernel from CVS, GENERIC.

Has anyone else seen this?

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 00:15:20 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP
    id 4B08316A4CE; Mon, 21 Jun 2004 00:15:20 +0000 (GMT)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
    by mx1.FreeBSD.org (Postfix) with ESMTP
    id D2F0943D54; Mon, 21 Jun 2004 00:15:17 +0000 (GMT)
    (envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (localhost [127.0.0.1])
    by fledge.watson.org (8.12.11/8.12.11) with ESMTP id i5L0ClMO018887;
    Sun, 20 Jun 2004 20:12:47 -0400 (EDT)
    (envelope-from robert@fledge.watson.org)
Received: from localhost (robert@localhost)i5L0Ck0j018884;
    Sun, 20 Jun 2004 20:12:47 -0400 (EDT)
    (envelope-from robert@fledge.watson.org)
Date: Sun, 20 Jun 2004 20:12:46 -0400 (EDT)
From: Robert Watson
X-Sender: robert@fledge.watson.org
To: threads@FreeBSD.org
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: current@FreeBSD.org
Subject: Re: calcru: negative time ... followed by freeze
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2004 00:15:20 -0000

FYI, this is a Xeon box with two physical processors, each with two
logical processors, and the problem could well be SMP-related.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research

On Sun, 20 Jun 2004, Robert Watson wrote:

>
> I've now seen the following scenario happen several times in the last few
> days while doing testing and benchmarking: I run a multi-threaded test,
> such as super-smack, that causes a moderately high system load. I then
> hit Ctrl-T or run top, or some other activity that forces calcru() to
> execute. I will not infrequently get an extremely hard hang -- can't get
> to DDB using serial break, etc. I don't remember it happening when using
> non-threaded apps, so I'm wondering if there's a poor interaction with
> KSE/scheduler/who knows what.
>
>  7:55PM  up 6 mins, 2 users, load averages: 1.37, 0.91, 0.43
> USER     TTY   FROM    LOGIN@  IDLE  WHAT
> root     d0    -       7:55PM     -  w
> rwatson  p0    cboss   7:50PM     2  super-smack select-key
> hippy# top
> calcru: negative time of 1834075 usec for pid 654 (super-smack)
> ca~~
>
> In this case, I ran super-smack with the following parameters:
>
> hippy:/usr/tmp/super-smack> super-smack select-key.smack 15 1000
>
> This generates 15 workers, which should cause mysql to spawn off threads
> as well. I'm running with stock libpthread on this system (slightly old)
> but an up-to-date kernel from CVS, GENERIC.
>
> Has anyone else seen this?
>
> Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> robert@fledge.watson.org      Senior Research Scientist, McAfee Research
>
>

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 02:58:38 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP
    id CBCC416A4CE; Mon, 21 Jun 2004 02:58:38 +0000 (GMT)
Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163])
    by mx1.FreeBSD.org (Postfix) with ESMTP
    id 58AC543D1D; Mon, 21 Jun 2004 02:58:38 +0000 (GMT)
    (envelope-from truckman@FreeBSD.org)
Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2])
    by gw.catspoiler.org (8.12.11/8.12.11) with ESMTP id i5L2wTKF047144;
    Sun, 20 Jun 2004 19:58:33 -0700 (PDT)
    (envelope-from truckman@FreeBSD.org)
Message-Id: <200406210258.i5L2wTKF047144@gw.catspoiler.org>
Date: Sun, 20 Jun 2004 19:58:29 -0700 (PDT)
From: Don Lewis
To: rwatson@FreeBSD.org
In-Reply-To:
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
cc: threads@FreeBSD.org
cc: current@FreeBSD.org
Subject: Re: calcru: negative time ... followed by freeze
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2004 02:58:38 -0000

On 20 Jun, Robert Watson wrote:
> FYI, this is a Xeon box with two physical processors, each with two
> logical processors, and the problem could well be SMP-related.

The hang might be SMP-related, but I just started getting the calcru
messages on my UP Athlon XP box. I just upgraded to today's
-CURRENT from a week-old version, and my console got spammed by a bunch
of these messages while I was running portupgrade.

calcru: negative time of 0 usec for pid 21261 (sh)
calcru: negative time of 0 usec for pid 21261 (sh)
calcru: negative time of 0 usec for pid 22260 (sh)
calcru: negative time of 0 usec for pid 22260 (sh)
calcru: negative time of 0 usec for pid 22560 (sh)
calcru: negative time of 0 usec for pid 22560 (sh)
calcru: negative time of 0 usec for pid 29257 (sh)
calcru: negative time of 0 usec for pid 29257 (sh)
calcru: negative time of 0 usec for pid 45341 (sh)
calcru: negative time of 0 usec for pid 45341 (sh)
calcru: negative time of 3917 usec for pid 49504 (sh)
calcru: negative time of 3917 usec for pid 49504 (sh)
calcru: negative time of 0 usec for pid 55558 (sh)
calcru: negative time of 0 usec for pid 55558 (sh)
calcru: negative time of 0 usec for pid 60591 (sh)
calcru: negative time of 0 usec for pid 60591 (sh)
calcru: negative time of 0 usec for pid 62769 (sh)
calcru: negative time of 0 usec for pid 62769 (sh)
calcru: negative time of 0 usec for pid 75079 (sh)
calcru: negative time of 0 usec for pid 75079 (sh)
calcru: negative time of 0 usec for pid 83060 (sh)
calcru: negative time of 0 usec for pid 83060 (sh)
calcru: negative time of 0 usec for pid 85556 (sh)
calcru: negative time of 0 usec for pid 85556 (sh)
calcru: negative time of 0 usec for pid 94309 (sh)
calcru: negative time of 0 usec for pid 94309 (sh)
calcru: negative time of 0 usec for pid 13370 (sh)
calcru: negative time of 0 usec for pid 13370 (sh)
calcru: negative time of 0 usec for pid 27636 (sh)
calcru: negative time of 0 usec for pid 27636 (sh)
calcru: negative time of 4211 usec for pid 36727 (sh)
calcru: negative time of 4211 usec for pid 36727 (sh)
calcru: negative time of 0 usec for pid 40010 (sh)
calcru: negative time of 0 usec for pid 40010 (sh)
calcru: negative time of 0 usec for pid 54561 (sh)
calcru: negative time of 0 usec for pid 54561 (sh)
calcru: negative time of 9398 usec for pid 59554 (sed)
calcru: negative time of 9398 usec for pid 59554 (sed)
calcru: negative time of 4094 usec for pid 60986 (sh)
calcru: negative time of 4094 usec for pid 60986 (sh)
calcru: negative time of 0 usec for pid 61839 (sh)
calcru: negative time of 0 usec for pid 61839 (sh)
calcru: negative time of 4154 usec for pid 66500 (sh)
calcru: negative time of 4154 usec for pid 66500 (sh)
calcru: negative time of 0 usec for pid 70950 (sh)
calcru: negative time of 0 usec for pid 70950 (sh)
calcru: negative time of 4175 usec for pid 88089 (sh)
calcru: negative time of 4175 usec for pid 88089 (sh)

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 03:30:11 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id A1F7116A4D0
    for ; Mon, 21 Jun 2004 03:30:11 +0000 (GMT)
Received: from mail.FreeBSD.org.cn (dns3.freebsd.org.cn [61.129.66.75])
    by mx1.FreeBSD.org (Postfix) with ESMTP id DBE2643D1D
    for ; Mon, 21 Jun 2004 03:30:09 +0000 (GMT)
    (envelope-from delphij@frontfree.net)
Received: (qmail 8961 invoked by uid 0); 21 Jun 2004 03:29:16 -0000
Received: from unknown (HELO beastie.frontfree.net) (218.107.145.7)
    by mail.FreeBSD.org.cn with AES256-SHA encrypted SMTP;
    21 Jun 2004 03:29:16 -0000
Received: from localhost (localhost.frontfree.net [127.0.0.1])
    by beastie.frontfree.net (Postfix) with ESMTP
    id 7542211509; Mon, 21 Jun 2004 11:29:42 +0800 (CST)
Received: from beastie.frontfree.net ([127.0.0.1])
    by localhost (beastie.frontfree.net [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id 01728-01; Mon, 21 Jun 2004 11:29:41 +0800 (CST)
Received: by beastie.frontfree.net (Postfix, from userid 1001)
    id 3163111499; Mon, 21 Jun 2004 11:29:39 +0800 (CST)
Date: Mon, 21 Jun 2004 11:29:39 +0800
From: Xin LI
To: Robert Watson
Message-ID: <20040621032939.GA1909@frontfree.net>
References:
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
    protocol="application/pgp-signature"; boundary="4Ckj6UjgE2iN1+kY"
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.4.2.1i
X-GPG-key-ID/Fingerprint: 0xCAEEB8C0 /
    43B8 B703 B8DD 0231 B333 DC28 39FB 93A0 CAEE B8C0
X-GPG-Public-Key: http://www.delphij.net/delphij.asc
X-Operating-System: FreeBSD beastie.frontfree.net 5.2-delphij
    FreeBSD 5.2-delphij #77: Sun Jun 20 21:58:10 CST 2004
    root@:/usr/obj/usr/src/sys/BEASTIE i386
X-URL: http://www.delphij.net
X-By: delphij@beastie.frontfree.net
X-Location: Beijing, China
X-Virus-Scanned: by amavisd-new at frontfree.net
cc: threads@FreeBSD.org
cc: current@FreeBSD.org
Subject: Re: calcru: negative time ... followed by freeze
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2004 03:30:11 -0000

--4Ckj6UjgE2iN1+kY
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Sun, Jun 20, 2004 at 08:10:23PM -0400, Robert Watson wrote:
>
> I've now seen the following scenario happen several times in the last few
> days while doing testing and benchmarking: I run a multi-threaded test,
> such as super-smack, that causes a moderately high system load. I then
> hit Ctrl-T or run top, or some other activity that forces calcru() to
> execute. I will not infrequently get an extremely hard hang -- can't get
> to DDB using serial break, etc. I don't remember it happening when using
> non-threaded apps, so I'm wondering if there's a poor interaction with
> KSE/scheduler/who knows what.
>
>  7:55PM  up 6 mins, 2 users, load averages: 1.37, 0.91, 0.43
> USER     TTY   FROM    LOGIN@  IDLE  WHAT
> root     d0    -       7:55PM     -  w
> rwatson  p0    cboss   7:50PM     2  super-smack select-key
> hippy# top
> calcru: negative time of 1834075 usec for pid 654 (super-smack)
> ca~~
>
> In this case, I ran super-smack with the following parameters:
>
> hippy:/usr/tmp/super-smack> super-smack select-key.smack 15 1000
>
> This generates 15 workers, which should cause mysql to spawn off threads
> as well. I'm running with stock libpthread on this system (slightly old)
> but an up-to-date kernel from CVS, GENERIC.
>
> Has anyone else seen this?

I saw "calcru: negative time" after booting my system with a freshly
built kernel and world:

%uname -a
FreeBSD beastie.frontfree.net 5.2-delphij FreeBSD 5.2-delphij #77: Sun Jun 20 21:58:10 CST 2004 root@:/usr/obj/usr/src/sys/BEASTIE i386

I have some local kernel modifications, which include an RFC3522
implementation ported from DragonFlyBSD (kern/68110 and some further
patches), some filesystem modifications (bin/61981) and a PID allocation
algorithm ported from NetBSD. However, I believe these changes do not
contribute to this situation.

Additionally, I have noticed that my system would silently freeze when
encountering heavy load with HTT enabled. My CPU is a Pentium4 2.8-E. It
seems that I am not the only one who has problems with the P4-2.8E; I
think this should be taken into consideration, too.

Cheers,
-- 
Xin LI http://www.delphij.net/
See complete headers for GPG key and other information.

--4Ckj6UjgE2iN1+kY
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (FreeBSD)

iD8DBQFA1lYjOfuToMruuMARAtCMAJ45AgQhaUzJJRUp/xF4RnblnxCg7QCggUf8
Caaw2fIIIMypkleqVqNXhRI=
=ZfdD
-----END PGP SIGNATURE-----

--4Ckj6UjgE2iN1+kY--

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 05:01:42 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP
    id A518116A4CE; Mon, 21 Jun 2004 05:01:42 +0000 (GMT)
Received: from mailout2.pacific.net.au (mailout2.pacific.net.au [61.8.0.85])
    by mx1.FreeBSD.org (Postfix) with ESMTP
    id 2962C43D58; Mon, 21 Jun 2004 05:01:42 +0000 (GMT)
    (envelope-from bde@zeta.org.au)
Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.0.87])i5L51e5v008152;
    Mon, 21 Jun 2004 15:01:41 +1000
Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) i5L51cnl031991;
    Mon, 21 Jun 2004 15:01:39 +1000
Date: Mon, 21 Jun 2004 15:01:38 +1000 (EST)
From: Bruce Evans
X-X-Sender: bde@gamplex.bde.org
To: Don Lewis
In-Reply-To: <200406210258.i5L2wTKF047144@gw.catspoiler.org>
Message-ID: <20040621132119.Q8596@gamplex.bde.org>
References: <200406210258.i5L2wTKF047144@gw.catspoiler.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: threads@freebsd.org
cc: rwatson@freebsd.org
cc: current@freebsd.org
Subject: Re: calcru: negative time ... followed by freeze
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2004 05:01:42 -0000

On Sun, 20 Jun 2004, Don Lewis wrote:

> On 20 Jun, Robert Watson wrote:
> > FYI, this is a Xeon box with two physical processors, each with two
> > logical processors, and the problem could well be SMP-related.
>
> The hang might be SMP-related, but I just started getting the calcru
> messages on my UP Athlon XP box. I just upgraded to today's
> -CURRENT from a week-old version, and my console got spammed by a bunch
> of these messages while I was running portupgrade.
>
> calcru: negative time of 0 usec for pid 21261 (sh)
> calcru: negative time of 0 usec for pid 21261 (sh)
> ...
> calcru: negative time of 3917 usec for pid 49504 (sh)
> calcru: negative time of 3917 usec for pid 49504 (sh)

Hmm, these are nonnegative times.

The extra messages in -current might be caused by fill_kinfo() now calling
calcru().

Misreporting of negative times is by the following too-simple diagnostic:

%     tu = (u_int64_t)tv.tv_sec * 1000000 + tv.tv_usec;
%     ptu = p->p_uu + p->p_su + p->p_iu;
%     if (tu < ptu || (int64_t)tu < 0) {
%         printf("calcru: negative time of %jd usec for pid %d (%s)\n",
%             (intmax_t)tu, p->p_pid, p->p_comm);
%         tu = ptu;
%     }

The message is also printed for the tu < ptu case, which is what you are
getting.

I fixed the messages when I got a lot of them due to a local bug. The local
bug was from double rounding for calcru() on child times (which -current
doesn't do). I can't see how this could be the problem in -current, since
the components of ptu are rounded down and there is a KASSERT that they
added up to less than tu in the previous call.

Ah, here is a likely cause of the bug in -current:

%     if (p == curthread->td_proc) {
%         /*
%          * Adjust for the current time slice. This is actually fairly
%          * important since the error here is on the order of a time
%          * quantum, which is much greater than the sampling error.
%          * XXXKSE use a different test due to threads on other
%          * processors also being 'current'.
%          */
%         binuptime(&bt);
%         bintime_sub(&bt, PCPU_PTR(switchtime));
%         bintime_add(&bt, &p->p_runtime);
%     } else
%         bt = p->p_runtime;

The XXXKSE comment is correct that this might be broken. If the (p
!= curthread->td_proc) case happens at all for a running process, then
it gives a wrong (out of date) timestamp in bt. This wrongness will
be detected if calcru() was called earlier in the current
timeslice and took the other path here.

The recent change to fill_kinfo() is quite likely to trigger detection
of this bug. fill_kinfo() is often used to iterate over all processes
for ps, so it will call calcru() with (p != curthread->td_proc) for
all processes other than the one running it, and give a bt that is out
of date for all such processes that are actually running. Since there
can be at most one running process per CPU, this bug only affects SMP.

The call to calcru() from ttyinfo() may be the only other trigger.
ttyinfo() picks a process and should rarely or never pick the ithread
running it, so it will almost always take the (p != curthread->td_proc)
path. Again, this is only a problem for the SMP case since in the !SMP
case the picked process must have been switched away from to run the
ithread, so it cannot be running.

Bruce

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 05:11:20 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 1C24B16A4CE
    for ; Mon, 21 Jun 2004 05:11:20 +0000 (GMT)
Received: from mail.mcneil.com (rrcs-west-24-199-45-54.biz.rr.com [24.199.45.54])
    by mx1.FreeBSD.org (Postfix) with ESMTP id E4F5B43D58
    for ; Mon, 21 Jun 2004 05:11:19 +0000 (GMT)
    (envelope-from sean@mcneil.com)
Received: from localhost (localhost.mcneil.com [127.0.0.1])
    by mail.mcneil.com (Postfix) with ESMTP id 43B73FD067
    for ; Sun, 20 Jun 2004 22:11:19 -0700 (PDT)
Received: from mail.mcneil.com ([127.0.0.1])
    by localhost (server.mcneil.com [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id 25599-06 for ; Sun, 20 Jun 2004 22:11:18 -0700 (PDT)
Received: from [24.199.45.54] (mcneil.com [24.199.45.54])
    by mail.mcneil.com (Postfix) with ESMTP id 53DAEFD03A
    for ; Sun, 20 Jun 2004 22:11:18 -0700 (PDT)
From: Sean McNeil
To: freebsd-threads@freebsd.org
Content-Type: text/plain
Message-Id: <1087794678.46146.4.camel@server.mcneil.com>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6
Date: Sun, 20 Jun 2004 22:11:18 -0700
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: by amavisd-new at mcneil.com
Subject: kill(pid,0) sends a signal or not?
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2004 05:11:20 -0000

I'm trying to trace down an issue with kse threads and firefox. There is
an odd "trick" I haven't seen before:

    // kill(pid,0) is a neat trick to check if a
    // process exists
    if (kill(pid, 0) == 0 || errno != ESRCH)

Does this really work? It is kind of odd that I appear to get a signal
(if the traceback is accurate) with the signal set to 0:

#10 0x0000000202bc7a80 in thr_resume_wrapper (sig=0, siginfo=0x4,
    ucp=0x7fffffffd4c0) at /usr/src/lib/libpthread/thread/thr_kern.c:1112

This later causes a sig 11 and the program core dumps.

Any info on how threads are supposed to behave when a process does a
kill(pid,0) would be greatly appreciated.

Cheers,
Sean

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 05:29:03 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 0104416A4CE
    for ; Mon, 21 Jun 2004 05:29:03 +0000 (GMT)
Received: from et.endace.com (et.endace.com [219.88.101.154])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 3EEC643D46
    for ; Mon, 21 Jun 2004 05:29:02 +0000 (GMT)
    (envelope-from koryn@endace.com)
Received: from prefect.et.endace.com (prefect.et.endace.com [192.168.64.24])
    by et.endace.com (8.12.11/8.12.11) with ESMTP id i5L5T0DV008317
    (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NOT);
    Mon, 21 Jun 2004 17:29:00 +1200 (NZST)
Date: Mon, 21 Jun 2004 17:26:30 +1200 (NZST)
From: Koryn Grant
X-X-Sender: koryn@prefect.et.endace.com
To: Sean McNeil
In-Reply-To: <1087794678.46146.4.camel@server.mcneil.com>
Message-ID:
References: <1087794678.46146.4.camel@server.mcneil.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Scanned-By: milter-gris/0.1.14 (et.endace.com [192.168.64.254]);
    Mon, 21 Jun 2004 17:29:00 +1200
X-Virus-Scanned: clamd / ClamAV version devel-20040611,
    clamav-milter version 0.72a on et.endace.com
X-Virus-Status: Clean
cc: freebsd-threads@freebsd.org
Subject: Re: kill(pid,0) sends a signal or not?
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2004 05:29:03 -0000

On Sun, 20 Jun 2004, Sean McNeil wrote:

> Any info on how threads are supposed to behave when a process does a
> kill(pid,0) would be greatly appreciated.

The Single Unix Specification speaks thusly about kill():

"If sig is 0 (the null signal), error checking is performed but no signal is
actually sent. The null signal can be used to check the validity of pid."

Cheers,
Koryn

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 05:41:22 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id CEDC016A4CE
    for ; Mon, 21 Jun 2004 05:41:22 +0000 (GMT)
Received: from mail.mcneil.com (rrcs-west-24-199-45-54.biz.rr.com [24.199.45.54])
    by mx1.FreeBSD.org (Postfix) with ESMTP id BEEB643D54
    for ; Mon, 21 Jun 2004 05:41:22 +0000 (GMT)
    (envelope-from sean@mcneil.com)
Received: from localhost (localhost.mcneil.com [127.0.0.1])
    by mail.mcneil.com (Postfix) with ESMTP
    id 28EFDFD067; Sun, 20 Jun 2004 22:41:22 -0700 (PDT)
Received: from mail.mcneil.com ([127.0.0.1])
    by localhost (server.mcneil.com [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id 25599-07; Sun, 20 Jun 2004 22:41:21 -0700 (PDT)
Received: from [24.199.45.54] (mcneil.com [24.199.45.54])
    by mail.mcneil.com (Postfix) with ESMTP
    id A5DF2FD04C; Sun, 20 Jun 2004 22:41:21 -0700 (PDT)
From: Sean McNeil
To: Koryn Grant
In-Reply-To:
References: <1087794678.46146.4.camel@server.mcneil.com>
Content-Type: text/plain
Message-Id: <1087796481.46307.1.camel@server.mcneil.com>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6
Date: Sun, 20 Jun 2004 22:41:21 -0700
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: by amavisd-new at mcneil.com
cc: freebsd-threads@freebsd.org
Subject: Re: kill(pid,0) sends a signal or not?
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2004 05:41:22 -0000

On Sun, 2004-06-20 at 22:26, Koryn Grant wrote:
> On Sun, 20 Jun 2004, Sean McNeil wrote:
>
> > Any info on how threads are supposed to behave when a process does a
> > kill(pid,0) would be greatly appreciated.
>
> The Single Unix Specification speaks thusly about kill():
>
> "If sig is 0 (the null signal), error checking is performed but no signal is
> actually sent. The null signal can be used to check the validity of pid."

Thanks, Koryn. It looks like what I'm seeing with the 0 value is how
kse/pthread is waking up another thread through signalcontext. All looks
legit. I thought that it was getting there from a kill().

Cheers,
Sean

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 07:02:17 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP
    id 6D94016A4CE; Mon, 21 Jun 2004 07:02:17 +0000 (GMT)
Received: from www.mmlab.cse.yzu.edu.tw (www.mmlab.cse.yzu.edu.tw [140.138.145.166])
    by mx1.FreeBSD.org (Postfix) with ESMTP
    id 3719943D58; Mon, 21 Jun 2004 07:02:17 +0000 (GMT)
    (envelope-from avatar@mmlab.cse.yzu.edu.tw)
Received: by www.mmlab.cse.yzu.edu.tw (qmail, from userid 1000)
    id 5689E4EFCD9; Mon, 21 Jun 2004 15:02:02 +0800 (CST)
Received: from localhost (localhost [127.0.0.1])
    by www.mmlab.cse.yzu.edu.tw (qmail) with ESMTP
    id 510DB4EFCD3; Mon, 21 Jun 2004 15:02:02 +0800 (CST)
Date: Mon, 21 Jun 2004 15:02:02 +0800 (CST)
From: Tai-hwa Liang
To: Don Lewis
In-Reply-To: <200406210258.i5L2wTKF047144@gw.catspoiler.org>
Message-ID: <040621144707D.31719@www.mmlab.cse.yzu.edu.tw>
References: <200406210258.i5L2wTKF047144@gw.catspoiler.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: threads@FreeBSD.org
cc: rwatson@FreeBSD.org
cc: current@FreeBSD.org
Subject: Re: calcru: negative time ... followed by freeze
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2004 07:02:17 -0000

On Sun, 20 Jun 2004, Don Lewis wrote:
> On 20 Jun, Robert Watson wrote:
> > FYI, this is a Xeon box with two physical processors, each with two
> > logical processors, and the problem could well be SMP-related.
>
> The hang might be SMP-related, but I just started getting the calcru
> messages on my UP Athlon XP box. I just upgraded to today's
> -CURRENT from a week-old version, and my console got spammed by a bunch
> of these messages while I was running portupgrade.

Same here, though it's not an SMP box (Pentium 4 2.53GHz with HTT enabled,
but only one CPU in hw.ncpu, no "options SMP" in kernel configuration).

After installing the latest kernel (cvsup'ed about 5 hours ago) and
rebooting, a lot of "calcru: negative time of...." messages popped up on
the console:

calcru: negative time of 3419 usec for pid 557 (sh)
[... repeated 230+ times]
calcru: negative time of 3419 usec for pid 557 (sh)
calcru: negative time of 3449 usec for pid 7226 (sh)
[... repeated 70 times]
calcru: negative time of 3449 usec for pid 7226 (sh)

Last known working kernel was cvsup'ed on Jun-17-2004.

It turns out that the system doesn't freeze and still works (able to
build mozilla firefox incrementally without a crash) at this moment.

>
> calcru: negative time of 0 usec for pid 21261 (sh)
> calcru: negative time of 0 usec for pid 21261 (sh)
[...]

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 07:44:37 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP
    id 0B5A916A4CE; Mon, 21 Jun 2004 07:44:37 +0000 (GMT)
Received: from rwcrmhc12.comcast.net (rwcrmhc12.comcast.net [216.148.227.85])
    by mx1.FreeBSD.org (Postfix) with ESMTP
    id 5E10C43D55; Mon, 21 Jun 2004 07:44:35 +0000 (GMT)
    (envelope-from julian@elischer.org)
Received: from interjet.elischer.org ([24.7.73.28])
    by comcast.net (rwcrmhc12) with ESMTP
    id <200406210744120140020tn8e>; Mon, 21 Jun 2004 07:44:12 +0000
Received: from localhost (localhost.elischer.org [127.0.0.1])
    by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id AAA30764;
    Mon, 21 Jun 2004 00:44:10 -0700 (PDT)
Date: Mon, 21 Jun 2004 00:44:09 -0700 (PDT)
From: Julian Elischer
To: Bruce Evans
In-Reply-To: <20040621132119.Q8596@gamplex.bde.org>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: threads@freebsd.org
cc: Don Lewis
cc: rwatson@freebsd.org
cc: current@freebsd.org
Subject: Re: calcru: negative time ... followed by freeze
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2004 07:44:37 -0000

On Mon, 21 Jun 2004, Bruce Evans wrote:

> Ah, here is a likely cause of the bug in -current:
>
> %     if (p == curthread->td_proc) {
> %         /*
> %          * Adjust for the current time slice. This is actually fairly
> %          * important since the error here is on the order of a time
> %          * quantum, which is much greater than the sampling error.
> %          * XXXKSE use a different test due to threads on other
> %          * processors also being 'current'.
> %          */
> %         binuptime(&bt);
> %         bintime_sub(&bt, PCPU_PTR(switchtime));
> %         bintime_add(&bt, &p->p_runtime);
> %     } else
> %         bt = p->p_runtime;
>
> The XXXKSE comment is correct that this might be broken. If the (p
> != curthread->td_proc) case happens at all for a running process, then
> it gives a wrong (out of date) timestamp in bt. This wrongness will
> be detected if calcru() was called earlier in the current
> timeslice and took the other path here.

It should be fairly easy as there is now a thread state that indicates
that it is actually running now.

>
> The recent change to fill_kinfo() is quite likely to trigger detection
> of this bug. fill_kinfo() is often used to iterate over all processes
> for ps, so it will call calcru() with (p != curthread->td_proc) for
> all processes other than the one running it, and give a bt that is out
> of date for all such processes that are actually running. Since there
> can be at most one running process per CPU, this bug only affects SMP.
>
> The call to calcru() from ttyinfo() may be the only other trigger.
> ttyinfo() picks a process and should rarely or never pick the ithread
> running it, so it will almost always take the (p != curthread->td_proc)
> path. Again, this is only a problem for the SMP case since in the !SMP
> case the picked process must have been switched away from to run the
> ithread, so it cannot be running.
>
> Bruce
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 08:01:48 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP
    id AE8A216A4E0; Mon, 21 Jun 2004 08:01:48 +0000 (GMT)
Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163])
    by mx1.FreeBSD.org (Postfix) with ESMTP
    id 453E443D2F; Mon, 21 Jun 2004 08:01:48 +0000 (GMT)
    (envelope-from truckman@FreeBSD.org)
Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2])
    by gw.catspoiler.org (8.12.11/8.12.11) with ESMTP id i5L81K5u047553;
    Mon, 21 Jun 2004 01:01:25 -0700 (PDT)
    (envelope-from truckman@FreeBSD.org)
Message-Id: <200406210801.i5L81K5u047553@gw.catspoiler.org>
Date: Mon, 21 Jun 2004 01:01:20 -0700 (PDT)
From: Don Lewis
To: bde@zeta.org.au
In-Reply-To: <20040621132119.Q8596@gamplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
cc: threads@FreeBSD.org
cc: rwatson@FreeBSD.org
cc: current@FreeBSD.org
Subject: Re: calcru: negative time ... followed by freeze
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2004 08:01:49 -0000

On 21 Jun, Bruce Evans wrote:
> On Sun, 20 Jun 2004, Don Lewis wrote:
>
>> On 20 Jun, Robert Watson wrote:
>> > FYI, this is a Xeon box with two physical processors, each with two
>> > logical processors, and the problem could well be SMP-related.
>>
>> The hang might be SMP-related, but I just started getting the calcru
>> messages on my UP Athlon XP box. I just upgraded to today's
>> -CURRENT from a week-old version, and my console got spammed by a bunch
>> of these messages while I was running portupgrade.
>>
>> calcru: negative time of 0 usec for pid 21261 (sh)
>> calcru: negative time of 0 usec for pid 21261 (sh)
>> ...
>> calcru: negative time of 3917 usec for pid 49504 (sh)
>> calcru: negative time of 3917 usec for pid 49504 (sh)
>
> Hmm, these are nonnegative times.
>
> The extra messages in -current might be caused by fill_kinfo() now calling
> calcru().
>
> Misreporting of negative times is by the following too-simple diagnostic:
>
> %     tu = (u_int64_t)tv.tv_sec * 1000000 + tv.tv_usec;
> %     ptu = p->p_uu + p->p_su + p->p_iu;
> %     if (tu < ptu || (int64_t)tu < 0) {
> %         printf("calcru: negative time of %jd usec for pid %d (%s)\n",
> %             (intmax_t)tu, p->p_pid, p->p_comm);
> %         tu = ptu;
> %     }
>
> The message is also printed for the tu < ptu case, which is what you are
> getting.
>
> I fixed the messages when I got a lot of them due to a local bug. The local
> bug was from double rounding for calcru() on child times (which -current
> doesn't do). I can't see how this could be the problem in -current, since
> the components of ptu are rounded down and there is a KASSERT that they
> added up to less than tu in the previous call.
>
> Ah, here is a likely cause of the bug in -current:
>
> %     if (p == curthread->td_proc) {
> %         /*
> %          * Adjust for the current time slice. This is actually fairly
> %          * important since the error here is on the order of a time
> %          * quantum, which is much greater than the sampling error.
> %          * XXXKSE use a different test due to threads on other
> %          * processors also being 'current'.
> %          */
> %         binuptime(&bt);
> %         bintime_sub(&bt, PCPU_PTR(switchtime));
> %         bintime_add(&bt, &p->p_runtime);
> %     } else
> %         bt = p->p_runtime;
>
> The XXXKSE comment is correct that this might be broken. If the (p
> != curthread->td_proc) case happens at all for a running process, then
> it gives a wrong (out of date) timestamp in bt. This wrongness will
> be detected if calcru() was called earlier in the current
> timeslice and took the other path here.
>
> The recent change to fill_kinfo() is quite likely to trigger detection
> of this bug. fill_kinfo() is often used to iterate over all processes
> for ps, so it will call calcru() with (p != curthread->td_proc) for
> all processes other than the one running it, and give a bt that is out
> of date for all such processes that are actually running. Since there
> can be at most one running process per CPU, this bug only affects SMP.
>
> The call to calcru() from ttyinfo() may be the only other trigger.
> ttyinfo() picks a process and should rarely or never pick the ithread
> running it, so it will almost always take the (p != curthread->td_proc)
> path. Again, this is only a problem for the SMP case since in the !SMP
> case the picked process must have been switched away from to run the
> ithread, so it cannot be running.

There must be some !SMP trigger for this as well. I just checked and I
was able to trigger this on my Pentium-M laptop as well by leaning on
the ^T key while I was logged on via ssh and running 'portupgrade -aP'.

Jun 21 00:41:31 hairball kernel: calcru: negative time of 23169 usec for pid 44653 (sh)
Jun 21 00:41:32 hairball kernel: calcru: negative time of 21990 usec for pid 44665 (sh)

I didn't use ^T on my Athlon box. I might have had top running, though.

It's interesting that this bug only seems to get triggered on /bin/sh.
Maybe it is fork()/exit()/wait() related?
From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 08:05:36 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9791916A4CE; Mon, 21 Jun 2004 08:05:36 +0000 (GMT) Received: from mailout2.pacific.net.au (mailout2.pacific.net.au [61.8.0.85]) by mx1.FreeBSD.org (Postfix) with ESMTP id 161B043D58; Mon, 21 Jun 2004 08:05:36 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.0.87])i5L8555v031808; Mon, 21 Jun 2004 18:05:05 +1000 Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) i5L853nl023076; Mon, 21 Jun 2004 18:05:03 +1000 Date: Mon, 21 Jun 2004 18:05:02 +1000 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Julian Elischer In-Reply-To: Message-ID: <20040621174821.B979@gamplex.bde.org> References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: threads@freebsd.org cc: Don Lewis cc: rwatson@freebsd.org cc: current@freebsd.org Subject: Re: calcru: negative time ... followed by freeze X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2004 08:05:36 -0000 On Mon, 21 Jun 2004, Julian Elischer wrote: > On Mon, 21 Jun 2004, Bruce Evans wrote: > > > Ah, here is a likely cause of the bug in -current: > > > > % if (p == curthread->td_proc) { > > % /* > > % * Adjust for the current time slice. This is actually fairly > > % * important since the error here is on the order of a time > > % * quantum, which is much greater than the sampling error. > > % * XXXKSE use a different test due to threads on other > > % * processors also being 'current'. 
> > % */ > > % binuptime(&bt); > > % bintime_sub(&bt, PCPU_PTR(switchtime)); > > % bintime_add(&bt, &p->p_runtime); > > % } else > > % bt = p->p_runtime; > > > > The XXXKSE comment is correct that this might be broken. If the (p > > != curthread->td_proc) case happens at all for a running process, then > > it gives a wrong (out of date) timestamp in bt. This wrongness will > > be detected if calcru() is was called called earlier in the current > > timeslice and took the other path here. > > It should be fairly easy as there is now a thread state that indicates > that it is actually running now.. It's not so easy [to fix] since the switchtime for threads running on other CPUs is inaccessible (it is in the CPU's pcpu data). The bug seems to be unrelated to KSE. It is related to SMP. RELENG_4 has the bug, and pre-KSE versions have a proc state that indicates if we have a running process which can't be handled right. I will turn off the check in the known broken case, and maybe change the printf() to a log() since the error is not very important and syscons's console output routine is suspect when called with sched_lock held. 
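The stale-timestamp effect described above can be modelled in user space. What follows is only a sketch under stated assumptions: struct model_proc and the three helpers are hypothetical names invented for illustration, not kernel interfaces, and plain microsecond counters stand in for struct bintime. It shows why a total computed without the current slice can come out smaller than a total reported earlier in the same slice, which is the condition the "negative time" message actually fires on.

```c
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel API. */
struct model_proc {
	uint64_t runtime_us;	/* updated only at context switches */
	uint64_t reported_us;	/* last total handed out (plays the role of ptu) */
};

/* Total as seen by the running process itself: includes the current slice. */
static uint64_t
total_self(const struct model_proc *p, uint64_t now_us, uint64_t switchtime_us)
{
	return (p->runtime_us + (now_us - switchtime_us));
}

/* Total as seen from elsewhere: the start of the slice is not visible. */
static uint64_t
total_other(const struct model_proc *p)
{
	return (p->runtime_us);
}

/* Returns 1 if tu is smaller than a total that was already reported. */
static int
went_backwards(struct model_proc *p, uint64_t tu)
{
	if (tu < p->reported_us)
		return (1);
	p->reported_us = tu;
	return (0);
}
```

With runtime_us = 1000, a slice that started at 2000 and a reading taken at 5000, total_self() gives 4000 while total_other() still gives 1000, so the second reading fails the monotonicity check even though neither value is negative.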
Bruce From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 10:24:02 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0092E16A4CE; Mon, 21 Jun 2004 10:24:02 +0000 (GMT) Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6F9BC43D53; Mon, 21 Jun 2004 10:24:01 +0000 (GMT) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.12.11/8.12.11) with ESMTP id i5LANcmF048049; Mon, 21 Jun 2004 03:23:43 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <200406211023.i5LANcmF048049@gw.catspoiler.org> Date: Mon, 21 Jun 2004 03:23:38 -0700 (PDT) From: Don Lewis To: bde@zeta.org.au In-Reply-To: <200406210801.i5L81K5u047553@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii cc: threads@FreeBSD.org cc: rwatson@FreeBSD.org cc: current@FreeBSD.org Subject: Re: calcru: negative time ... followed by freeze X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2004 10:24:02 -0000 On 21 Jun, Don Lewis wrote: > On 21 Jun, Bruce Evans wrote: >> On Sun, 20 Jun 2004, Don Lewis wrote: >> >>> On 20 Jun, Robert Watson wrote: >>> > FYI, this is a Xeon box with two physical processors, each with two >>> > logical processors, and the problem could well be SMP-related. >>> >>> The hang might be SMP-related, but I just started getting the calcru >>> messages on my UP Athlon XP box. I just upgraded to today's >>> -CURRENT from a week-old version, and my console got spammed by a bunch >>> of these messages while I was running portupgrade. 
>>> >>> calcru: negative time of 0 usec for pid 21261 (sh) >>> calcru: negative time of 0 usec for pid 21261 (sh) >>> ... >>> calcru: negative time of 3917 usec for pid 49504 (sh) >>> calcru: negative time of 3917 usec for pid 49504 (sh) >> >> Hmm, these are nonnegative times. >> >> The extra messages in -current might be caused by fill_kinfo() now calling >> calcru(). >> >> Misreporting of negative times is by the following too-simple dianostic: >> >> % tu = (u_int64_t)tv.tv_sec * 1000000 + tv.tv_usec; >> % ptu = p->p_uu + p->p_su + p->p_iu; >> % if (tu < ptu || (int64_t)tu < 0) { >> % printf("calcru: negative time of %jd usec for pid %d (%s)\n", >> % (intmax_t)tu, p->p_pid, p->p_comm); >> % tu = ptu; >> % } >> >> The message is also printed for the tu < ptu case, which is what you are >> getting. >> >> I fixed the messages when I got a lot of them due to a local bug. The local >> bug was from double rounding for calcru() on child times (which -current >> doesn't do). I can't see how this could be the problem in -current, since >> the components of ptu are rounded down and there is a KASSERT that they >> added up to less than tu in the previous call. >> >> Ah, here is a likely cause of the bug in -current: >> >> % if (p == curthread->td_proc) { >> % /* >> % * Adjust for the current time slice. This is actually fairly >> % * important since the error here is on the order of a time >> % * quantum, which is much greater than the sampling error. >> % * XXXKSE use a different test due to threads on other >> % * processors also being 'current'. >> % */ >> % binuptime(&bt); >> % bintime_sub(&bt, PCPU_PTR(switchtime)); >> % bintime_add(&bt, &p->p_runtime); >> % } else >> % bt = p->p_runtime; >> >> The XXXKSE comment is correct that this might be broken. If the (p >> != curthread->td_proc) case happens at all for a running process, then >> it gives a wrong (out of date) timestamp in bt. 
This wrongness will >> be detected if calcru() is was called called earlier in the current >> timeslice and took the other path here. >> >> The recent change to fill_kinfo() is quite likely to trigger detection >> of this bug. fill_kinfo() is often used to iterate over all processes >> for ps, so it will call calcru() with (p != curthread->td_proc) for >> all processes other than the one running it, and give a bt that is out >> of date for all such processes that are actually running. Since there >> can be at most one running process per CPU, this bug only affects SMP. >> >> The call to calcru() from ttyinfo() may be the only other trigger. >> ttyinfo() picks a process and should rarely or never pick the ithread >> running it, so it will almost always take the (p != curthread->td_proc) >> path. Again, this is only a problem for the SMP case since in the !SMP >> case the picked process must have been switched away from to run the >> ithread, so it cannot be running. It looks like another way to trigger this in the SMP case would be to have two threads of the same process running at the same time, and for the second thread to call calcru() to have been running for a shorter period of time than when the first thread called calcru(). In the SMP cases, it probably makes sense to just silently to do if (tu < ptu) tu = ptu because of the complications of attempting to do an accurate calculation. > There must be some !SMP trigger for this as well. I just checked and I > was able to trigger this on my Pentium-M laptop as well by leaning on > the ^T key while I was logged on via ssh and running 'portupgrade -aP'. > > Jun 21 00:41:31 hairball kernel: calcru: negative time of 23169 usec for pid 44653 (sh) > Jun 21 00:41:32 hairball kernel: calcru: negative time of 21990 usec for pid 44665 (sh) > > I didn't use ^T on my Athlon box. I might have had top running, though. > > It's interesting that this bug only seems to get triggered on /bin/sh. 
> Maybe it is fork()/exit()/wait() related? It looks like the bug is in the exit code. I tweaked the printf() in calcru() to print out p_state, p_flag, and p_sflag in addition to the other info. In all cases, the processes that trigger the printf were zombies, and show up as [running] in ttyinfo() on a uniprocessor box. Jun 21 03:17:03 hairball kernel: calcru: negative time of 179 usec for pid 4543 (sh) p_state=0x2 p_flag=0x2002 p_sflag=0x1 load: 0.71 cmd: sh 4543 [running] 0.00u 0.00s 3% 0k p_runtime only gets updated in mi_switch(), and it appears that it never gets updated after the calcru() call in exit1(). It also looks like a bug that a zombie remains in the [running] state and thus looks interesting to ttyinfo(). From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 11:02:17 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3303616A4D1 for ; Mon, 21 Jun 2004 11:02:17 +0000 (GMT) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 16C4343D1F for ; Mon, 21 Jun 2004 11:02:17 +0000 (GMT) (envelope-from owner-bugmaster@freebsd.org) Received: from freefall.freebsd.org (peter@localhost [127.0.0.1]) by freefall.freebsd.org (8.12.11/8.12.11) with ESMTP id i5LB21tC064735 for ; Mon, 21 Jun 2004 11:02:01 GMT (envelope-from owner-bugmaster@freebsd.org) Received: (from peter@localhost) by freefall.freebsd.org (8.12.11/8.12.11/Submit) id i5LB20d7064729 for freebsd-threads@freebsd.org; Mon, 21 Jun 2004 11:02:00 GMT (envelope-from owner-bugmaster@freebsd.org) Date: Mon, 21 Jun 2004 11:02:00 GMT Message-Id: <200406211102.i5LB20d7064729@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: peter set sender to owner-bugmaster@freebsd.org using -f From: FreeBSD bugmaster To: freebsd-threads@FreeBSD.org Subject: Current problem reports assigned to you X-BeenThere: 
freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2004 11:02:17 -0000

Current FreeBSD problem reports

Critical problems

S  Submitted    Tracker        Resp.    Description
-------------------------------------------------------------------------------
o  [2000/06/13] kern/19247     threads  uthread_sigaction.c does not do anything
s  [2004/03/15] kern/64313     threads  FreeBSD (OpenBSD) pthread implicit set/un
o  [2004/04/22] threads/65883  threads  libkse's sigwait does not work after fork

3 problems total.

Serious problems

S  Submitted    Tracker        Resp.    Description
-------------------------------------------------------------------------------
o  [2000/07/18] kern/20016     threads  pthreads: Cannot set scheduling timer/Can
o  [2000/08/26] misc/20861     threads  libc_r does not honor socket timeouts
o  [2001/01/20] bin/24472      threads  libc_r does not honor SO_SNDTIMEO/SO_RCVT
o  [2001/01/25] bin/24632      threads  libc_r delicate deviation from libc in ha
o  [2001/01/25] misc/24641     threads  pthread_rwlock_rdlock can deadlock
o  [2001/11/26] bin/32295      threads  pthread dont dequeue signals
o  [2002/02/01] i386/34536     threads  accept() blocks other threads
o  [2002/05/25] kern/38549     threads  the procces compiled whith pthread stoppe
o  [2002/06/27] bin/39922      threads  [PATCH?] Threaded applications executed w
o  [2002/08/04] misc/41331     threads  Pthread library open sets O_NONBLOCK flag
o  [2003/03/02] bin/48856      threads  Setting SIGCHLD to SIG_IGN still leaves z
o  [2003/03/10] bin/49087      threads  Signals lost in programs linked with libc
o  [2003/05/08] bin/51949      threads  thread in accept cannot be cancelled

13 problems total.

Non-critical problems

S  Submitted    Tracker        Resp.    Description
-------------------------------------------------------------------------------
o  [2000/05/26] misc/18824     threads  gethostbyname is not thread safe
o  [2000/10/21] misc/22190     threads  A threaded read(2) from a socketpair(2) f
o  [2001/09/09] bin/30464      threads  pthread mutex attributes -- pshared
o  [2002/05/02] bin/37676      threads  libc_r: msgsnd(), msgrcv(), pread(), pwri
s  [2002/07/16] misc/40671     threads  pthread_cancel doesn't remove thread from

5 problems total.

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 12:30:58 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C92F116A4CE; Mon, 21 Jun 2004 12:30:58 +0000 (GMT) Received: from smarthost.enta.net (smarthost.enta.net [195.74.97.231]) by mx1.FreeBSD.org (Postfix) with ESMTP id 868EE43D1F; Mon, 21 Jun 2004 12:30:58 +0000 (GMT) (envelope-from jacs@gnome.co.uk) Received: from smartsmtp.enta.net (smtp.enta.net [195.74.97.230]) by smarthost.enta.net (Postfix) with ESMTP id 8240F17DB; Mon, 21 Jun 2004 13:34:20 +0100 (BST) Received: from smtp.enta.net (localhost [127.0.0.1]) by smartsmtp.enta.net (8.12.3/8.12.3) with ESMTP id i5LCmNl9089013; Mon, 21 Jun 2004 13:48:24 +0100 (BST) (envelope-from jacs@gnome.co.uk) Received: from hawk.gnome.co.uk (81-31-113-153.adsl.entanet.co.uk [81.31.113.153]) by smtp.enta.net (Postfix) with SMTP id 399B89681E; Mon, 21 Jun 2004 13:48:23 +0100 (BST) Received: from kite (kite.gnome.co.uk [192.168.123.75]) by hawk.gnome.co.uk (8.12.10/8.12.10) with SMTP id i5LCUht1005584; Mon, 21 Jun 2004 13:30:43 +0100 (BST) (envelope-from jacs@gnome.co.uk) Message-ID: <011f01c4578b$923d7b70$4b7ba8c0@gnome.co.uk> From: "Chris Stenton" To: Date: Mon, 21 Jun 2004 13:30:43 +0100 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1409 X-MimeOLE: Produced
By Microsoft MimeOLE V6.00.2800.1409 X-Scanned-By: MIMEDefang 2.43 cc: hackers@freebsd.org Subject: pthread - fork - execv problem X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2004 12:30:58 -0000 I am trying to help port over an app that's POSIX threaded. One thread uses fork, dup2 and execv to start a child programme, in this case an mp3 player. However, under FreeBSD-5.2.1, the execv causes all the threads in the parent process to be blocked until the child process returns. Is there a mechanism to get around this? Thanks Chris From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 12:53:53 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6ED1716A4CE; Mon, 21 Jun 2004 12:53:53 +0000 (GMT) Received: from mailout2.pacific.net.au (mailout2.pacific.net.au [61.8.0.85]) by mx1.FreeBSD.org (Postfix) with ESMTP id E944843D2F; Mon, 21 Jun 2004 12:53:52 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy1.pacific.net.au (mailproxy1.pacific.net.au [61.8.0.86])i5LCrn5v009203; Mon, 21 Jun 2004 22:53:49 +1000 Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) i5LCrlao031830; Mon, 21 Jun 2004 22:53:47 +1000 Date: Mon, 21 Jun 2004 22:52:18 +1000 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Don Lewis In-Reply-To: <200406211023.i5LANcmF048049@gw.catspoiler.org> Message-ID: <20040621220455.T9194@gamplex.bde.org> References: <200406211023.i5LANcmF048049@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: threads@FreeBSD.org cc: rwatson@FreeBSD.org cc: current@FreeBSD.org Subject: Re: calcru: negative time ...
followed by freeze X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2004 12:53:53 -0000 On Mon, 21 Jun 2004, Don Lewis wrote: > On 21 Jun, Don Lewis wrote: > > On 21 Jun, Bruce Evans wrote: > >> Ah, here is a likely cause of the bug in -current: > >> > >> % if (p == curthread->td_proc) { > >> % /* > >> % * Adjust for the current time slice. This is actually fairly > >> % * important since the error here is on the order of a time > >> % * quantum, which is much greater than the sampling error. > >> % * XXXKSE use a different test due to threads on other > >> % * processors also being 'current'. > >> % */ > >> % binuptime(&bt); > >> % bintime_sub(&bt, PCPU_PTR(switchtime)); > >> % bintime_add(&bt, &p->p_runtime); > >> % } else > >> % bt = p->p_runtime; > >> > >> The XXXKSE comment is correct that this might be broken. If the (p > >> ... > > It looks like another way to trigger this in the SMP case would be to > have two threads of the same process running at the same time, and for > the second thread to call calcru() to have been running for a shorter > period of time than when the first thread called calcru(). I think my test for the problem case covers this. It's necessary to add the runtime for the current slice for all threads in the process, but the above adds it for at most one thread. I set a flag to indicate that there is a problem if any running thread in the process can't be handled: %%% rt = p->p_runtime; problemcase = 0; FOREACH_THREAD_IN_PROC(p, td) { /* * Adjust for the current time slice. This is actually fairly * important since the error here is on the order of a time * quantum, which is much greater than the sampling error. 
*/ if (td == curthread) { binuptime(&bt); bintime_sub(&bt, PCPU_PTR(switchtime)); bintime_add(&rt, &bt); } else { /* * This case should add the current time less the * switch time as above, but the switch time is * inaccessible. So we might end up with rt too * small and then the monotonicity check might detect * the problem. Just set a flag to avoid warning * about this known problem. */ problemcase = 1; } } %%% Oops, this is missing the critical TD_IS_RUNNING(td) condition for setting problemcase. > In the SMP cases, it probably makes sense to just silently to do > if (tu < ptu) > tu = ptu > because of the complications of attempting to do an accurate > calculation. I'm tempted to do that because the debugging code is so ugly, but the diagnostic has been useful for finding bugs that weren't all there when it was written, so I'd prefer not to remove it. > > There must be some !SMP trigger for this as well. I just checked and I > > was able to trigger this on my Pentium-M laptop as well by leaning on > > the ^T key while I was logged on via ssh and running 'portupgrade -aP'. > > > > Jun 21 00:41:31 hairball kernel: calcru: negative time of 23169 usec for pid 44653 (sh) > > Jun 21 00:41:32 hairball kernel: calcru: negative time of 21990 usec for pid 44665 (sh) > > > > I didn't use ^T on my Athlon box. I might have had top running, though. > > > > It's interesting that this bug only seems to get triggered on /bin/sh. > > Maybe it is fork()/exit()/wait() related? > > It looks like the bug is in the exit code. I tweaked the printf() in > calcru() to print out p_state, p_flag, and p_sflag in addition to the > other info. In all cases, the processes that trigger the printf were > zombies, and show up as [running] in ttyinfo() on a uniprocessor box. There is a PR about this (#52490). The oops in my test fixes it for bogus reasons. I'm currently adding similar printfs to help figure out what is going wrong. 
> Jun 21 03:17:03 hairball kernel: calcru: negative time of 179 usec for > pid 4543 (sh) p_state=0x2 p_flag=0x2002 p_sflag=0x1 > > load: 0.71 cmd: sh 4543 [running] 0.00u 0.00s 3% 0k > > p_runtime only gets updated in mi_switch(), and it appears that it never > gets updated after the calcru() call in exit1(). That explains the problem. The calcru() call sets up-to-date (final) values for the components of ptu. When we look at the process after it has become a zombie, we use only p_runtime since the process is not running, but p_runtime is stale. > It also looks like a bug that a zombie remains in the [running] state > and thus looks interesting to ttyinfo(). I think it isn't really running. ttyinfo() should pick it if it is the only process on the terminal. ttyinfo() does pick it for the zombie in the test program in the PR, and reports that it is running, but ps reports it correctly as a zombie. Bruce From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 13:38:11 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D4C7916A4CE; Mon, 21 Jun 2004 13:38:11 +0000 (GMT) Received: from mailout1.pacific.net.au (mailout1.pacific.net.au [61.8.0.84]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5AD2543D1D; Mon, 21 Jun 2004 13:38:11 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.0.87])i5LDcA4u019202; Mon, 21 Jun 2004 23:38:10 +1000 Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) i5LDc7nl002033; Mon, 21 Jun 2004 23:38:08 +1000 Date: Mon, 21 Jun 2004 23:38:06 +1000 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Don Lewis In-Reply-To: <20040621220455.T9194@gamplex.bde.org> Message-ID: <20040621232654.S873@gamplex.bde.org> References: <200406211023.i5LANcmF048049@gw.catspoiler.org> <20040621220455.T9194@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN;
charset=US-ASCII cc: threads@FreeBSD.org cc: rwatson@FreeBSD.org cc: current@FreeBSD.org Subject: Re: calcru: negative time ... followed by freeze X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2004 13:38:12 -0000 On Mon, 21 Jun 2004, Bruce Evans wrote: > On Mon, 21 Jun 2004, Don Lewis wrote: > > It looks like the bug is in the exit code. I tweaked the printf() in > > calcru() to print out p_state, p_flag, and p_sflag in addition to the > > other info. In all cases, the processes that trigger the printf were > > zombies, and show up as [running] in ttyinfo() on a uniprocessor box. > > There is a PR about this (#52490). The oops in my test fixes it for bogus > reasons. I'm currently adding similar printfs to help figure out what is > going wrong. > > > Jun 21 03:17:03 hairball kernel: calcru: negative time of 179 usec for > > pid 4543 (sh) p_state=0x2 p_flag=0x2002 p_sflag=0x1 > > > > load: 0.71 cmd: sh 4543 [running] 0.00u 0.00s 3% 0k > > > > p_runtime only gets updated in mi_switch(), and it appears that it never > > gets updated after the calcru() call in exit1(). > > That explains the problem. The calcru() values sets up to date (final) > values for the components of ptu. When we look at the process after it > has become a zombie, we use only p_runtime since the process is not > running, but p_runtime is stale. I happened to have fixed this already without really noticing. (My kernel doesn't call calcru() in exit1() or in wait1(); it accumulates p_runtime and tick counts (instead of those values converted to timevals by calcru()) in the child stats, so it has to get the final p_runtime right.) 
%%%
Index: kern_exit.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.236
diff -u -2 -r1.236 kern_exit.c
--- kern_exit.c	18 Jun 2004 11:13:49 -0000	1.236
+++ kern_exit.c	21 Jun 2004 13:04:31 -0000
@@ -104,4 +104,5 @@
 exit1(struct thread *td, int rv)
 {
+	struct bintime new_switchtime;
 	struct proc *p, *nq, *q;
 	struct tty *tp;
@@ -518,8 +519,14 @@
 	mtx_lock_spin(&sched_lock);
 	critical_exit();
-	cnt.v_swtch++;
-	binuptime(PCPU_PTR(switchtime));
+
+	/* Do the same timestamp bookkeeping that mi_switch() would do. */
+	binuptime(&new_switchtime);
+	bintime_add(&p->p_runtime, &new_switchtime);
+	bintime_sub(&p->p_runtime, PCPU_PTR(switchtime));
+	PCPU_SET(switchtime, new_switchtime);
 	PCPU_SET(switchticks, ticks);
 
+	cnt.v_swtch++;
+
 	/*
 	 * Allow the scheduler to adjust the priority of the
%%%

I will commit this soon.

Workaround for the main problem:

%%%
Index: kern_resource.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_resource.c,v
retrieving revision 1.139
diff -u -2 -r1.139 kern_resource.c
--- kern_resource.c	16 Jun 2004 00:26:29 -0000	1.139
+++ kern_resource.c	21 Jun 2004 12:55:24 -0000
@@ -702,8 +702,10 @@
 	struct timeval *ip;
 {
-	struct bintime bt;
+	struct bintime bt, rt;
 	struct timeval tv;
+	struct thread *td;
 	/* {user, system, interrupt, total} {ticks, usec}; previous tu: */
 	u_int64_t ut, uu, st, su, it, iu, tt, tu, ptu;
+	int problemcase;
 
 	mtx_assert(&sched_lock, MA_OWNED);
@@ -719,22 +721,40 @@
 		tt = 1;
 	}
-	if (p == curthread->td_proc) {
+	rt = p->p_runtime;
+	problemcase = 0;
+	FOREACH_THREAD_IN_PROC(p, td) {
 		/*
 		 * Adjust for the current time slice.  This is actually fairly
 		 * important since the error here is on the order of a time
 		 * quantum, which is much greater than the sampling error.
-		 * XXXKSE use a different test due to threads on other
-		 * processors also being 'current'.
 		 */
-		binuptime(&bt);
-		bintime_sub(&bt, PCPU_PTR(switchtime));
-		bintime_add(&bt, &p->p_runtime);
-	} else
-		bt = p->p_runtime;
-	bintime2timeval(&bt, &tv);
+		if (td == curthread) {
+			binuptime(&bt);
+			bintime_sub(&bt, PCPU_PTR(switchtime));
+			bintime_add(&rt, &bt);
+		} else if (TD_IS_RUNNING(td)) {
+			/*
+			 * This case should add the current time less the
+			 * switch time as above, but the switch time is
+			 * inaccessible.  So we might end up with rt too
+			 * small and then the monotonicity check might detect
+			 * the problem.  Just set a flag to avoid warning
+			 * about this known problem.
+			 */
+			problemcase = 1;
+		}
+	}
+	bintime2timeval(&rt, &tv);
 	tu = (u_int64_t)tv.tv_sec * 1000000 + tv.tv_usec;
 	ptu = p->p_uu + p->p_su + p->p_iu;
-	if (tu < ptu || (int64_t)tu < 0) {
-		printf("calcru: negative time of %jd usec for pid %d (%s)\n",
+	if (tu < ptu) {
+		if (!problemcase)
+			printf(
+"calcru: runtime went backwards from %ju usec to %ju usec for pid %d (%s)\n",
+			    (uintmax_t)ptu, (uintmax_t)tu, p->p_pid, p->p_comm);
+		tu = ptu;
+	}
+	if ((int64_t)tu < 0) {
+		printf("calcru: negative runtime of %jd usec for pid %d (%s)\n",
 		    (intmax_t)tu, p->p_pid, p->p_comm);
 		tu = ptu;
%%%

Bruce

From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 14:36:08 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4A3C816A4CE for ; Mon, 21 Jun 2004 14:36:08 +0000 (GMT) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id E073943D5A for ; Mon, 21 Jun 2004 14:36:07 +0000 (GMT) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id i5LEa4j0029368; Mon, 21 Jun 2004 10:36:06 -0400 (EDT) Date: Mon, 21 Jun 2004 10:36:04 -0400 (EDT) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Sean McNeil In-Reply-To:
<1087794678.46146.4.camel@server.mcneil.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-threads@freebsd.org Subject: Re: kill(pid,0) sends a signal or not? X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2004 14:36:08 -0000 On Sun, 20 Jun 2004, Sean McNeil wrote: > I'm trying to trace down an issue with kse threads and firefox. There > is an odd "trick" I haven't seen before: > > // kill(pid,0) is a neat trick to check if a > // process exists > if (kill(pid, 0) == 0 || errno != ESRCH) > > Does this really work? It is kind of odd that it I appear to get a > signal (if the traceback is accurate) with the signal set to 0: > > #10 0x0000000202bc7a80 in thr_resume_wrapper (sig=0, siginfo=0x4, > ucp=0x7fffffffd4c0) at /usr/src/lib/libpthread/thread/thr_kern.c:1112 > > This later causes a sig 11 and the program core dumps. > > Any info on how threads are suppose to behave when a process does a > kill(pid,0) would be greatly appreciated. kill(pid, 0) shouldn't result in a signal. libpthread doesn't do anything with kill() and the kernel shouldn't cause a signal for 0 either. What does ktrace show? 
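For reference, the existence check quoted above can be written out as a small C function. This is a minimal sketch assuming a POSIX kill(); process_exists() is a hypothetical helper name chosen for illustration and is not taken from firefox or libpthread. With sig set to 0 the kernel performs only the error checking (does the process exist, may the caller signal it), so nothing should be delivered to the target.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper: true if some process with this pid exists. */
static bool
process_exists(pid_t pid)
{
	/* Signal 0 requests error checking only; no signal is sent. */
	if (kill(pid, 0) == 0)
		return (true);
	/*
	 * ESRCH means no such process.  Any other error (for example
	 * EPERM) means the process exists but cannot be signalled by us.
	 */
	return (errno != ESRCH);
}
```

This mirrors the `kill(pid, 0) == 0 || errno != ESRCH` test in the quoted code; a handler running with sig=0 would therefore have to come from somewhere other than this call.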
-- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 19:07:58 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 746B216A4CE; Mon, 21 Jun 2004 19:07:58 +0000 (GMT) Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1126143D2D; Mon, 21 Jun 2004 19:07:58 +0000 (GMT) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.12.11/8.12.11) with ESMTP id i5LJ7gRm049126; Mon, 21 Jun 2004 12:07:52 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <200406211907.i5LJ7gRm049126@gw.catspoiler.org> Date: Mon, 21 Jun 2004 12:07:42 -0700 (PDT) From: Don Lewis To: bde@zeta.org.au In-Reply-To: <20040621220455.T9194@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii cc: threads@FreeBSD.org cc: rwatson@FreeBSD.org cc: current@FreeBSD.org Subject: Re: calcru: negative time ... followed by freeze X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2004 19:07:58 -0000 On 21 Jun, Bruce Evans wrote: > On Mon, 21 Jun 2004, Don Lewis wrote: >> It also looks like a bug that a zombie remains in the [running] state >> and thus looks interesting to ttyinfo(). > > I think it isn't really running. ttyinfo() should pick it if it is the > only process on the terminal. ttyinfo() does pick it for the zombie in the > test program in the PR, and reports that it is running, but ps reports > it correctly as a zombie. ttyinfo() prints "[running]" if TD_IS_RUNNING(td) is true. I think the problem is that thread_exit() doesn't set td_state to TDS_INACTIVE if the process only has one thread. 
From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 19:20:29 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F0D0916A4CE for ; Mon, 21 Jun 2004 19:20:29 +0000 (GMT) Received: from mail.mcneil.com (rrcs-west-24-199-45-54.biz.rr.com [24.199.45.54]) by mx1.FreeBSD.org (Postfix) with ESMTP id CEAE943D48 for ; Mon, 21 Jun 2004 19:20:27 +0000 (GMT) (envelope-from sean@mcneil.com) Received: from localhost (localhost.mcneil.com [127.0.0.1]) by mail.mcneil.com (Postfix) with ESMTP id 390E9FD076; Mon, 21 Jun 2004 12:20:27 -0700 (PDT) Received: from mail.mcneil.com ([127.0.0.1]) by localhost (server.mcneil.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 84820-07; Mon, 21 Jun 2004 12:20:26 -0700 (PDT) Received: from [24.199.45.54] (mcneil.com [24.199.45.54]) by mail.mcneil.com (Postfix) with ESMTP id C5813FD067; Mon, 21 Jun 2004 12:20:26 -0700 (PDT) From: Sean McNeil To: Daniel Eischen In-Reply-To: References: Content-Type: text/plain Message-Id: <1087845626.85957.1.camel@server.mcneil.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 Date: Mon, 21 Jun 2004 12:20:26 -0700 Content-Transfer-Encoding: 7bit X-Virus-Scanned: by amavisd-new at mcneil.com cc: freebsd-threads@freebsd.org Subject: Re: kill(pid,0) sends a signal or not? X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2004 19:20:30 -0000 On Mon, 2004-06-21 at 07:36, Daniel Eischen wrote: > On Sun, 20 Jun 2004, Sean McNeil wrote: > > > I'm trying to trace down an issue with kse threads and firefox. There > > is an odd "trick" I haven't seen before: > > > > // kill(pid,0) is a neat trick to check if a > > // process exists > > if (kill(pid, 0) == 0 || errno != ESRCH) > > > > Does this really work? 
It is kind of odd that I appear to get a > > signal (if the traceback is accurate) with the signal set to 0: > > > > #10 0x0000000202bc7a80 in thr_resume_wrapper (sig=0, siginfo=0x4, > > ucp=0x7fffffffd4c0) at /usr/src/lib/libpthread/thread/thr_kern.c:1112 > > > > This later causes a sig 11 and the program core dumps. > > > > Any info on how threads are supposed to behave when a process does a > > kill(pid,0) would be greatly appreciated. > > kill(pid, 0) shouldn't result in a signal. libpthread doesn't do > anything with kill() and the kernel shouldn't cause a signal for 0 > either. What does ktrace show? It wasn't generating a sig 0. What I was seeing was the inner workings of the threads where a "sig" variable was set to 0, but the actual signal was 11. Everything is working as designed as far as I can tell. From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 20:09:45 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5AE7316A503; Mon, 21 Jun 2004 20:09:44 +0000 (GMT) Received: from rwcrmhc11.comcast.net (rwcrmhc11.comcast.net [204.127.198.35]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1C21443D58; Mon, 21 Jun 2004 20:09:41 +0000 (GMT) (envelope-from julian@elischer.org) Received: from interjet.elischer.org ([24.7.73.28]) by comcast.net (rwcrmhc11) with ESMTP id <2004062120093901300gddhje>; Mon, 21 Jun 2004 20:09:40 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id NAA39622; Mon, 21 Jun 2004 13:09:33 -0700 (PDT) Date: Mon, 21 Jun 2004 13:09:30 -0700 (PDT) From: Julian Elischer To: Don Lewis In-Reply-To: <200406211907.i5LJ7gRm049126@gw.catspoiler.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: threads@FreeBSD.org cc: rwatson@FreeBSD.org cc: current@FreeBSD.org cc: bde@zeta.org.au Subject: Re: calcru: negative time ... 
followed by freeze X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2004 20:09:45 -0000 On Mon, 21 Jun 2004, Don Lewis wrote: > On 21 Jun, Bruce Evans wrote: > > On Mon, 21 Jun 2004, Don Lewis wrote: > > >> It also looks like a bug that a zombie remains in the [running] state > >> and thus looks interesting to ttyinfo(). > > > > I think it isn't really running. ttyinfo() should pick it if it is the > > only process on the terminal. ttyinfo() does pick it for the zombie in the > > test program in the PR, and reports that it is running, but ps reports > > it correctly as a zombie. > > ttyinfo() prints "[running]" if TD_IS_RUNNING(td) is true. I think the > problem is that thread_exit() doesn't set td_state to TDS_INACTIVE if > the process only has one thread. thanks.. I'll fix that... julian From owner-freebsd-threads@FreeBSD.ORG Mon Jun 21 20:44:54 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A30C016A4CE; Mon, 21 Jun 2004 20:44:54 +0000 (GMT) Received: from rwcrmhc12.comcast.net (rwcrmhc12.comcast.net [216.148.227.85]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8F33E43D55; Mon, 21 Jun 2004 20:44:54 +0000 (GMT) (envelope-from julian@elischer.org) Received: from interjet.elischer.org ([24.7.73.28]) by comcast.net (rwcrmhc12) with ESMTP id <200406212044390140028gr9e>; Mon, 21 Jun 2004 20:44:40 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id NAA40350; Mon, 21 Jun 2004 13:44:37 
-0700 (PDT) Date: Mon, 21 Jun 2004 13:44:36 -0700 (PDT) From: Julian Elischer To: Don Lewis In-Reply-To: <200406211907.i5LJ7gRm049126@gw.catspoiler.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: threads@FreeBSD.org cc: rwatson@FreeBSD.org cc: current@FreeBSD.org cc: bde@zeta.org.au Subject: Re: calcru: negative time ... followed by freeze X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2004 20:44:54 -0000 On Mon, 21 Jun 2004, Don Lewis wrote: > On 21 Jun, Bruce Evans wrote: > > On Mon, 21 Jun 2004, Don Lewis wrote: > > >> It also looks like a bug that a zombie remains in the [running] state > >> and thus looks interesting to ttyinfo(). > > > > I think it isn't really running. ttyinfo() should pick it if it is the > > only process on the terminal. ttyinfo() does pick it for the zombie in the > > test program in the PR, and reports that it is running, but ps reports > > it correctly as a zombie. > > ttyinfo() prints "[running]" if TD_IS_RUNNING(td) is true. I think the > problem is that thread_exit() doesn't set td_state to TDS_INACTIVE if > the process only has one thread. fixed.. 
thanks From owner-freebsd-threads@FreeBSD.ORG Tue Jun 22 13:53:48 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E455316A4DC for ; Tue, 22 Jun 2004 13:53:48 +0000 (GMT) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6B8C343D5C for ; Tue, 22 Jun 2004 13:53:48 +0000 (GMT) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id i5MDrIon001152; Tue, 22 Jun 2004 09:53:18 -0400 (EDT) Date: Tue, 22 Jun 2004 09:53:18 -0400 (EDT) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Chris Stenton In-Reply-To: <011f01c4578b$923d7b70$4b7ba8c0@gnome.co.uk> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: threads@freebsd.org Subject: Re: pthread - fork - execv problem X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jun 2004 13:53:49 -0000 On Mon, 21 Jun 2004, Chris Stenton wrote: > I am trying to help port over an app that's posix threaded. One thread uses > fork, dup2 and execv to start a child programme, in this case an mp3 player. > However, under FreeBSD-5.2.1, the execv causes all the threads in the parent > process to be blocked until the child process returns. Is there a mechanism > to get around this? That shouldn't happen. What thread library? Sample program to demonstrate problem? 
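A minimal sketch of the kind of test case being asked for. It is not the reporter's code: /bin/sleep stands in for the mp3 player, /dev/null for the redirected descriptor, and the names ticker, launcher and run_demo are made up for illustration. One thread does fork + dup2 + execv and waits for the child, while a second thread counts ticks to show whether it keeps running in the meantime.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static atomic_int ticks;	/* bumped by the ticker thread */
static atomic_int done;		/* set once the child has been reaped */

/* Should keep running while the other thread waits on the child. */
static void *
ticker(void *arg)
{
	(void)arg;
	while (!atomic_load(&done)) {
		atomic_fetch_add(&ticks, 1);
		usleep(10000);
	}
	return (NULL);
}

/* fork + dup2 + execv from a thread; /bin/sleep is a stand-in child. */
static void *
launcher(void *arg)
{
	char *const argv[] = { "sleep", "1", NULL };
	pid_t pid;
	int fd, status;

	(void)arg;
	pid = fork();
	if (pid == 0) {
		/* Child: redirect stdout, then replace the process image. */
		fd = open("/dev/null", O_WRONLY);
		if (fd >= 0)
			dup2(fd, STDOUT_FILENO);
		execv("/bin/sleep", argv);
		_exit(127);			/* only reached if execv failed */
	}
	if (pid > 0)
		waitpid(pid, &status, 0);	/* should block this thread only */
	atomic_store(&done, 1);
	return (NULL);
}

/* Returns the number of ticks counted while the child was running. */
static int
run_demo(void)
{
	pthread_t t1, t2;

	atomic_store(&ticks, 0);
	atomic_store(&done, 0);
	pthread_create(&t1, NULL, ticker, NULL);
	pthread_create(&t2, NULL, launcher, NULL);
	pthread_join(t2, NULL);
	pthread_join(t1, NULL);
	return (atomic_load(&ticks));
}
```

If run_demo() returns a count near zero with a given thread library, the ticker thread was stalled while the child ran, which would reproduce the report. Build with -pthread, or link against the thread library under test.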
-- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Tue Jun 22 14:57:17 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 561EF16A4CE; Tue, 22 Jun 2004 14:57:17 +0000 (GMT) Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id 127FB43D54; Tue, 22 Jun 2004 14:57:17 +0000 (GMT) (envelope-from dan@dan.emsphone.com) Received: (from dan@localhost) by dan.emsphone.com (8.12.10/8.12.10) id i5MEuXN6077186; Tue, 22 Jun 2004 09:56:33 -0500 (CDT) (envelope-from dan) Date: Tue, 22 Jun 2004 09:56:33 -0500 From: Dan Nelson To: Chris Stenton Message-ID: <20040622145632.GF86471@dan.emsphone.com> References: <011f01c4578b$923d7b70$4b7ba8c0@gnome.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <011f01c4578b$923d7b70$4b7ba8c0@gnome.co.uk> X-OS: FreeBSD 5.2-CURRENT X-message-flag: Outlook Error User-Agent: Mutt/1.5.6i cc: threads@freebsd.org cc: hackers@freebsd.org Subject: Re: pthread - fork - execv problem X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jun 2004 14:57:17 -0000 In the last episode (Jun 21), Chris Stenton said: > I am trying to help port over an app that's posix threaded. One thread > uses fork, dup2 and execv to start a child programme, in this case an > mp3 player. However, under FreeBSD-5.2.1, the execv causes all the > threads in the parent process to be blocked until the child process > returns. Is there a mechanism to get around this? Do you have a small testcase? I have not seen your problem in any other threaded programs on FreeBSD. It may be an application bug. After a fork, both processes are independent. 
The child should not be able to affect the parent like this, unless the parent does something like holding a mutex used by all the threads and calling wait(). -- Dan Nelson dnelson@allantgroup.com From owner-freebsd-threads@FreeBSD.ORG Tue Jun 22 18:26:54 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6A59D16A4CE; Tue, 22 Jun 2004 18:26:54 +0000 (GMT) Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id 20C2F43D49; Tue, 22 Jun 2004 18:26:54 +0000 (GMT) (envelope-from dan@dan.emsphone.com) Received: (from dan@localhost) by dan.emsphone.com (8.12.10/8.12.10) id i5MIQXdL017275; Tue, 22 Jun 2004 13:26:33 -0500 (CDT) (envelope-from dan) Date: Tue, 22 Jun 2004 13:26:33 -0500 From: Dan Nelson To: Chris Stenton Message-ID: <20040622182632.GJ86471@dan.emsphone.com> References: <011f01c4578b$923d7b70$4b7ba8c0@gnome.co.uk> <20040622145632.GF86471@dan.emsphone.com> <20040622154056.GA8733@diogenis.ceid.upatras.gr> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040622154056.GA8733@diogenis.ceid.upatras.gr> X-OS: FreeBSD 5.2-CURRENT User-Agent: Mutt/1.5.6i cc: threads@freebsd.org cc: hackers@freebsd.org Subject: Re: pthread - fork - execv problem X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jun 2004 18:26:54 -0000 In the last episode (Jun 22), Nikos Ntarmos said: > On Tue, Jun 22, 2004 at 09:56:33AM -0500, Dan Nelson wrote: > > It may be an application bug. After a fork, both processes are > > independent. The child should not be able to affect the parent > > like this, unless the parent does something like holding a mutex > > used by all the threads and calling wait(). > > ... 
or the child holding a mutex before the fork(2) syscall. FWIW the > Linux info for libc and the NetBSD and Solaris man pages mention > pthread_atfork(3), used to install handlers to take care of such > cases. FreeBSD seems to not know of any such function, so chances are > that fork()'ing from inside a posix thread is not supported (?). It's definitely a possibility. libpthread in -current does support pthread_atfork, and I have a patch (below) that adds the same functionality to libc_r and libthr that I need to send-pr. Pointy hat to the original committer for breaking ABI compatibility. http://dan.allantgroup.com/FreeBSD/ -- Dan Nelson dnelson@allantgroup.com From owner-freebsd-threads@FreeBSD.ORG Tue Jun 22 20:19:11 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6294416A4CE; Tue, 22 Jun 2004 20:19:11 +0000 (GMT) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id E204F43D49; Tue, 22 Jun 2004 20:19:10 +0000 (GMT) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id i5MKIpon018521; Tue, 22 Jun 2004 16:18:51 -0400 (EDT) Date: Tue, 22 Jun 2004 16:18:51 -0400 (EDT) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Dan Nelson In-Reply-To: <20040622182632.GJ86471@dan.emsphone.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: hackers@freebsd.org cc: threads@freebsd.org cc: Chris Stenton Subject: Re: pthread - fork - execv problem X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jun 2004 20:19:11 -0000 On Tue, 22 Jun 2004, Dan Nelson wrote: > In the last episode (Jun 22), Nikos Ntarmos said: > > On 
Tue, Jun 22, 2004 at 09:56:33AM -0500, Dan Nelson wrote: > > > It may be an application bug. After a fork, both processes are > > > independent. The child should not be able to affect the parent > > > like this, unless the parent does something like holding a mutex > > > used by all the threads and calling wait(). > > > > ... or the child holding a mutex before the fork(2) syscall. FWIW the > > Linux info for libc and the NetBSD and Solaris man pages mention > > pthread_atfork(3), used to install handlers to take care of such > > cases. FreeBSD seems to not know of any such function, so chances are > > that fork()'ing from inside a posix thread is not supported (?). > > It's definitely a possibility. > > libpthread in -current does support pthread_atfork, and I have a patch > (below) that adds the same functionality to libc_r and libthr that I > need to send-pr. Pointy hat to the original committer for breaking ABI > compatibility. Whaa? Adding a function doesn't break ABI, and I don't want to maintain 3 thread libraries. 
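For readers unfamiliar with pthread_atfork(3), a minimal sketch of the usual pattern, using only the standard POSIX calls. The lock state_lock and the helper names (prepare, in_parent, in_child, install_handlers, fork_with_handlers) are hypothetical, chosen for illustration; this is not the libpthread implementation or Dan Nelson's patch. The prepare handler takes the lock before fork() so no other thread can hold it at the moment of the fork, and both sides release it afterwards.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A lock that other threads in the application might hold (hypothetical). */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_once_t handlers_once = PTHREAD_ONCE_INIT;

static void prepare(void)   { pthread_mutex_lock(&state_lock); }
static void in_parent(void) { pthread_mutex_unlock(&state_lock); }
static void in_child(void)  { pthread_mutex_unlock(&state_lock); }

/* Register the handlers exactly once; registering twice would lock twice. */
static void
install_handlers(void)
{
	pthread_atfork(prepare, in_parent, in_child);
}

/* Returns 0 if the child could take state_lock after the fork, else -1. */
static int
fork_with_handlers(void)
{
	pid_t pid;
	int status;

	pthread_once(&handlers_once, install_handlers);
	pid = fork();
	if (pid < 0)
		return (-1);
	if (pid == 0) {
		/* Without the handlers this could block forever if another
		 * thread had held state_lock when fork() was called. */
		pthread_mutex_lock(&state_lock);
		pthread_mutex_unlock(&state_lock);
		_exit(0);
	}
	if (waitpid(pid, &status, 0) != pid)
		return (-1);
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 0 : -1);
}
```

A child that goes straight to execv(), as in the original report, does not need any of this, since it never touches the parent's locks.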
-- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Tue Jun 22 21:08:40 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CB44916A4CF; Tue, 22 Jun 2004 21:08:40 +0000 (GMT) Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7DB5643D5A; Tue, 22 Jun 2004 21:08:40 +0000 (GMT) (envelope-from dan@dan.emsphone.com) Received: (from dan@localhost) by dan.emsphone.com (8.12.10/8.12.10) id i5ML8L2G051110; Tue, 22 Jun 2004 16:08:21 -0500 (CDT) (envelope-from dan) Date: Tue, 22 Jun 2004 16:08:21 -0500 From: Dan Nelson To: Daniel Eischen Message-ID: <20040622210820.GA17392@dan.emsphone.com> References: <20040622182632.GJ86471@dan.emsphone.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-OS: FreeBSD 5.2-CURRENT X-message-flag: Outlook Error User-Agent: Mutt/1.5.6i cc: hackers@freebsd.org cc: threads@freebsd.org cc: Chris Stenton Subject: Re: pthread - fork - execv problem X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jun 2004 21:08:41 -0000 In the last episode (Jun 22), Daniel Eischen said: > > libpthread in -current does support pthread_atfork, and I have a > > patch (below) that adds the same functionality to libc_r and libthr > > that I need to send-pr. Pointy hat to the original committer for > > breaking ABI compatibility. http://dan.allantgroup.com/FreeBSD/ > > Whaa? Adding a function doesn't break ABI, and I don't want to > maintain 3 thread libraries. It does if an application detects pthread_atfork during configure and uses it. 
You then can't use libmap to redirect libpthread to one of the other thread libraries for testing, since you'll get an undefined symbol error at runtime. Nikos Ntarmos also noticed that there's no pthread_atfork manpage. We could probably just use the Single Unix one. -- Dan Nelson dnelson@allantgroup.com From owner-freebsd-threads@FreeBSD.ORG Tue Jun 22 21:50:53 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8E11316A4CE for ; Tue, 22 Jun 2004 21:50:53 +0000 (GMT) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2F93643D2D for ; Tue, 22 Jun 2004 21:50:53 +0000 (GMT) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id i5MLoYon005443; Tue, 22 Jun 2004 17:50:34 -0400 (EDT) Date: Tue, 22 Jun 2004 17:50:34 -0400 (EDT) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Dan Nelson In-Reply-To: <20040622210820.GA17392@dan.emsphone.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: threads@freebsd.org cc: Chris Stenton Subject: Re: pthread - fork - execv problem X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jun 2004 21:50:53 -0000 On Tue, 22 Jun 2004, Dan Nelson wrote: > In the last episode (Jun 22), Daniel Eischen said: > > > libpthread in -current does support pthread_atfork, and I have a > > > patch (below) that adds the same functionality to libc_r and libthr > > > that I need to send-pr. Pointy hat to the original committer for > > > breaking ABI compatibility. http://dan.allantgroup.com/FreeBSD/ > > > > Whaa? 
Adding a function doesn't break ABI, and I don't want to > > maintain 3 thread libraries. > > It does if an application detects pthread_atfork during configure and > uses it. You then can't use libmap to redirect libpthread to one of > the other thread libraries for testing, since you'll get an undefined > symbol error at runtime. Bah. libc_r is marked for deprecation and libpthread is the default library in -current. > Nikos Ntarmos also noticed that there's no pthread_atfork manpage. We > could probably just use the Single Unix one. Yes, you can now that The Open Group have given us permission :-) -- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Thu Jun 24 22:58:24 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8167E16A4CF for ; Thu, 24 Jun 2004 22:58:24 +0000 (GMT) Received: from web13424.mail.yahoo.com (web13424.mail.yahoo.com [216.136.175.155]) by mx1.FreeBSD.org (Postfix) with SMTP id 5BD4143D1D for ; Thu, 24 Jun 2004 22:58:24 +0000 (GMT) (envelope-from pfgshield-pedro@yahoo.com) Message-ID: <20040624225729.21122.qmail@web13424.mail.yahoo.com> Received: from [63.171.232.246] by web13424.mail.yahoo.com via HTTP; Fri, 25 Jun 2004 00:57:29 CEST Date: Fri, 25 Jun 2004 00:57:29 +0200 (CEST) From: To: freebsd-threads@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Subject: Mach Cthreads?? X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jun 2004 22:58:24 -0000 Hi; I'm taking a "rest" and saw a brief mention of Mach Cthreads in a book. 
It's nice to have more than one high-quality thread implementation on FreeBSD, although having them all use the same API (POSIX) is not as interesting as having more variety would be. Just out of selfish curiosity: is porting Cthreads to use KSE feasible? I looked around for more information, but I only found references to the GNU Hurd (which probably has license restrictions and has been badly modified anyway) and Mac OS X, and nowhere I can download the packages or documentation. Any links are welcome although, I repeat, it's only for my selfish curiosity. cheers, Pedro. From owner-freebsd-threads@FreeBSD.ORG Fri Jun 25 00:25:58 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EDB2316A4CE for ; Fri, 25 Jun 2004 00:25:58 +0000 (GMT) Received: from rwcrmhc13.comcast.net (rwcrmhc13.comcast.net [204.127.198.39]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9D83043D46 for ; Fri, 25 Jun 2004 00:25:58 +0000 (GMT) (envelope-from julian@elischer.org) Received: from interjet.elischer.org ([24.7.73.28]) by comcast.net (rwcrmhc13) with ESMTP id <2004062500255701500cvgq1e>; Fri, 25 Jun 2004 00:25:58 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id RAA90638; Thu, 24 Jun 2004 17:25:56 -0700 (PDT) Date: Thu, 24 Jun 2004 17:25:54 -0700 (PDT) From: Julian Elischer To: pfgshield-pedro@yahoo.com In-Reply-To: <20040624225729.21122.qmail@web13424.mail.yahoo.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-threads@FreeBSD.org Subject: Re: Mach Cthreads?? 
X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Jun 2004 00:25:59 -0000 On Fri, 25 Jun 2004 pfgshield-pedro@yahoo.com wrote: > Hi; > > I'm taking a "rest" and saw a brief mention of Mach Cthreads in a book. It's > nice to have more than one high-quality thread implementation on > FreeBSD, although having them all use the same API (POSIX) is not as interesting > as having more variety would be. > > Just out of selfish curiosity: is porting Cthreads to use KSE > feasible? I looked around for more information, but I only found references to > the GNU Hurd (which probably has license restrictions and has been badly > modified anyway) and Mac OS X, and nowhere I can download the packages or > documentation. It's been a LONG time since I saw Cthreads, but I imagine it should be feasible. Mach had a similar proc/thread relationship to what we have implemented. > > Any links are welcome although, I repeat, it's only for my selfish curiosity. > > cheers, > > Pedro.