From: Terry Lambert
Date: Thu, 05 Jun 2003 05:01:23 -0700
To: Kern Sibbald
cc: Daniel Eischen
cc: freebsd-threads@freebsd.org
Subject: Re: FreeBSD pthread_equal "bug"
Message-ID: <3EDF3113.A785CEA4@mindspring.com>
References: <1054745115.13630.517.camel@rufus>

Kern Sibbald wrote:
> > This is a bug in the application; the implementation is allowed
> > to reuse thread id's and there are enough interfaces for an
> > application to tell when a thread terminates (pthread_join).
>
> I'm not sure what the POSIX specification says,
> if I were programming it, I would not be content
> with the FreeBSD current implementation especially
> considering that both Solaris and Linux do it "correctly".

For some number of intervening threads less than 30,000.

> This bug does not highlight bad applications because most
> programmers will reasonably expect that pthread_equal() will
> not be the same for two different threads.  It took me
> a long time to find this problem because I just could not
> imagine that pthread_equal() was not "working".

This is a statistical expectation, at best.  Technically, you
should avoid anything that makes only statistical guarantees.

For example, it's possible to run without memory protection
through address space separation: what you do is just make sure
your physical memory size is statistically small compared to your
available address space; for example, say I have 4G of physical
RAM, and I have a 64 bit processor.  The chance of me "guessing"
a valid page without faulting is 1:2^32, so I don't "need" to
enforce address space separation, and I avoid all sorts of TLB
shootdowns and protection domain crossing, etc.

Comparatively, your protection against a failure with the
pthread_equal() call on the systems you are using as an example
of "correct" is less than 1:2^15.

This example is particularly apt, since threads share a process
address space.

> This problem is extremely subtle and is likely to cause
> unsuspecting applications long months of bizarre
> behavior.

Yes.  This is the real problem: application expectations.

As I implied in my previous message, you have to draw a firm
line between "application expectations" and "strict conformance
of applications to the standard".  There are already several
compromises in FreeBSD's implementations for application
expectations; "in for a penny, in for a pound".

> Fix it or not, that is your choice.  Now that I know
> that you don't handle it as I would suspect I can code
> around it.

IMO, a request to "fix" this (provide an implementation kludge
that will keep applications happy) is a lot more reasonable than
locking up ld.so in contravention of SUSv2 Chapter 12.
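
For the "code around it" route, one possible application-side approach is to stop comparing pthread_t values of threads that may already have exited, and to give each thread a serial number of the application's own. The following is only a sketch under that assumption: app_thread_serial(), report_serial() and run_once() are hypothetical names, not part of any pthread API, and _Thread_local assumes a C11 compiler.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical application-level identity; none of this is pthread API. */
static pthread_mutex_t serial_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t next_serial = 1;            /* 0 means "not assigned yet" */
static _Thread_local uint64_t my_serial;    /* one copy per thread (C11) */

/* Return a serial number that is never handed out twice in this process. */
static uint64_t app_thread_serial(void)
{
    if (my_serial == 0) {
        pthread_mutex_lock(&serial_lock);
        my_serial = next_serial++;
        pthread_mutex_unlock(&serial_lock);
    }
    return my_serial;
}

/* Thread body: record the calling thread's serial in the caller's variable. */
static void *report_serial(void *arg)
{
    *(uint64_t *)arg = app_thread_serial();
    return NULL;
}

/* Start a thread, join it, and return the serial it reported. */
static uint64_t run_once(void)
{
    pthread_t tid;
    uint64_t serial = 0;
    int rc;

    rc = pthread_create(&tid, NULL, report_serial, &serial);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);   /* after this, tid may be reused */
    assert(rc == 0);
    return serial;
}
```

Two successive run_once() calls return different serials whether or not the implementation hands the second thread the same pthread_t as the first, so the application's notion of "same thread" no longer depends on id reuse.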

As you point out, all it would take is the addition of a
generation count to the pthreads structures; if they are
type-stable enough to reuse the same address, then a generation
count is not unreasonable, and if they aren't type-stable enough,
then it's not a problem in the first place (it's just that your
particular application has degenerate behaviour in memory reuse).

-- Terry
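
The generation-count idea can be sketched as follows. This is an illustration only, not FreeBSD's libc_r or libpthread code: it assumes the thread id is effectively the address of a per-thread structure drawn from a type-stable pool, and struct thr_slot, struct thr_handle, thr_alloc(), thr_release() and thr_equal() are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4

/* Stand-in for the library's per-thread structure (hypothetical). */
struct thr_slot {
    uint32_t generation;    /* bumped every time the slot is handed out */
    bool     in_use;
};

/* What an id comparison would look at: address plus generation. */
struct thr_handle {
    struct thr_slot *slot;
    uint32_t         generation;
};

/* Type-stable storage: slots are recycled, never returned to the heap. */
static struct thr_slot pool[POOL_SIZE];

/* Hand out the first free slot, stamping the handle with a new generation. */
static struct thr_handle thr_alloc(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = true;
            pool[i].generation++;
            return (struct thr_handle){ &pool[i], pool[i].generation };
        }
    }
    return (struct thr_handle){ NULL, 0 };  /* pool exhausted */
}

/* Mark the slot free; only the current owner's handle may release it. */
static void thr_release(struct thr_handle h)
{
    assert(h.slot != NULL && h.slot->generation == h.generation);
    h.slot->in_use = false;
}

/* Equal only if both the address and the generation match. */
static bool thr_equal(struct thr_handle a, struct thr_handle b)
{
    return a.slot == b.slot && a.generation == b.generation;
}
```

With this layout, a handle kept from a released slot has the same address as the next handle allocated from that slot, but thr_equal() reports them as different. A 32-bit counter wraps after 2^32 reuses of one slot, so this is still a statistical guarantee, just a far stronger one than the roughly 1:2^15 figure above.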