Date:      05 Jun 2003 14:38:09 +0200
From:      Kern Sibbald <kern@sibbald.com>
To:        Terry Lambert <tlambert2@mindspring.com>
Cc:        freebsd-threads@freebsd.org
Subject:   Re: FreeBSD pthread_equal "bug"
Message-ID:  <1054816689.13630.713.camel@rufus>
In-Reply-To: <3EDF3113.A785CEA4@mindspring.com>
References:  <Pine.GSO.4.10.10306041126030.13583-100000@pcnet5.pcnet.com> <3EDF3113.A785CEA4@mindspring.com>

Hello,

Yes, it is evident that no matter what one does, at
some point the thread ID will wrap around -- unless
you use a 64-bit counter. My program wouldn't have had
any problems if it wrapped around some "reasonable" time
later, but in fact, it "wrapped" on the very next thread.
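
For anyone who hits the same thing, here is a minimal sketch of the
kind of workaround that avoids comparing against a stale thread ID.
It is not my actual fix; the names (tracked_thread, tracked_start,
tracked_join, tracked_is) are hypothetical. The only assumption is
that a pthread_t is trusted solely between pthread_create() and a
successful pthread_join():

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record: the ID is only meaningful while "live" is true. */
struct tracked_thread {
    pthread_t tid;
    bool live;
};

/* Trivial thread body used for illustration. */
static void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

/* Start a thread and mark the record live. Returns 0 on success. */
int tracked_start(struct tracked_thread *t)
{
    assert(t != NULL);
    int rc = pthread_create(&t->tid, NULL, worker, NULL);
    t->live = (rc == 0);
    return rc;
}

/* Join the thread and mark the record dead. After this point the
 * implementation is free to hand the same ID to a new thread. */
int tracked_join(struct tracked_thread *t)
{
    assert(t != NULL);
    if (!t->live)
        return -1;
    int rc = pthread_join(t->tid, NULL);
    if (rc == 0)
        t->live = false;
    return rc;
}

/* Compare against "who" only while the record is live, so a reused
 * ID can never match a thread that has already been joined. */
bool tracked_is(const struct tracked_thread *t, pthread_t who)
{
    assert(t != NULL);
    return t->live && pthread_equal(t->tid, who) != 0;
}
```

The point of the design is that pthread_equal() is never asked about
an ID whose thread has been joined, so immediate reuse is harmless.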

I can see that I was wrong to classify the FreeBSD
behavior as a bug, but it would be valid to say that the
FreeBSD implementation is not as "robust" as the spec would
permit (or advise). Please see:

   http://www.opengroup.org/onlinepubs/007904975/toc.htm

Under "Rationale", the view they express ("additional
flexibility and robustness" ...) is what I consider the
preferred, more robust implementation.

I don't quite understand your point about locking up ld.so,
probably because I am not subscribed to the list, and it
isn't worth your time to explain it, but I certainly would
not advocate any change that destabilizes something.

On the other hand, a little loss of performance is a 
good thing if it helps applications run better and/or 
detect their bugs.

As I said, I leave it to you guys to decide what to
do or not to do. I know what I did wrong and have
long ago fixed it.

Best regards,

Kern

On Thu, 2003-06-05 at 14:01, Terry Lambert wrote:
> Kern Sibbald wrote:
> > > This is a bug in the application; the implementation is allowed
> > > to reuse thread id's and there are enough interfaces for an
> > > application to tell when a thread terminates (pthread_join).
> > 
> > I'm not sure what the POSIX specification says, but
> > if I were programming it, I would not be content
> > with the current FreeBSD implementation, especially
> > considering that both Solaris and Linux do it "correctly".
> 
> For some number of intervening threads less than 30,000.
> 
> 
> > This bug does not highlight bad applications because most
> > programmers will reasonably expect that pthread_equal() will
> > not be the same for two different threads.  It took me
> > a long time to find this problem because I just could not
> > imagine that pthread_equal() was not "working".
> 
> This is a statistical expectation, at best.  Technically,
> you should avoid anything that makes only statistical
> guarantees.
> 
> For example, it's possible to run without memory protection
> through address space separation: what you do is just make
> sure your physical memory size is statistically small compared
> to your available address space; for example, say I have 4G
> of physical RAM, and I have a 64 bit processor.  The chance
> of me "guessing" a valid page without faulting is 1:2^32,
> so I don't "need" to enforce address space separation, and
> I avoid all sorts of TLB shootdowns and protection domain
> crossing, etc.
> 
> Comparatively, your protection against a failure with the
> pthread_equal() call on the systems you are using as an
> example of "correct" is less than 1:2^15.
> 
> This example is particularly apt, since threads share a
> process address space.
> 
> 
> > This problem is extremely subtle and is likely to cause
> > unsuspecting applications long months of bizarre
> > behavior.
> 
> Yes.  This is the real problem: application expectations.
> 
> As I implied in my previous message, you have to draw a
> firm line between "application expectations" and "strict
> conformance of applications to the standard".  There are
> already several compromises in FreeBSD's implementations
> for application expectations; "in for a penny, in for a
> pound".
> 
> 
> > Fix it or not, that is your choice. Now that I know
> > that you don't handle it as I would expect, I can code
> > around it.
> 
> IMO, a request to "fix" this (provide an implementation
> kludge that will keep applications happy) is a lot more
> reasonable than locking up ld.so in contravention of the
> SUSv2 Chapter 12.  As you point out, all it would take
> is the addition of a generation count to the pthreads
> structures; if they are type-stable enough to reuse the
> same address, then a generation count is not unreasonable,
> and if they aren't type-stable enough, then it's not a
> problem in the first place (it's just your particular
> application has degenerate behaviour in memory reuse).
> 
> -- Terry
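
To make the generation count idea above concrete, here is a rough
sketch of how it could work. This is not FreeBSD's pthread code; the
slot table, MAX_SLOTS, and the handle_* names are all made up for
illustration. A handle pairs a reusable slot index with the
generation that slot had when it was allocated:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SLOTS 4  /* hypothetical table size, kept small on purpose */

/* A reusable (type-stable) slot and the handle that refers to it. */
struct slot   { uint32_t generation; bool in_use; };
struct handle { uint32_t index; uint32_t generation; };

static struct slot slots[MAX_SLOTS];

/* Take the first free slot; returns false if the table is full. */
bool handle_alloc(struct handle *out)
{
    assert(out != NULL);
    for (uint32_t i = 0; i < MAX_SLOTS; i++) {
        if (!slots[i].in_use) {
            slots[i].in_use = true;
            out->index = i;
            out->generation = slots[i].generation;
            return true;
        }
    }
    return false;
}

/* A handle is valid only if its slot is in use and the generation matches. */
bool handle_valid(struct handle h)
{
    return h.index < MAX_SLOTS &&
           slots[h.index].in_use &&
           slots[h.index].generation == h.generation;
}

/* Release the slot and bump its generation, which invalidates every
 * handle still held for the old occupant. */
void handle_free(struct handle h)
{
    if (!handle_valid(h))
        return;
    slots[h.index].in_use = false;
    slots[h.index].generation++;
}

/* Equality compares both fields, so a reused slot never compares
 * equal to a handle from an earlier generation. */
bool handle_equal(struct handle a, struct handle b)
{
    return a.index == b.index && a.generation == b.generation;
}
```

With this scheme a slot that is reused immediately still yields a
different handle. The 32-bit generation does eventually wrap, but
only after 2^32 reuses of the same slot rather than on the very
next thread.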


