Date:      Fri, 20 Jul 2012 09:58:49 +0200
From:      John Marino <freebsdml@marino.st>
To:        Alfred Perlstein <alfred@freebsd.org>
Cc:        freebsd-threads@freebsd.org
Subject:   Re: Fingerpointing about broken Ada tasking starting with FreeBSD 9.0 threading
Message-ID:  <50090FB9.6050606@marino.st>
In-Reply-To: <20120719212326.GN98608@elvis.mu.org>
References:  <500854EC.3040305@marino.st> <20120719212326.GN98608@elvis.mu.org>

On 7/19/2012 23:23, Alfred Perlstein wrote:
> Hey John,
>
> I find the best way to figure stuff like this out would be to
> instrument the code.
>
> I think what could happen here is simply adding a FILE,LINE to the struct
> thread and have THR_CRITICAL_ENTER record the last place it was called
> by stuffing the current __FILE__ and __LINE__ into those variables.
>
> Then when you hit that assertion you can dump the last place.
>
> The only problem you could face with such a system is a false positive
> if the code goes multiple levels deep, you'll probably want to clear
> the data there when you see a THR_CRITICAL_LEAVE.
>
> Then if in your assertion you see that it's clear/NULL then you want to
> probably implement a static stack and use (thrd)->critical_count and
> (thrd)->locklevel as indecies to respective traceback stacks.
>
> It really shouldn't take more than a few hours to write the instrumentation
> code and I could see it staying inside the code under a PTHREAD_HEAVY_DEBUG
> flag if needed.

Hi Alfred,
Thanks for providing some techniques that can perhaps help track down 
what's going on.
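
A minimal sketch of the instrumentation described in the quoted text,
assuming a stand-in thread structure rather than libthr's real one.
Every name prefixed with dbg_/DBG_ is hypothetical; only the
critical_count field name and the idea of recording __FILE__ and
__LINE__ on enter, and clearing them on leave, come from the suggestion
above. Each nesting level gets its own slot, indexed by critical_count,
so nested sections do not overwrite each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DBG_MAX_DEPTH 16    /* hypothetical cap on recorded nesting depth */

struct dbg_site {
    const char *file;       /* __FILE__ of the enter call, NULL when cleared */
    int         line;       /* __LINE__ of the enter call */
};

/* Stand-in for the real thread structure; NOT libthr's struct pthread. */
struct dbg_thread {
    int             critical_count;
    struct dbg_site crit_sites[DBG_MAX_DEPTH];
};

/* Record the call site in the slot for the current depth, then go deeper. */
static void
dbg_critical_enter(struct dbg_thread *t, const char *file, int line)
{
    assert(t != NULL);
    if (t->critical_count >= 0 && t->critical_count < DBG_MAX_DEPTH) {
        t->crit_sites[t->critical_count].file = file;
        t->crit_sites[t->critical_count].line = line;
    }
    t->critical_count++;
}

/* Come back up one level and clear that level's recorded site. */
static void
dbg_critical_leave(struct dbg_thread *t)
{
    assert(t != NULL);
    t->critical_count--;
    if (t->critical_count >= 0 && t->critical_count < DBG_MAX_DEPTH) {
        t->crit_sites[t->critical_count].file = NULL;
        t->crit_sites[t->critical_count].line = 0;
    }
}

#define DBG_CRITICAL_ENTER(t) dbg_critical_enter((t), __FILE__, __LINE__)
#define DBG_CRITICAL_LEAVE(t) dbg_critical_leave((t))

/*
 * Intended for the point where the exit-time assertion would fire:
 * print every enter site that was never matched by a leave and
 * return how many were printed.
 */
static int
dbg_report_unbalanced(const struct dbg_thread *t, FILE *out)
{
    int reported = 0;

    assert(t != NULL);
    for (int i = 0; i < t->critical_count && i < DBG_MAX_DEPTH; i++) {
        if (t->crit_sites[i].file != NULL) {
            fprintf(out, "critical level %d entered at %s:%d\n",
                i, t->crit_sites[i].file, t->crit_sites[i].line);
            reported++;
        }
    }
    return reported;
}
```

Calling dbg_report_unbalanced() just before the exit-time check would
name the enter sites that were never paired with a leave. The same
pattern, with a second array indexed by locklevel, would cover the lock
side.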

I'm still interested in the big picture, though.
We've got a package that runs on FreeBSD 6, 7, and 8 but broke on 9.
Likewise, on DragonFly the "critical_count" field is at the expected
value of 0 on thread exit.

The new thread panic caused a regression, which raises some questions:
Was it necessary to put this panic there?
What are the consequences of continuing?
Were these resources being "held" before, when a mutex was used, and
simply not detected? (If so, the consequences can't be severe, so why
panic?)
Has there been any other fallout from this change?

I'm guessing your first inclination would be to blame GNAT and say that
if the critical count is wrong, something must not be getting cleaned
up. You may be right, but the fact remains that software that builds and
runs on FreeBSD 6, 7, and 8 doesn't run on 9.  I assume that was unintended.

John




