From owner-freebsd-hackers  Tue Dec  4 10:46:38 2001
Delivered-To: freebsd-hackers@freebsd.org
Received: from elvis.mu.org (elvis.mu.org [216.33.66.196])
	by hub.freebsd.org (Postfix) with ESMTP
	id 73B6637B405; Tue,  4 Dec 2001 10:46:30 -0800 (PST)
Received: by elvis.mu.org (Postfix, from userid 1192)
	id F00BF81D01; Tue,  4 Dec 2001 12:46:24 -0600 (CST)
Date: Tue, 4 Dec 2001 12:46:24 -0600
From: Alfred Perlstein <bright@mu.org>
To: Daniel Eischen <deischen@gdeb.com>
Cc: Dan Eischen <eischen@vigrid.com>,
	Louis-Philippe Gagnon <louisphilippe@macadamian.com>,
	freebsd-current@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject: Re: Possible libc_r pthread bug
Message-ID: <20011204124624.L92148@elvis.mu.org>
References: <094601c179ea$7cca85c0$2964a8c0@MACADAMIAN.com> <Pine.SUN.3.91.1011130170847.14642A-100000@pcnet1.pcnet.com> <20011204021815.E92148@elvis.mu.org> <3C0CC2FE.275F4C68@vigrid.com> <20011204114236.H92148@elvis.mu.org> <3C0D1680.E3461FB@gdeb.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <3C0D1680.E3461FB@gdeb.com>; from deischen@gdeb.com on Tue, Dec 04, 2001 at 01:31:28PM -0500
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk

* Daniel Eischen <deischen@gdeb.com> [011204 12:32] wrote:
> Alfred Perlstein wrote:
> > 
> > * Dan Eischen <eischen@vigrid.com> [011204 06:26] wrote:
> > >
> > > There are already cancellation tests when resuming threads
> > > whose contexts are not saved as a result of a signal interrupt
> > > (ctxtype != CTX_UC). You shouldn't test for cancellation when
> > > ctxtype == CTX_UC because you are running on the scheduler
> > > stack, not the thread's stack.
> > 
> > That makes sense, but why?
> 
> Because when a thread gets cancelled, pthread_exit gets called
> which then calls the scheduler again.  It is also possible to
> get interrupted during this process and the thread's context
> (which is operating on the scheduler stack) could get saved.
> The scheduler could get entered again, and if the thread
> gets resumed, it'll longjmp to the saved context which is the
> scheduler stack (and which was just trashed by entering the
> scheduler again).
> 
> It is too confusing to try to handle conditions like this, and
> the threads library doesn't need to get any more confusing ;-)
> Once the scheduler is entered, no pthread routines should
> be called and the scheduler should not be recursively
> entered.  The only way out of the scheduler should be a
> longjmp or sigreturn to a saved thread's context.
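
To make the "no recursive entry" rule concrete, here is a minimal
sketch of a re-entrancy guard. It is not libc_r code: in_sched is
only a hypothetical analogue of _thread_kern_in_sched, and
toy_sched_enter() and nested_rejected are names invented for the
illustration. Unlike the real scheduler, which is left via longjmp
or sigreturn, this toy simply returns.

```c
/* Hypothetical analogue of _thread_kern_in_sched; not the libc_r variable. */
int in_sched = 0;

/* Counts refused nested entries, for illustration only. */
int nested_rejected = 0;

/*
 * Toy scheduler entry point.  Returns 0 if the scheduler ran,
 * -1 if the call was refused because the scheduler was already active.
 * try_reenter simulates a routine that calls back into the scheduler
 * while it is running.
 */
int toy_sched_enter(int try_reenter)
{
	if (in_sched) {
		/* Already inside the scheduler: refuse to recurse. */
		nested_rejected++;
		return -1;
	}
	in_sched = 1;

	if (try_reenter)
		(void)toy_sched_enter(0);	/* attempted recursive entry */

	/*
	 * The real scheduler clears its flag on the resume path after a
	 * longjmp; this toy clears it on an ordinary return instead.
	 */
	in_sched = 0;
	return 0;
}
```

Calling toy_sched_enter(1) runs the outer entry and refuses the
nested one, so the state the outer call depends on is never
touched by a second entry.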

OK, for the sake of beating a clue into me...

In uthread_kern.c, _thread_kern_sched() contains:

                /* Save the state of the current thread: */
                if (_setjmp(curthread->ctx.jb) == 0) {
                        /* Flag the jump buffer was the last state saved: */
                        curthread->ctxtype = CTX_JB_NOSIG;
                        curthread->longjmp_val = 1;
                } else {
                        DBG_MSG("Returned from ___longjmp, thread %p\n",
                            curthread);
                        /*
                         * This point is reached when a longjmp() is called
                         * to restore the state of a thread.
                         *
                         * This is the normal way out of the scheduler.
                         */
                        _thread_kern_in_sched = 0;

                        if (curthread->sig_defer_count == 0) {
                                if (((curthread->cancelflags &
                                    PTHREAD_AT_CANCEL_POINT) == 0) &&
                                    ((curthread->cancelflags &
                                    PTHREAD_CANCEL_ASYNCHRONOUS) != 0))
                                        /*
                                         * Cancellations override signals.
                                         *
                                         * Stick a cancellation point at the
                                         * start of each async-cancellable
                                         * thread's resumption.
                                         *
                                         * We allow threads woken at cancel
                                         * points to do their own checks.
                                         */
                                        pthread_testcancel();
                        }

Why isn't this "working"?  Shouldn't it be doing the right thing?
What if curthread->sig_defer_count weren't tested?
Maybe this should be a test for curthread->sig_defer_count <= 1 instead?
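
To see exactly which resumptions reach pthread_testcancel() in the
excerpt, the condition can be pulled out into a predicate and driven
by a toy setjmp/longjmp round trip. This is only a sketch under
stated assumptions: struct toy_thread, the CANCEL_ASYNC and
AT_CANCEL_POINT values, should_testcancel(), toy_resume() and
run_once() are all made up for the illustration and are not libc_r
definitions. Only the shape of the test is taken from the code above.

```c
#include <setjmp.h>

/* Hypothetical flag values; the real PTHREAD_* constants may differ. */
enum { CANCEL_ASYNC = 0x1, AT_CANCEL_POINT = 0x2 };

/* Toy stand-in for the fields of curthread used in the excerpt. */
struct toy_thread {
	jmp_buf	jb;
	int	cancelflags;
	int	sig_defer_count;
};

/* Same shape as the test guarding pthread_testcancel() in the excerpt. */
int should_testcancel(const struct toy_thread *t)
{
	return t->sig_defer_count == 0 &&
	    (t->cancelflags & AT_CANCEL_POINT) == 0 &&
	    (t->cancelflags & CANCEL_ASYNC) != 0;
}

/* Toy "scheduler": immediately resumes the saved context. */
void toy_resume(struct toy_thread *t)
{
	longjmp(t->jb, 1);
}

/*
 * Save a context, get resumed, and report whether the resume path
 * would have run the cancellation check (1) or skipped it (0).
 */
int run_once(struct toy_thread *t)
{
	if (setjmp(t->jb) == 0) {
		/* First return from setjmp: context saved, hand off. */
		toy_resume(t);
		return -1;	/* not reached: toy_resume() never returns */
	}
	/* Second return, via longjmp: the "thread" has been resumed. */
	return should_testcancel(t);
}
```

With sig_defer_count at 1 the predicate is false, so in this toy a
thread that is deferring signals when it is resumed skips the check
entirely, as does one flagged as being at a cancellation point.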

I'll play with this some more when I get back to my box at home,
it just seems bizarro to me.


-- 
-Alfred Perlstein [alfred@freebsd.org]
'Instead of asking why a piece of software is using "1970s technology,"
 start asking why software is ignoring 30 years of accumulated wisdom.'
                           http://www.morons.org/rants/gpl-harmful.php3
