From owner-freebsd-threads@FreeBSD.ORG  Wed Jan 30 19:41:18 2008
Return-Path: <owner-freebsd-threads@FreeBSD.ORG>
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7E1B516A420
	for <freebsd-threads@freebsd.org>; Wed, 30 Jan 2008 19:41:18 +0000 (UTC)
	(envelope-from julian@elischer.org)
Received: from outD.internet-mail-service.net (outD.internet-mail-service.net
	[216.240.47.227])
	by mx1.freebsd.org (Postfix) with ESMTP id 6114613C46B
	for <freebsd-threads@freebsd.org>; Wed, 30 Jan 2008 19:41:18 +0000 (UTC)
	(envelope-from julian@elischer.org)
Received: from mx0.idiom.com (HELO idiom.com) (216.240.32.160)
	by out.internet-mail-service.net (qpsmtpd/0.40) with ESMTP;
	Wed, 30 Jan 2008 11:41:17 -0800
Received: from julian-mac.elischer.org (localhost [127.0.0.1])
	by idiom.com (Postfix) with ESMTP id 33814127059;
	Wed, 30 Jan 2008 11:41:17 -0800 (PST)
Message-ID: <47A0D2DE.9060005@elischer.org>
Date: Wed, 30 Jan 2008 11:41:18 -0800
From: Julian Elischer <julian@elischer.org>
User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031)
MIME-Version: 1.0
To: Landon Fuller <landonf@bikemonkey.org>
References: <200801240850.m0O8o2JQ023500@freefall.freebsd.org>	<4798564B.7070500@elischer.org>	<Pine.GSO.4.64.0801240957550.16059@sea.ntplx.net>	<488DBC6A-CF33-4E50-B1BB-C396C8957F92@bikemonkey.org>	<Pine.GSO.4.64.0801291611130.12689@sea.ntplx.net>	<892A73B3-0114-4718-ABC0-CADD45D9D0FA@bikemonkey.org>	<Pine.GSO.4.64.0801292320260.15096@sea.ntplx.net>
	<D3DFD2DE-E6A3-4373-AC37-42C3CAB9B963@bikemonkey.org>
In-Reply-To: <D3DFD2DE-E6A3-4373-AC37-42C3CAB9B963@bikemonkey.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-threads@freebsd.org
Subject: Re: threads/119920: fork broken in libpthread
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Threading on FreeBSD <freebsd-threads.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-threads>, 
	<mailto:freebsd-threads-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-threads>
List-Post: <mailto:freebsd-threads@freebsd.org>
List-Help: <mailto:freebsd-threads-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-threads>,
	<mailto:freebsd-threads-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 30 Jan 2008 19:41:18 -0000

Landon Fuller wrote:
> 
> On Jan 29, 2008, at 20:23, Daniel Eischen wrote:
> 
>> On Tue, 29 Jan 2008, Landon Fuller wrote:
>>
>>>
>>> On Jan 29, 2008, at 13:13, Daniel Eischen wrote:
>>>> There is a bug somewhere else or something is stomping
>>>> on the thread's lockuser.  It is allocated once when the
>>>> thread is created and should never be null thereafter.
>>>> Hence, it should never be malloc'd and the reinit should
>>>> be sufficient.
>>>
>>> I'm totally unfamiliar with KSE, so perhaps this a stupid question -- 
>>> it seems to solve the issue locally, so I'll ask it --
>>> Why not place the fork() code inside of _kse_critical_enter / 
>>> _kse_critical_leave, to ensure upcalls are blocked while 
>>> re-initializing in the child process post-fork?
>>
>> That just prevents an upcall from happening (which my patch solves),
>> but doesn't prevent the corruption of the lockuser or lock.
> 
> My assumption was that lockuser was being corrupted in a post-fork 
> upcall; after wrapping fork in kse_critical_enter(), and can not 
> reproduce the lockuser-reallocation I was seeing before.
> 
> Is my assumption that kse_critical_enter() prevents any other code from 
> being run during the critical section incorrect?

My understanding (not being a userland expert) is that
critical-enter sets a flag in the thread's mailbox telling the
kernel that no matter what happens, it must not allow another
thread to pre-empt this one. This means that any system call
(or page fault or whatever) will be handled synchronously. if
it blocks, that schedulable entity will block and no upcall
will be made to schedule another thread.

It does not in any way however ensure that any other thread changes 
its behaviour.
Dan, is my understanding correct?

> 
> --- thread/thr_fork.c.orig      2008-01-30 11:08:45.000000000 -0800
> +++ thread/thr_fork.c   2008-01-30 11:09:36.000000000 -0800
> @@ -93,12 +93,16 @@
>         if (_kse_isthreaded() != 0) {
>                 _spinlock(__malloc_lock);
>         }
> +
> +       kse_critical_t crit = _kse_critical_enter();
> +
>         if ((ret = __sys_fork()) == 0) {
>                 /* Child process */
>                 errsave = errno;
> 
>                 /* Kernel signal mask is restored in _kse_single_thread */
>                 _kse_single_thread(curthread);
> +               _kse_critical_leave(crit);
> 
>                 /* Run down atfork child handlers. */
>                 TAILQ_FOREACH(af, &_thr_atfork_list, qe) {
> @@ -107,6 +111,7 @@
>                 }
>                 _thr_mutex_reinit(&_thr_atfork_mutex);
>         } else {
> +               _kse_critical_leave(crit);
>                 if ((_kse_isthreaded() != 0) && (__malloc_lock != NULL)) {
>                         _spinunlock(__malloc_lock);
>                 }
> 
>> My patch does solve this in -current, but -stable probably lacks
>> a few other patches.  It (-stable) really needs all of -current's
>> code, not just this patch.
> 
> 
> Is this change viable for an errata fix for 6.3?
> 
> -landonf