From owner-freebsd-threads@FreeBSD.ORG Wed Jan 30 19:41:18 2008 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7E1B516A420 for ; Wed, 30 Jan 2008 19:41:18 +0000 (UTC) (envelope-from julian@elischer.org) Received: from outD.internet-mail-service.net (outD.internet-mail-service.net [216.240.47.227]) by mx1.freebsd.org (Postfix) with ESMTP id 6114613C46B for ; Wed, 30 Jan 2008 19:41:18 +0000 (UTC) (envelope-from julian@elischer.org) Received: from mx0.idiom.com (HELO idiom.com) (216.240.32.160) by out.internet-mail-service.net (qpsmtpd/0.40) with ESMTP; Wed, 30 Jan 2008 11:41:17 -0800 Received: from julian-mac.elischer.org (localhost [127.0.0.1]) by idiom.com (Postfix) with ESMTP id 33814127059; Wed, 30 Jan 2008 11:41:17 -0800 (PST) Message-ID: <47A0D2DE.9060005@elischer.org> Date: Wed, 30 Jan 2008 11:41:18 -0800 From: Julian Elischer User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Landon Fuller References: <200801240850.m0O8o2JQ023500@freefall.freebsd.org> <4798564B.7070500@elischer.org> <488DBC6A-CF33-4E50-B1BB-C396C8957F92@bikemonkey.org> <892A73B3-0114-4718-ABC0-CADD45D9D0FA@bikemonkey.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-threads@freebsd.org Subject: Re: threads/119920: fork broken in libpthread X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Jan 2008 19:41:18 -0000 Landon Fuller wrote: > > On Jan 29, 2008, at 20:23, Daniel Eischen wrote: > >> On Tue, 29 Jan 2008, Landon Fuller wrote: >> >>> >>> On Jan 29, 2008, at 13:13, Daniel Eischen wrote: >>>> There is a bug somewhere else or something is stomping >>>> on the thread's lockuser. It is allocated once when the >>>> thread is created and should never be null thereafter. >>>> Hence, it should never be malloc'd and the reinit should >>>> be sufficient. >>> >>> I'm totally unfamiliar with KSE, so perhaps this a stupid question -- >>> it seems to solve the issue locally, so I'll ask it -- >>> Why not place the fork() code inside of _kse_critical_enter / >>> _kse_critical_leave, to ensure upcalls are blocked while >>> re-initializing in the child process post-fork? >> >> That just prevents an upcall from happening (which my patch solves), >> but doesn't prevent the corruption of the lockuser or lock. > > My assumption was that lockuser was being corrupted in a post-fork > upcall; after wrapping fork in kse_critical_enter(), and can not > reproduce the lockuser-reallocation I was seeing before. > > Is my assumption that kse_critical_enter() prevents any other code from > being run during the critical section incorrect? My understanding (not being a userland expert) is that critical-enter sets a flag in the thread's mailbox telling the kernel that no matter what happens, it must not allow another thread to pre-empt this one. This means that any system call (or page fault or whatever) will be handled synchronously. if it blocks, that schedulable entity will block and no upcall will be made to schedule another thread. It does not in any way however ensure that any other thread changes its behaviour. Dan, is my understanding correct? > > --- thread/thr_fork.c.orig 2008-01-30 11:08:45.000000000 -0800 > +++ thread/thr_fork.c 2008-01-30 11:09:36.000000000 -0800 > @@ -93,12 +93,16 @@ > if (_kse_isthreaded() != 0) { > _spinlock(__malloc_lock); > } > + > + kse_critical_t crit = _kse_critical_enter(); > + > if ((ret = __sys_fork()) == 0) { > /* Child process */ > errsave = errno; > > /* Kernel signal mask is restored in _kse_single_thread */ > _kse_single_thread(curthread); > + _kse_critical_leave(crit); > > /* Run down atfork child handlers. */ > TAILQ_FOREACH(af, &_thr_atfork_list, qe) { > @@ -107,6 +111,7 @@ > } > _thr_mutex_reinit(&_thr_atfork_mutex); > } else { > + _kse_critical_leave(crit); > if ((_kse_isthreaded() != 0) && (__malloc_lock != NULL)) { > _spinunlock(__malloc_lock); > } > >> My patch does solve this in -current, but -stable probably lacks >> a few other patches. It (-stable) really needs all of -current's >> code, not just this patch. > > > Is this change viable for an errata fix for 6.3? > > -landonf