From owner-freebsd-sparc64@FreeBSD.ORG  Wed Jun 15 23:34:50 2011
Return-Path: <owner-freebsd-sparc64@FreeBSD.ORG>
Delivered-To: freebsd-sparc64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DFC721065670
	for <freebsd-sparc64@freebsd.org>; Wed, 15 Jun 2011 23:34:50 +0000 (UTC)
	(envelope-from marius@alchemy.franken.de)
Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214])
	by mx1.freebsd.org (Postfix) with ESMTP id 7A2298FC08
	for <freebsd-sparc64@freebsd.org>; Wed, 15 Jun 2011 23:34:49 +0000 (UTC)
Received: from alchemy.franken.de (localhost [127.0.0.1])
	by alchemy.franken.de (8.14.4/8.14.4/ALCHEMY.FRANKEN.DE) with ESMTP id
	p5FNYjhI093521; Thu, 16 Jun 2011 01:34:45 +0200 (CEST)
	(envelope-from marius@alchemy.franken.de)
Received: (from marius@localhost)
	by alchemy.franken.de (8.14.4/8.14.4/Submit) id p5FNYjWt093520;
	Thu, 16 Jun 2011 01:34:45 +0200 (CEST) (envelope-from marius)
Date: Thu, 16 Jun 2011 01:34:45 +0200
From: Marius Strobl <marius@alchemy.franken.de>
To: Peter Jeremy <peterjeremy@acm.org>
Message-ID: <20110615233445.GZ7064@alchemy.franken.de>
References: <20110526234728.GA69750@server.vk2pj.dyndns.org>
	<20110527120659.GA78000@alchemy.franken.de>
	<20110601231237.GA5267@server.vk2pj.dyndns.org>
	<20110608224801.GB35494@alchemy.franken.de>
	<20110613235144.GA12470@server.vk2pj.dyndns.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20110613235144.GA12470@server.vk2pj.dyndns.org>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-sparc64@freebsd.org
Subject: Re: 'make -j16 universe' gives SIReset
X-BeenThere: freebsd-sparc64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the Sparc <freebsd-sparc64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-sparc64>, 
	<mailto:freebsd-sparc64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-sparc64>
List-Post: <mailto:freebsd-sparc64@freebsd.org>
List-Help: <mailto:freebsd-sparc64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-sparc64>,
	<mailto:freebsd-sparc64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Jun 2011 23:34:51 -0000

On Tue, Jun 14, 2011 at 09:51:44AM +1000, Peter Jeremy wrote:
> On 2011-Jun-09 00:48:01 +0200, Marius Strobl <marius@alchemy.franken.de> wrote:
> >This might be due to the excessive use of sched_lock by SCHED_4BSD
> >and the MD code, e.g. more CPUs means fewer TLB contexts per CPU, which
> >in turn means more flushes that are protected by sched_lock.
> 
> I have noticed that systat reports very high trap & fault counts.

That's basically expected; on USIII and later, FreeBSD just flushes
all unlocked TLB entries when it needs to flush the userland mappings
and accepts the resulting TLB misses for the kernel ones, instead of
traversing the TLBs for userland mappings and removing just those.
OpenSolaris does the same thing, and IIRC there actually isn't a way
to traverse the large TLBs. Given that the TLB contexts are divided
evenly among the cores, the more cores there are in the machine, the
more flushes and misses occur. Previously, FreeBSD shared the
contexts, which meant TLB shootdown IPIs even for non-shared PMAPs.
So the question is whether there's some point at which that older
approach actually costs less performance than accepting the TLB
misses. This seems unlikely, though, and AFAIK the current approach
actually is inspired by Solaris Internals.

> 
> I got a "spinlock held too long" panic that should have gone to DDB
> but the system wouldn't respond to anything other than a RSC reset.
> 

You could check whether the patch below reduces the lock coverage
enough to avoid these. For stable/8 you'll probably need to apply
the second hunk by hand.

Marius

Index: pmap.c
===================================================================
--- pmap.c	(revision 223042)
+++ pmap.c	(working copy)
@@ -2217,11 +2217,10 @@ pmap_activate(struct thread *td)
 	struct pmap *pm;
 	int context;
 
+	critical_enter();
 	vm = td->td_proc->p_vmspace;
 	pm = vmspace_pmap(vm);
 
-	mtx_lock_spin(&sched_lock);
-
 	context = PCPU_GET(tlb_ctx);
 	if (context == PCPU_GET(tlb_ctx_max)) {
 		tlb_flush_user();
@@ -2229,17 +2228,18 @@ pmap_activate(struct thread *td)
 	}
 	PCPU_SET(tlb_ctx, context + 1);
 
+	mtx_lock_spin(&sched_lock);
 	pm->pm_context[curcpu] = context;
 	CPU_OR(&pm->pm_active, PCPU_PTR(cpumask));
 	PCPU_SET(pmap, pm);
+	mtx_unlock_spin(&sched_lock);
 
 	stxa(AA_DMMU_TSB, ASI_DMMU, pm->pm_tsb);
 	stxa(AA_IMMU_TSB, ASI_IMMU, pm->pm_tsb);
 	stxa(AA_DMMU_PCXR, ASI_DMMU, (ldxa(AA_DMMU_PCXR, ASI_DMMU) &
 	    TLB_CXR_PGSZ_MASK) | context);
 	flush(KERNBASE);
-
-	mtx_unlock_spin(&sched_lock);
+	critical_exit();
 }
 
 void