From: Neel Natu
To: Juli Mallett
Cc: freebsd-mips@freebsd.org
Date: Fri, 29 Jan 2010 07:25:31 -0800
Subject: Re: Code review: groundwork for SMP

Thanks Juli, Randall and JC for the comments.

I think it is fair to ask that we don't burn another TLB entry to store
the pcpu data. So it might help if I went through the options I
considered before settling on this one:

- One of the first things I investigated was using per-cpu scratch
  registers, but the Sibyte did not have any and they are not part of
  the MIPS architecture.

- The second thing I considered was using a platform-specific
  getcpuid() to index into the struct pcpu pcpu[MAXCPU] array to
  compute the KSEG0 address of pcpu at runtime. However, this turned
  out to be a bit messy because there are consumers of getcpuid() in
  exception context, where we are restricted to using only k0 and k1
  (and sometimes only one of them). Also, as Juli pointed out,
  getcpuid() is slow on some cpus, and I did not want to assume that
  one could write getcpuid() using a single k0/k1 register.

So, having the pcpu pointer in a TLB entry divorces us from any
assumptions about the CPU we are running on.

I think that there is a legitimate concern about this on the XLR - but
given that you are sharing the TLB among 4 threads, I think there is
the bigger issue of the wired kstack entries that you need to solve
before even thinking about pcpu mapping.

I did not consider the approach suggested by Juli, where the pcpu and
kstack pointers can be stashed in a single wired TLB entry. I need some
time to chew on it and prototype it.

I would still like to commit this so as to keep making progress on the
SMP support. This is a small piece of the bigger goal of getting SMP
functional and can be replaced in the future if need be.
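
To make the difference between the second option and the TLB-entry
approach a bit more concrete, here is a rough sketch in plain C. It is
not the code under review: the struct layout, the physical address of
the array, the fixed virtual address and both function names are made
up for illustration. Only the KSEG0 base (0x80000000 on 32-bit MIPS) is
a real architectural value.

```c
#include <stdint.h>

#define KSEG0_BASE      0x80000000u  /* unmapped, cached segment on 32-bit MIPS */
#define PCPU_ARRAY_PA   0x00400000u  /* made-up physical address of pcpu[] */
#define PCPU_FIXED_VA   0xc0000000u  /* made-up mapped kernel address */

/* Trimmed-down stand-in for struct pcpu, padded to 64 bytes. */
struct pcpu_sketch {
	uint32_t pc_cpuid;
	uint32_t pc_pad[15];
};

/*
 * Option 2: the address depends on the CPU id, so the caller first
 * needs getcpuid() and then a multiply-add before it can touch pcpu.
 */
static uint32_t
pcpu_kseg0_addr(uint32_t cpuid)
{
	return ((KSEG0_BASE | PCPU_ARRAY_PA) +
	    cpuid * (uint32_t)sizeof(struct pcpu_sketch));
}

/*
 * Wired-TLB variant: the virtual address is a constant. Which physical
 * page sits behind it is decided by each CPU's own wired entry, so the
 * CPU id never enters the address computation.
 */
static uint32_t
pcpu_fixed_addr(uint32_t cpuid)
{
	(void)cpuid;                 /* deliberately unused */
	return (PCPU_FIXED_VA);
}
```

The first function is the part that is awkward in exception context,
since the index has to be obtained and scaled using only k0/k1. The
second needs no run-time input at all.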

best
Neel

On Thu, Jan 28, 2010 at 10:42 PM, Juli Mallett wrote:
> On Thu, Jan 28, 2010 at 21:28, Randall Stewart wrote:
>>> [ Using a single wired TLB entry for kstack and pcpu ]
>>
>> Which means you have a big array that you are offsetting.
>
> Not really - you can have a structure at 0xc000000000000000u (or the
> same >> 32) with two pointers in it, even, one to pcpu and one to
> KSTACK_PAGES direct-mapped, contiguous pages.  Then you can load the
> kstack address or the pcpu base very quickly.  Of course, you can even
> have a single wired entry consisting of the pcpu data and then put a
> pointer to the top of the kstack in it.  I don't think you can get by
> with no wired TLB entries, but you also don't have to index a big
> array.  The nice thing about setting up a per-CPU TLB entry (you have
> to set up at least one, the kstack, in order to be able to handle
> exceptions) is that then you need only access offsets into it that are
> known at compile time and constant no matter what CPU you're running
> on.  Load the kstack by doing "ld sp, 0(0xc...)" and load the pcpu
> address by doing "ld t0, 8(0xc....)".  Two wired entries lets you get
> rid of the indirection, but you can get by with one and still not have
> to do (1) run-time computation of the index into some array (2)
> possibly very expensive getting of the cpuid.
>
>> I was even thinking get a LARGE entry.. one that is say 8 Meg
>> for the kernel.. covering all text/data etc... with this
>> new super page stuff. of course I have never looked into how
>> its implemented..
>
> That would be easy to do, but what would be the benefits of accessing
> that data through a wired TLB entry instead of the direct map?
>
>> Yes, you pay an index reference for every access .. or at
>> least one to setup a pointer.. but I think that it much cheaper
>> than a TLB miss is... (words for imp to think about)...
>
> Yes, TLB misses are very slow.  Your desire to avoid adding another
> wired entry seems pretty reasonable.  I think that using a single
> wired TLB entry for a mux or for both the kstack and pcpu is easy and
> usable.  I feel like just wiring the kstack and putting a
> direct-mapped, sometimes-recomputed pointer to the pcpu into gp is the
> best combination in the long run - even just loading an immediate
> 64-bit address is pretty slow wrt how often things in the PCPU are
> accessed in SMP kernels.
>
> Juli.
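
For what it's worth, the single-wired-entry layout described in the
quoted text could be sketched roughly as follows. This is only a
user-space illustration: the structure and function names are
hypothetical, and the per-CPU mapping that a wired TLB entry would
provide is simulated here with one backing copy per CPU, selected by an
explicit cpu argument. On real hardware that selection would be
implicit in the TLB and the loads would be the two "ld" instructions
from the quote.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NCPU_SKETCH 2                /* assumed CPU count, illustration only */

/* Two pointers at compile-time-constant offsets from the fixed address. */
struct wired_slot {
	uint64_t ws_kstack;          /* offset 0: "ld sp, 0(base)" */
	uint64_t ws_pcpu;            /* offset 8: "ld t0, 8(base)" */
};

/* One backing copy per CPU; stands in for the per-CPU wired mapping. */
static struct wired_slot wired_backing[NCPU_SKETCH];

/* Fill in the slot for one CPU, as boot code might do once per CPU. */
static void
wired_slot_init(unsigned int cpu, uint64_t kstack, uint64_t pcpu)
{
	assert(cpu < NCPU_SKETCH);
	wired_backing[cpu].ws_kstack = kstack;
	wired_backing[cpu].ws_pcpu = pcpu;
}

/* Load a 64-bit word at a constant byte offset from the slot base. */
static uint64_t
load_at(const void *base, size_t off)
{
	uint64_t v;

	memcpy(&v, (const unsigned char *)base + off, sizeof(v));
	return (v);
}

/* Same offsets on every CPU; only the backing copy differs. */
static uint64_t
cur_kstack(unsigned int cpu)
{
	return (load_at(&wired_backing[cpu], 0));
}

static uint64_t
cur_pcpu(unsigned int cpu)
{
	return (load_at(&wired_backing[cpu], 8));
}
```

The point of the sketch is that neither accessor computes an array
index or asks for the cpuid; the cost is one extra load of indirection
compared with having two separate wired entries.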