Date:      Thu, 28 Jan 2010 21:47:04 -0800
From:      Randall Stewart <rrs@lakerest.net>
To:        "C. Jayachandran" <c.jayachandran@gmail.com>
Cc:        freebsd-mips@freebsd.org, Neel Natu <neelnatu@gmail.com>
Subject:   Re: Code review: groundwork for SMP
Message-ID:  <37F434F8-C845-4A20-8188-CA26FB7B8C5C@lakerest.net>
In-Reply-To: <98a59be81001282130n1776b31bn3f6995b6ef136ff0@mail.gmail.com>
References:  <dffe84831001262336l1978797g8b12fab815f4eb52@mail.gmail.com> <20100128.132114.1004138037722505681.imp@bsdimp.com> <dffe84831001281401n7c9fb64bjb38260943448f315@mail.gmail.com> <66207A08-F691-4603-A6C5-9C675414C91E@lakerest.net> <eaa228be1001282040r416f3764lde577786347a4d5e@mail.gmail.com> <98a59be81001282130n1776b31bn3f6995b6ef136ff0@mail.gmail.com>

comments in-line..

On Jan 28, 2010, at 9:30 PM, C. Jayachandran wrote:

> I'm new to this list, joined yesterday.  I work at RMI (now Netlogic),
> and did part of our internal port of FreeBSD 6.4 to XLR/XLS
> processors.
>
>> So on your systems threads share the TLB?  Wired TLB entries can't be
>> pulled out (in the case of the kernel stack it's basically
>> catastrophic for that to happen.)  A compromise if your TLB entries
>> are really at a premium is to use a single large entry (using, say, a
>> single 32k page) that contains both PCPU and the kernel stack, or a
>> page which has pointers to pcpu data, the kernel stack, etc.  I seem
>> to recall seeing a port of FreeBSD that used the same storage for the
>> kernel stack and PCPU data, but I could be mistaken.
>
> Our cpus can be configured in a way that they share the 64 TLB entries
> among the 4 threads in the core. You could also configure the threads
> so that they have 16 independent entries each.  But 16 is too few for
> running FreeBSD, so by default we used the shared TLB mode.
>
>> There are other trade-offs available, of course.  If we don't use the
>> gp for accessing small data, we can keep a pointer to the pcpu data of
>> a CPU in gp whenever the kernel is running, and then PCPU accesses are
>> just a matter of loading from offset+gp, which is very quick, faster
>> than the wired TLB entry mechanism, unless you use a virtual address
>> for the pcpu in which case it can be painful.  As there are more
>> things like VIMAGE, the amount of small global data in the kernel is
>> going to fall and making gp a pcpu pointer makes more sense.  My old
>> port used -G0 and I still disable use of the gp in my non-FreeBSD MIPS
>> work.  I think NetBSD used to but I haven't noticed what FreeBSD
>> does.
>
> Again on XLR processors, there are per-thread scratch registers in
> COP0. So our preferred way of doing this was to have the per-cpu
> pointer in one of these scratch registers.  We can also get the TLB
> out of the way for some of these by reserving a KSEG0 region on startup
> for these and for the stack.

Hmm, I wonder whether other processors also have a per-CPU scratch
register we can use. This might be an easy way forward. You load
the scratch reg with the pcpup (I remember seeing that in the
6.x port of yours) and then reference that whenever you want
to use it.

Do all of the MIPS processors have these scratch registers (I guess
I should scope that: do the ones we care about have a scratch
register?), or is this an optional feature?

Another question is what the cost is, cycle-wise, of accessing
this register.

Far better, I am sure, than a TLB miss in user space due to
lack of TLB entries.

But I wonder how it compares to what an indexed access such as

pcpup = &pcpu[getcpuid()];

would cost.
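
A rough sketch of the two lookups being compared, in portable C. Everything
here is a stand-in for illustration: MAXCPU, the struct layout, fake_cpuid
and fake_scratch are hypothetical, and a real port would read the CPU id and
the scratch register from COP0 rather than from ordinary variables. It only
shows the shape of each access path and says nothing about the cycle cost on
real hardware, which is the open question.

```c
#include <assert.h>
#include <stddef.h>

#define MAXCPU 4                    /* hypothetical CPU count for this sketch */

struct pcpu {
	unsigned int pc_cpuid;      /* illustrative field only */
};

struct pcpu pcpu[MAXCPU];

/*
 * Stand-ins for hardware state.  On real hardware these would be COP0
 * reads; here they are plain variables so the sketch builds anywhere.
 */
unsigned int fake_cpuid;            /* value getcpuid() reports */
struct pcpu *fake_scratch[MAXCPU];  /* one "scratch register" per CPU */

unsigned int
getcpuid(void)
{
	return (fake_cpuid);
}

/* Option 1: indexed access, i.e. pcpup = &pcpu[getcpuid()]. */
struct pcpu *
pcpu_by_index(void)
{
	unsigned int id = getcpuid();

	assert(id < MAXCPU);
	return (&pcpu[id]);
}

/* Option 2, setup: store the pointer once when the CPU comes up. */
void
pcpu_scratch_init(void)
{
	fake_scratch[fake_cpuid] = pcpu_by_index();
}

/* Option 2, use: read the pointer back with no indexing arithmetic. */
struct pcpu *
pcpu_by_scratch(void)
{
	return (fake_scratch[fake_cpuid]);
}
```

With the scratch variant, the index computation happens once at startup and
every later access is a single register read, so the comparison comes down to
the cost of that read versus a CPU id read plus a multiply-and-add.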

R




>
> I agree with Randall here, the preferred way is to avoid wiring the
> TLB entries.  Can't we reserve some area for this at start-up and keep
> the pointer in a platform-specific macro?
>
>> More curiosity than anything (since I don't seem to be able to get an
>> RMI system to develop on): if the threads are sharing the TLB, how are
>> updates to TLB-related fields synchronized?  How do you atomically
>> increase the wired count of the TLB?  How does 'tlbwr' work?  Do you
>> have to use special instructions when you're sharing the TLB that are
>> XLR-specific?
>
> Each thread has its own COP0 registers, but they update the core's
> TLB, so there are no special TLB instructions.
>
> Regards,
> JC.
>
> --
> C. Jayachandran    c.jayachandran@gmail.com
>

------------------------------
Randall Stewart
803-317-4952 (cell)
803-345-0391(direct)



