Date:      Fri, 29 Jan 2010 11:00:41 +0530
From:      "C. Jayachandran" <c.jayachandran@gmail.com>
To:        Juli Mallett <jmallett@freebsd.org>
Cc:        freebsd-mips@freebsd.org, Neel Natu <neelnatu@gmail.com>
Subject:   Re: Code review: groundwork for SMP
Message-ID:  <98a59be81001282130n1776b31bn3f6995b6ef136ff0@mail.gmail.com>
In-Reply-To: <eaa228be1001282040r416f3764lde577786347a4d5e@mail.gmail.com>
References:  <dffe84831001262336l1978797g8b12fab815f4eb52@mail.gmail.com> <20100128.132114.1004138037722505681.imp@bsdimp.com> <dffe84831001281401n7c9fb64bjb38260943448f315@mail.gmail.com> <66207A08-F691-4603-A6C5-9C675414C91E@lakerest.net> <eaa228be1001282040r416f3764lde577786347a4d5e@mail.gmail.com>

I'm new to this list, joined yesterday.  I work at RMI (now Netlogic),
and did part of our internal port of FreeBSD 6.4 to XLR/XLS
processors.

> So on your systems threads share the TLB?  Wired TLB entries can't be
> pulled out (in the case of the kernel stack it's basically
> catastrophic for that to happen.)  A compromise if your TLB entries
> are really at a premium is to use a single large entry (using, say, a
> single 32k page) that contains both PCPU and the kernel stack, or a
> page which has pointers to pcpu data, the kernel stack, etc.  I seem
> to recall seeing a port of FreeBSD that used the same storage for the
> kernel stack and PCPU data, but I could be mistaken.

Our CPUs can be configured so that the 4 threads in a core share its
64 TLB entries. You could also configure the threads so that they have
16 independent entries each.  But 16 is too few for running FreeBSD,
so by default we used the shared TLB mode.

> There are other trade-offs available, of course.  If we don't use the
> gp for accessing small data, we can keep a pointer to the pcpu data of
> a CPU in gp whenever the kernel is running, and then PCPU accesses are
> just a matter of loading from offset+gp, which is very quick -- faster
> than the wired TLB entry mechanism, unless you use a virtual address
> for the pcpu in which case it can be painful.  As there are more
> things like VIMAGE, the amount of small global data in the kernel is
> going to fall and making gp a pcpu pointer makes more sense.  My old
> port used -G0 and I still disable use of the gp in my non-FreeBSD MIPS
> work -- I think NetBSD used to but I haven't noticed what FreeBSD does.

Again on XLR processors, there are per-thread scratch registers in
COP0. So our preferred way of doing this was to keep the per-cpu
pointer in one of these scratch registers.  We can also take the TLB
out of the picture for the per-cpu data and the stack by reserving a
KSEG0 region for them at startup.
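
To make the KSEG0 idea concrete, here is a minimal sketch of the address
arithmetic.  The segment constants are the standard MIPS32 ones (KSEG0 is
the unmapped, cached window at 0x80000000 onto the first 512 MB of
physical memory).  Everything else is an assumption for illustration:
`struct boot_reserve` and `boot_reserve_alloc` are hypothetical names, not
anything from the RMI port or the FreeBSD tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard MIPS32 layout: KSEG0 is unmapped (no TLB lookup) and cached. */
#define KSEG0_BASE UINT32_C(0x80000000)
#define KSEG0_SIZE UINT32_C(0x20000000)   /* 512 MB window */

/* True if a physical address can be reached through KSEG0. */
static bool phys_in_kseg0(uint32_t pa) { return pa < KSEG0_SIZE; }

static uint32_t phys_to_kseg0(uint32_t pa) { return pa | KSEG0_BASE; }
static uint32_t kseg0_to_phys(uint32_t va) { return va & (KSEG0_SIZE - 1); }

/* Hypothetical boot-time bump allocator over a physical range [next_pa, end_pa). */
struct boot_reserve {
    uint32_t next_pa;
    uint32_t end_pa;
};

/*
 * Carve `size` bytes (aligned to `align`, a power of two) out of the range
 * and return a KSEG0 virtual address for it, or 0 on failure.  Memory
 * handed out this way never needs a TLB entry, wired or otherwise.
 */
static uint32_t boot_reserve_alloc(struct boot_reserve *r, uint32_t size,
                                   uint32_t align)
{
    uint32_t pa = (r->next_pa + (align - 1)) & ~(align - 1);

    if (pa < r->next_pa || pa > r->end_pa || size > r->end_pa - pa)
        return 0;                       /* overflow or out of space */
    if (!phys_in_kseg0(pa) || r->end_pa > KSEG0_SIZE)
        return 0;                       /* range not KSEG0-addressable */
    r->next_pa = pa + size;
    return phys_to_kseg0(pa);
}
```

Calling this once per CPU for the pcpu area and once for the kernel stack
would give addresses that are valid before any TLB setup, at the cost of
requiring that the memory sit in the low 512 MB.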

I agree with Randall here: the preferred way is to avoid wiring
TLB entries.  Can't we reserve some area for this at start-up and access
the pointer through a platform-specific macro?
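
Roughly what I have in mind for the macro, as a sketch only.  The scratch
register is simulated with an array so that this compiles anywhere; on
real hardware the two PLATFORM_* macros would presumably become an
mtc0/mfc0 of whichever register the platform picks (or a load of gp, in
Juli's scheme).  All names here are hypothetical (`SK_PCPU_GET`,
`PLATFORM_PCPU_PTR`, `sim_scratch`), and `struct pcpu` is a trimmed
stand-in, not the real FreeBSD structure.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_MAXCPU 4

/* Trimmed stand-in for the per-CPU structure (assumption, not the real one). */
struct pcpu {
    uint32_t pc_cpuid;
    void    *pc_curthread;
};

/* Simulation of one scratch register per hardware thread. */
static uintptr_t sim_scratch[SK_MAXCPU];
static unsigned  sim_current_thread;   /* which thread is "executing" */

/* Platform hooks: the only part a port would need to provide. */
#define PLATFORM_PCPU_SET(p) \
    (sim_scratch[sim_current_thread] = (uintptr_t)(p))
#define PLATFORM_PCPU_PTR() \
    ((struct pcpu *)sim_scratch[sim_current_thread])

/* Machine-independent accessor built on top of the platform hook. */
#define SK_PCPU_GET(member) (PLATFORM_PCPU_PTR()->pc_##member)

/* Area reserved at start-up; no wired TLB entry is involved. */
static struct pcpu boot_pcpu[SK_MAXCPU];

/* Per-CPU start-up: fill in the structure and publish the pointer. */
static void cpu_startup_sketch(unsigned cpu)
{
    sim_current_thread = cpu;
    boot_pcpu[cpu].pc_cpuid = cpu;
    boot_pcpu[cpu].pc_curthread = NULL;
    PLATFORM_PCPU_SET(&boot_pcpu[cpu]);
}
```

The point of the split is that the accessor does not care where the
pointer lives, so a platform with scratch registers and a platform using
gp or a wired entry could share everything above the two hooks.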

> More curiosity than anything (since I don't seem to be able to get an
> RMI system to develop on): if the threads are sharing the TLB, how are
> updates to TLB-related fields synchronized?  How do you atomically
> increase the wired count of the TLB?  How does 'tlbwr' work?  Do you
> have to use special instructions when you're sharing the TLB that are
> XLR-specific?

Each thread has its own COP0 registers, but they update the core's
TLB, so there are no special TLB instructions.
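
A toy model of that arrangement, in case it helps picture it.  This is
only an illustration of "private COP0 registers, one TLB array per core";
it is not a description of the XLR hardware beyond that sentence, it
models only indexed write and read (tlbwi/tlbr style), and it says nothing
about how concurrent updates are ordered.  All type and function names
are made up for the sketch.

```c
#include <stdint.h>

#define TLB_ENTRIES      64
#define THREADS_PER_CORE 4

struct tlb_entry {
    uint32_t entryhi, entrylo0, entrylo1;
};

/* The subset of COP0 state each hardware thread keeps privately. */
struct cop0_state {
    uint32_t index, entryhi, entrylo0, entrylo1;
};

struct core_model {
    struct tlb_entry  tlb[TLB_ENTRIES];          /* one per core, shared */
    struct cop0_state cop0[THREADS_PER_CORE];    /* one per thread */
};

/* Indexed write: copy the calling thread's registers into the shared TLB. */
static void model_tlbwi(struct core_model *c, unsigned thread)
{
    const struct cop0_state *r = &c->cop0[thread];
    struct tlb_entry *e = &c->tlb[r->index % TLB_ENTRIES];

    e->entryhi  = r->entryhi;
    e->entrylo0 = r->entrylo0;
    e->entrylo1 = r->entrylo1;
}

/* Indexed read: load a shared TLB entry into the calling thread's registers. */
static void model_tlbr(struct core_model *c, unsigned thread)
{
    struct cop0_state *r = &c->cop0[thread];
    const struct tlb_entry *e = &c->tlb[r->index % TLB_ENTRIES];

    r->entryhi  = e->entryhi;
    r->entrylo0 = e->entrylo0;
    r->entrylo1 = e->entrylo1;
}
```

In this model an entry written by one thread is visible to a read from any
other thread on the same core, which is why software still has to agree on
who owns which index.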

Regards,
JC.

--
C. Jayachandran    c.jayachandran@gmail.com


