Date: Fri, 29 Jan 2010 11:00:41 +0530
From: "C. Jayachandran" <c.jayachandran@gmail.com>
To: Juli Mallett <jmallett@freebsd.org>
Cc: freebsd-mips@freebsd.org, Neel Natu <neelnatu@gmail.com>
Subject: Re: Code review: groundwork for SMP
Message-ID: <98a59be81001282130n1776b31bn3f6995b6ef136ff0@mail.gmail.com>
In-Reply-To: <eaa228be1001282040r416f3764lde577786347a4d5e@mail.gmail.com>
References: <dffe84831001262336l1978797g8b12fab815f4eb52@mail.gmail.com>
 <20100128.132114.1004138037722505681.imp@bsdimp.com>
 <dffe84831001281401n7c9fb64bjb38260943448f315@mail.gmail.com>
 <66207A08-F691-4603-A6C5-9C675414C91E@lakerest.net>
 <eaa228be1001282040r416f3764lde577786347a4d5e@mail.gmail.com>
I'm new to this list, joined yesterday. I work at RMI (now Netlogic),
and did part of our internal port of FreeBSD 6.4 to XLR/XLS processors.

> So on your systems threads share the TLB?  Wired TLB entries can't be
> pulled out (in the case of the kernel stack it's basically
> catastrophic for that to happen.)  A compromise if your TLB entries
> are really at a premium is to use a single large entry (using, say, a
> single 32k page) that contains both PCPU and the kernel stack, or a
> page which has pointers to pcpu data, the kernel stack, etc.  I seem
> to recall seeing a port of FreeBSD that used the same storage for the
> kernel stack and PCPU data, but I could be mistaken.

Our CPUs can be configured so that the 4 threads in a core share its
64 TLB entries. You can also configure the threads so that each has 16
independent entries. But 16 is too few for running FreeBSD, so by
default we used the shared TLB mode.

> There are other trade-offs available, of course.  If we don't use the
> gp for accessing small data, we can keep a pointer to the pcpu data of
> a CPU in gp whenever the kernel is running, and then PCPU accesses are
> just a matter of loading from offset+gp, which is very quick, faster
> than the wired TLB entry mechanism, unless you use a virtual address
> for the pcpu in which case it can be painful.  As there are more
> things like VIMAGE, the amount of small global data in the kernel is
> going to fall and making gp a pcpu pointer makes more sense.  My old
> port used -G0 and I still disable use of the gp in my non-FreeBSD MIPS
> work; I think NetBSD used to but I haven't noticed what FreeBSD does.

Again on XLR processors, there are per-thread scratch registers in
COP0. So our preferred way of doing this was to have the per-cpu
pointer in one of these scratch registers.
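
To make the scratch-register idea concrete, here is a rough,
host-runnable sketch. It is not code from our port. The COP0 scratch
registers are simulated with a plain array indexed by hardware thread,
and every name in it (sketch_pcpu, sketch_scratch, SKETCH_PCPU_GET, and
so on) is made up for illustration. On real hardware the accessor would
read the thread's own scratch register and would take no thread
argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NTHREADS 4	/* hardware threads per core */

/* Hypothetical per-CPU structure; the fields are illustrative only. */
struct sketch_pcpu {
	uint32_t pc_cpuid;
	void *pc_curthread;
};

/* Per-CPU storage, set up once at boot. */
struct sketch_pcpu sketch_pcpu_area[SKETCH_NTHREADS];

/*
 * Stand-in for the per-thread COP0 scratch registers: one slot per
 * hardware thread.  This array is an assumption made so the sketch
 * can run on any host; it is not how the hardware is accessed.
 */
uintptr_t sketch_scratch[SKETCH_NTHREADS];

/* Store the per-CPU pointer in the (simulated) scratch register. */
void
sketch_set_pcpu(unsigned hwthread, struct sketch_pcpu *p)
{
	assert(hwthread < SKETCH_NTHREADS);
	sketch_scratch[hwthread] = (uintptr_t)p;
}

/* Fetch the per-CPU pointer: one register read, no TLB entry needed. */
struct sketch_pcpu *
sketch_get_pcpu(unsigned hwthread)
{
	assert(hwthread < SKETCH_NTHREADS);
	return (struct sketch_pcpu *)sketch_scratch[hwthread];
}

/* Platform-specific accessor macro hiding where the pointer lives. */
#define SKETCH_PCPU_GET(hwthread, member) \
	(sketch_get_pcpu(hwthread)->pc_ ## member)

/* Boot-time setup: point each thread's register at its own area. */
void
sketch_pcpu_init(void)
{
	for (unsigned i = 0; i < SKETCH_NTHREADS; i++) {
		sketch_pcpu_area[i].pc_cpuid = i;
		sketch_pcpu_area[i].pc_curthread = NULL;
		sketch_set_pcpu(i, &sketch_pcpu_area[i]);
	}
}
```

Since callers only see the macro, a platform could back it with gp, a
scratch register, or a wired mapping without touching the users.
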
We can also get the TLB out of the way for some of these by reserving
a KSEG0 region at startup for them and for the stack. I agree with
Randall here: the preferred way is to avoid wiring TLB entries. Can't
we reserve some area for this at start-up and access the pointer
through a platform-specific macro?

> More curiosity than anything (since I don't seem to be able to get an
> RMI system to develop on): if the threads are sharing the TLB, how are
> updates to TLB-related fields synchronized?  How do you atomically
> increase the wired count of the TLB?  How does 'tlbwr' work?  Do you
> have to use special instructions when you're sharing the TLB that are
> XLR-specific?

Each thread has its own COP0 registers, but they all update the core's
TLB, so there are no special TLB instructions.

Regards,
JC.

-- 
C. Jayachandran    c.jayachandran@gmail.com
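
A rough sketch of the KSEG0 reservation idea above, runnable on any
host. It relies on the standard MIPS32 layout, where KSEG0 is the
unmapped, cached window at 0x80000000 covering the low 512 MB of
physical memory, so addresses in it never need a TLB entry. The bump
allocator and all names (boot_reserve_init, boot_reserve_kseg0, and so
on) are hypothetical and not taken from any existing port.

```c
#include <assert.h>
#include <stdint.h>

#define KSEG0_BASE 0x80000000u
#define KSEG0_SIZE 0x20000000u	/* 512 MB window */

/* Translate between physical addresses and their KSEG0 aliases. */
uint32_t
phys_to_kseg0(uint32_t pa)
{
	assert(pa < KSEG0_SIZE);	/* only low memory is reachable */
	return pa | KSEG0_BASE;
}

uint32_t
kseg0_to_phys(uint32_t va)
{
	return va & (KSEG0_SIZE - 1);
}

/* Hypothetical boot-time bump allocator over free physical memory. */
static uint32_t boot_next_pa;
static uint32_t boot_end_pa;

void
boot_reserve_init(uint32_t start_pa, uint32_t end_pa)
{
	assert(start_pa <= end_pa && end_pa <= KSEG0_SIZE);
	boot_next_pa = start_pa;
	boot_end_pa = end_pa;
}

/*
 * Reserve 'size' bytes aligned to 'align' (a power of two) and return
 * the KSEG0 address, or 0 if the region is exhausted.  Memory handed
 * out here is never returned, which suits pcpu data and boot stacks.
 */
uint32_t
boot_reserve_kseg0(uint32_t size, uint32_t align)
{
	uint32_t pa;

	assert(align != 0 && (align & (align - 1)) == 0);
	pa = (boot_next_pa + align - 1) & ~(align - 1);
	if (pa > boot_end_pa || size > boot_end_pa - pa)
		return 0;
	boot_next_pa = pa + size;
	return phys_to_kseg0(pa);
}
```

Each CPU's pcpu area and kernel stack could be carved out this way
during early startup, with the resulting pointer kept wherever the
platform macro expects it.
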