Date: Mon, 25 Nov 1996 14:01:32 +0100
From: Poul-Henning Kamp <phk@critter.tfs.com>
To: Peter Wemm <peter@spinner.dialix.com>
Cc: freebsd-smp@freebsd.org
Subject: Re: cvs commit: sys/i386/i386 locore.s swtch.s sys/i386/include pmap.h
Message-ID: <1858.848926892@critter.tfs.com>
In-Reply-To: Your message of "Mon, 25 Nov 1996 20:38:15 +0800." <199611251238.UAA00709@spinner.DIALix.COM>

In message <199611251238.UAA00709@spinner.DIALix.COM>, Peter Wemm writes:
>Poul-Henning Kamp wrote:
>> > Log:
>> >   Implement part 1 of per-cpu private pages.
>>
>> You'd better get dyson rolled out right away... ;-)
>
>Yeah, I've spoken to DG on and off about this, and each time we talked
>about it we ended up at the conclusion that it was a bit of a kludge
>but it was probably the most efficient way of doing it.

s/the most/among the most/

I don't doubt that, I just hate the perspectives, that's all :-)

>And yes, there are short-term kludges present that will go away shortly,
>so don't shoot me just yet for what I did in swtch.s... :-)  It's MUCH
>cleaner than it was as we got it from the original code.

If nothing else because locore.s was cleaned up as part of trying to
understand the previous stuff :-)

Yes, the initial stuff was certainly not smart about this, and avoiding
disturbing the pmap as much as possible is undoubtedly a good idea.

>I hope John's not going to have too much of a heart attack.. :-]  It doesn't
>interfere with the pmap code or the vm system in general, apart from making
>the maximum npkt space smaller by 4MB.  I'm pretty sure I can get this
>to fly tonight, and will probably be able to get rid of the idle procs
>and smp_idleloop() in the process.

Yeah, well, you still have to keep separate PTD's for each CPU, and make
sure to update them all, AND to tell all the cpu's about it when you
do -- that's the real trouble I bet.

I'm still of the opinion that sticking the logical cpu# in %gs or some
other >=8bit register we abduct for the purpose, at least whenever we
enter the kernel, and using that as index into arrays will be less pain,
and maybe more efficient on top of that, but since I'm not SMPactive at
this time I'll not stand in your way...  And if it works, then hey...
I'm game.

The only real benefit I see to this scheme is that you can put the
per-cpu idle-kernel-stack somewhere and not worry about it.
As long as it fits in the 4K minus the data we stick there.

My one particular grief about this is that we will still have to make it

	extern struct mpstruct mps;
	#define curproc mps.mp_curproc
	...

to avoid debugger people shooting us.

Having accepted that, we could easily make it an #ifdef if you want one
model or the other:

	#ifdef PER_CPU_PAGE
	extern struct mpstruct mps;
	#define curproc mps.mp_curproc
	#else
	#if PHKS_SCREWBALL_GS_METHOD
	int __inline__ CPUNUMBER() { asm mumble "mov %gs, %eax" }
	#else
	int __inline__ CPUNUMBER() { asm mumble "get it from the apic" }
	#endif
	extern struct mpstruct mps[MAXCPU];
	#define curproc mps[CPUNUMBER()].mp_curproc
	#endif

And performance/debugging comparisons will be much simpler.

(I know, there will be some ugliness in some .s code, but that is a minor
nuisance compared to the benefit, I think.)

Please at least consider this detour for a moment.

--
Poul-Henning Kamp           | phk@FreeBSD.ORG       FreeBSD Core-team.
http://www.freebsd.org/~phk | phk@login.dknet.dk    Private mailbox.
whois: [PHK]                | phk@ref.tfs.com       TRW Financial Systems, Inc.
Future will arrive by its own means, progress not so.
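
For reference, the #ifdef switch sketched in the message can be tried out
in user space. What follows is only a rough illustration under stated
assumptions, not the code that went into the tree: the contents of struct
proc and struct mpstruct, the value of MAXCPU, the fake_cpu_id variable
(a stand-in for reading %gs or the APIC id) and the helpers set_curproc()
and curpid() are all hypothetical. In the PER_CPU_PAGE case a real kernel
would map a different physical page at the same virtual address on each
CPU; here it is a single plain global.

```c
#include <assert.h>
#include <stddef.h>

#define MAXCPU 4                      /* assumed limit, not from the message */

struct proc { int p_pid; };           /* hypothetical stand-in for the real struct */

struct mpstruct {
	struct proc *mp_curproc;      /* process currently running on this CPU */
};

#ifdef PER_CPU_PAGE
/* Model 1: one mps at a fixed address (per-CPU mapping not simulated). */
struct mpstruct mps;
#define curproc (mps.mp_curproc)
#else
/* Model 2: array indexed by logical CPU number. */
static int fake_cpu_id;               /* hypothetical: replaces %gs / APIC read */

static inline int
CPUNUMBER(void)
{
	assert(fake_cpu_id >= 0 && fake_cpu_id < MAXCPU);
	return fake_cpu_id;
}

struct mpstruct mps[MAXCPU];
#define curproc (mps[CPUNUMBER()].mp_curproc)
#endif

/* Callers use curproc the same way under either model. */
static void
set_curproc(struct proc *p)
{
	curproc = p;
}

/* Returns the pid of the current CPU's process, or -1 if none is set. */
static int
curpid(void)
{
	return curproc != NULL ? curproc->p_pid : -1;
}
```

Building once with -DPER_CPU_PAGE and once without leaves set_curproc()
and curpid() untouched, which is the point of hiding the choice behind the
curproc macro: only the definition of the macro changes between models.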