From: Terry Lambert
Reply-To: tlambert2@mindspring.com
Date: Wed, 22 Aug 2001 00:29:28 -0700
To: Mitsuru IWASAKI
Cc: peter@wemm.org, arch@FreeBSD.ORG, audit@FreeBSD.ORG, kumabu@t3.rim.or.jp
Subject: Re: CFR: Timing to enable CR4.PGE bit

Mitsuru IWASAKI wrote:
> > This part is fine.
>
> OK, I'll commit this one first.

What does setting PGE early do for you?

I use PGE to avoid TLB shootdown on a number of memory regions
shared between user and kernel space (including zero system
call time functions), but setting it early seems wrong.

Specifically, the conceptual idea is to make a VM that looks
exactly like real memory, with the smallest relocation code
chunk possible, so that as much as possible can be done in
C code, and there's as little strangeness as possible (e.g.
the evil that is machdep.c, and the "magic" numbers in pmap.h
that have to match exactly the magic address at which the
kernel gets linked, and have to be offset exactly by the SMP
pages and other "off by one" hidden values).

> > However:
> >
> > > Also I have another thing to be confirmed.  Should we utilize TLB by
> > > enabling PGE bit at very later stage?  I think it would be more
> > > efficient to cache page entries with G flag in multi-user environment,
> > > not in kernel bootstrap.  If we enable PGE bit in locore.s, TLB could
> > > be occupied by entries which is referenced by initialization code
> > > (yes, most of them are executed only once).
> > > # but I could be wrong...

PGE might be useful for shared libraries.

It's set on the kernel itself, which means that trapping to
kernel mode does not end up costing unnecessary overhead.

It's kind of ugly, when the 4M page is set on the kernel,
which loses the page table page for the 4k pages (yuck), and
it's not nice for the case where the kernel gets larger than
4M.

From a practical point of view, the hassle of having to set
and unset the PGE bit in CR4 (around a CR3 reload) to cause
the TLB shootdown to occur is not really worth setting the
PGE bit so early that you do not have most of the PTE's set
up.

> > The G bit does not "lock" the TLB entries in.  All it does is stop
> > unnecessary flushes when %cr3 is changed.  If entries are not used
> > for a short while, they will be recycled when the TLB slot is needed
> > for something else soon enough.  ie: this should not be a problem.

It also stops necessary ones, unless you bounce it off, hit
CR3, and bounce it back on... that's the strange code around
the 4M page enable code.

> My point is that users need higher system performance in multi-user
> environment rather than in kernel bootstrap.  Also PGE bit has effects
> in multi-user environment where %cr3 is changed frequently.
> I think enabling PGE in early stage of kernel bootstrap won't give us
> performance advantages, entries which is used in bootstrap will remain
> in the TLB as Intel's document says;
> ----
> 3.7. TRANSLATION LOOKASIDE BUFFERS (TLBS)
> [snip]
> When the processor loads a page-directory or page-table entry for a
> global page into a TLB, the entry will remain in the TLB indefinitely.
> The only way to deterministically invalidate global page entries is to
> clear the PGE flag and then invalidate the TLBs or to use the INVLPG
> instruction to invalidate individual page-directory or page-table
> entries in the TLBs.
> ----

The INVLPG doesn't work exactly like you think it should,
with PGE on, on more recent processors, unfortunately.

> According to i386/locore.s, it seems that PTEs for kernel text, data,
> bss and symbols have PG_G bit, I worry that it is enough many to fill
> TLB slot out...

The kernel is in a 4M page in most cases, so it's not an
issue in most cases.

It's really very important that you not have to flush in
the case of a kernel entry (interrupt, system call, etc.),
since it _will_ make a protection domain crossing
significantly more expensive.

Also, note that the 4M pages are in a separate 8 entry
conflict domain, and aren't in the same 16 entry data or
16 entry instruction TLB's, on every processor where they
are supported, so the kernel is not competing with user
space code anyway.

NB: 4M pages only make sense in certain specific limited
situations... using up 4M chunks of KVA space is generally a
bad idea, unless the objects you are using them for are
really 4M or larger in size.  This is particularly true on
4G machines, where you really don't have any sparseness to
burn on unused pages, and can't afford to use the remainder
space without the same mapping you used for the rest of it
(e.g. for libc.so, a copy-on-write page that is also
executable, unless you split the code and data across the
page boundary).
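
To make the flush behaviour discussed above concrete, here is a
minimal sketch in C.  It is a toy user-space model, not kernel
code: it only illustrates that a CR3 reload leaves global entries
alone while PGE is on, and that the "bounce PGE off, hit CR3,
bounce it back on" sequence removes them.  Everything except the
CR4.PGE bit position is an assumption made up for illustration:
the struct names, the helper names (model_init, write_cr4,
tlb_fill, load_cr3, flush_all_including_global, count_valid) and
the 8-slot TLB size.  The model follows the Intel text quoted
above and ignores INVLPG, replacement, and any flushing a CR4
write may do on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CR4_PGE   0x00000080u  /* CR4 bit 7: page global enable */
#define TLB_SLOTS 8            /* hypothetical size, illustration only */

/* One cached translation in the toy TLB (hypothetical layout). */
struct tlb_entry {
    bool     valid;
    bool     global;  /* filled from a PTE with G set while PGE was on */
    uint32_t vpage;   /* virtual page number */
};

/* Hypothetical CPU state: just the two control registers and a TLB. */
struct cpu_model {
    uint32_t cr3;
    uint32_t cr4;
    struct tlb_entry tlb[TLB_SLOTS];
};

static void model_init(struct cpu_model *cpu)
{
    cpu->cr3 = 0;
    cpu->cr4 = 0;
    for (size_t i = 0; i < TLB_SLOTS; i++)
        cpu->tlb[i] = (struct tlb_entry){ false, false, 0 };
}

/* Simplification: a CR4 write only stores the value in this model. */
static void write_cr4(struct cpu_model *cpu, uint32_t value)
{
    cpu->cr4 = value;
}

/* Cache a translation; the PTE's G bit only counts while PGE is set. */
static void tlb_fill(struct cpu_model *cpu, size_t slot,
                     uint32_t vpage, bool pte_g)
{
    assert(slot < TLB_SLOTS);
    cpu->tlb[slot].valid  = true;
    cpu->tlb[slot].global = pte_g && (cpu->cr4 & CR4_PGE) != 0;
    cpu->tlb[slot].vpage  = vpage;
}

/* A CR3 load drops every entry except global ones while PGE is on. */
static void load_cr3(struct cpu_model *cpu, uint32_t value)
{
    bool pge = (cpu->cr4 & CR4_PGE) != 0;

    cpu->cr3 = value;
    for (size_t i = 0; i < TLB_SLOTS; i++)
        if (!(pge && cpu->tlb[i].global))
            cpu->tlb[i].valid = false;
}

/* Bounce PGE off, hit CR3, bounce PGE back to its previous state. */
static void flush_all_including_global(struct cpu_model *cpu)
{
    uint32_t saved = cpu->cr4;

    write_cr4(cpu, saved & ~CR4_PGE);
    load_cr3(cpu, cpu->cr3);
    write_cr4(cpu, saved);
}

/* Number of translations still cached. */
static size_t count_valid(const struct cpu_model *cpu)
{
    size_t n = 0;

    for (size_t i = 0; i < TLB_SLOTS; i++)
        if (cpu->tlb[i].valid)
            n++;
    return n;
}
```

In this model, filling one global and one non-global entry with
PGE on and then calling load_cr3() leaves only the global entry
valid; flush_all_including_global() then removes it as well and
leaves CR4.PGE as it found it.
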

-- Terry