Date: Tue, 5 Feb 2013 09:12:41 -0800
From: Neel Natu <neelnatu@gmail.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: alc@freebsd.org, davide@freebsd.org, hackers@freebsd.org, avg@freebsd.org, rank1seeker@gmail.com
Subject: Re: dynamically calculating NKPT [was: Re: huge ktr buffer]
Message-ID: <CAFgRE9GMeY4dVAzqUsHz2emo82dVODBDw2xYJMcPmxxTm6Rx=g@mail.gmail.com>
In-Reply-To: <20130205151413.GL2522@kib.kiev.ua>
References: <CAFgRE9F4JMutV9jJ_m7_9va67xiX4YXMT%2BRm6rUoDPMPymsg4w@mail.gmail.com> <20130205151413.GL2522@kib.kiev.ua>
Hi Konstantin,

On Tue, Feb 5, 2013 at 7:14 AM, Konstantin Belousov <kostikbel@gmail.com> wrote:
> On Mon, Feb 04, 2013 at 03:05:15PM -0800, Neel Natu wrote:
>> Hi,
>>
>> I have a patch to dynamically calculate NKPT for amd64 kernels. This
>> should fix the various issues that people pointed out in the email
>> thread.
>>
>> Please review and let me know if there are any objections to committing this.
>>
>> Also, thanks to Alan (alc@) for reviewing and providing feedback on
>> the initial version of the patch.
>>
>> Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt):
>>
>> Index: sys/amd64/include/pmap.h
>> ===================================================================
>> --- sys/amd64/include/pmap.h	(revision 246277)
>> +++ sys/amd64/include/pmap.h	(working copy)
>> @@ -113,13 +113,7 @@
>>  	    ((unsigned long)(l2) << PDRSHIFT) | \
>>  	    ((unsigned long)(l1) << PAGE_SHIFT))
>>
>> -/* Initial number of kernel page tables. */
>> -#ifndef NKPT
>> -#define	NKPT		32
>> -#endif
>> -
>>  #define	NKPML4E		1		/* number of kernel PML4 slots */
>> -#define	NKPDPE		howmany(NKPT, NPDEPG)/* number of kernel PDP slots */
>>
>>  #define	NUPML4E		(NPML4EPG/2)	/* number of userland PML4 pages */
>>  #define	NUPDPE		(NUPML4E*NPDPEPG)/* number of userland PDP pages */
>> @@ -181,6 +175,7 @@
>>  #define	PML4map		((pd_entry_t *)(addr_PML4map))
>>  #define	PML4pml4e	((pd_entry_t *)(addr_PML4pml4e))
>>
>> +extern int nkpt;		/* Initial number of kernel page tables */
>>  extern u_int64_t KPDPphys;	/* physical address of kernel level 3 */
>>  extern u_int64_t KPML4phys;	/* physical address of kernel level 4 */
>>
>> Index: sys/amd64/amd64/minidump_machdep.c
>> ===================================================================
>> --- sys/amd64/amd64/minidump_machdep.c	(revision 246277)
>> +++ sys/amd64/amd64/minidump_machdep.c	(working copy)
>> @@ -232,7 +232,7 @@
>>  	/* Walk page table pages, set bits in vm_page_dump */
>>  	pmapsize = 0;
>>  	pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys);
>> -	for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR,
>> +	for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR,
>>  	    kernel_vm_end); ) {
>>  		/*
>>  		 * We always write a page, even if it is zero. Each
>> @@ -364,7 +364,7 @@
>>  	/* Dump kernel page directory pages */
>>  	bzero(fakepd, sizeof(fakepd));
>>  	pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys);
>> -	for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR,
>> +	for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR,
>>  	    kernel_vm_end); va += NBPDP) {
>>  		i = (va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1);
>>
>> Index: sys/amd64/amd64/pmap.c
>> ===================================================================
>> --- sys/amd64/amd64/pmap.c	(revision 246277)
>> +++ sys/amd64/amd64/pmap.c	(working copy)
>> @@ -202,6 +202,10 @@
>>  vm_offset_t virtual_avail;	/* VA of first avail page (after kernel bss) */
>>  vm_offset_t virtual_end;	/* VA of last avail page (end of kernel AS) */
>>
>> +int nkpt;
>> +SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0,
>> +    "Number of kernel page table pages allocated on bootup");
>> +
>>  static int ndmpdp;
>>  static vm_paddr_t dmaplimit;
>>  vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS;
>> @@ -495,17 +499,42 @@
>>
>>  CTASSERT(powerof2(NDMPML4E));
>>
>> +/* number of kernel PDP slots */
>> +#define	NKPDPE(ptpgs)		howmany((ptpgs), NPDEPG)
>> +
>>  static void
>> +nkpt_init(vm_paddr_t addr)
>> +{
>> +	int pt_pages;
>> +
>> +#ifdef NKPT
>> +	pt_pages = NKPT;
>> +#else
>> +	pt_pages = howmany(addr, 1 << PDRSHIFT);
>> +	pt_pages += NKPDPE(pt_pages);
>> +
>> +	/*
>> +	 * Add some slop beyond the bare minimum required for bootstrapping
>> +	 * the kernel.
>> +	 *
>> +	 * This is quite important when allocating KVA for kernel modules.
>> +	 * The modules are required to be linked in the negative 2GB of
>> +	 * the address space.  If we run out of KVA in this region then
>> +	 * pmap_growkernel() will need to allocate page table pages to map
>> +	 * the entire 512GB of KVA space which is an unnecessary tax on
>> +	 * physical memory.
>> +	 */
>> +	pt_pages += 4;		/* 8MB additional slop for kernel modules */
>
> 8MB might be too low. I just checked one of my machines with fully
> modularized kernel, it takes slightly more than 6 MB to load 50 modules.
> I think that 16MB would be safer, but it probably needs to be scaled
> down based on the available phys memory. amd64 kernel could be booted
> on 128MB machine still.

Sounds fine. I can bump it up to 8 pages.

Also, wrt your comment about scaling this number based on available
memory, I wonder if it makes sense to optimize for 16KB of additional
space. I would much rather work with you and Alan to fix
pmap_growkernel() so we don't need to care about this slack in the
first place :-)

best
Neel