Date: Mon, 07 Jun 2010 13:49:36 -0500
From: Alan Cox <alc@cs.rice.edu>
To: "C. Jayachandran" <c.jayachandran@gmail.com>
Cc: "Jayachandran C." <jchandra@freebsd.org>, Konstantin Belousov <kostikbel@gmail.com>,
    Alan Cox <alc@cs.rice.edu>, mips@freebsd.org
Subject: Re: svn commit: r208589 - head/sys/mips/mips
Message-ID: <4C0D3F40.2070101@cs.rice.edu>
In-Reply-To: <AANLkTikZxx_30H9geHvZYkYd0sE-wiuZljEd0PAi14ca@mail.gmail.com>
References: <201005271005.o4RA5eVu032269@svn.freebsd.org>
    <4C058145.3000707@cs.rice.edu>
    <AANLkTil955Ek-a3tek4Ony9NqrK5l2j7lNA4baRVPBzb@mail.gmail.com>
    <4C05F9BC.40409@cs.rice.edu>
    <AANLkTikdnvXsTwm8onl3MZPQmVfV-2GovB9--KMNnMgC@mail.gmail.com>
    <4C0731A5.7090301@cs.rice.edu>
    <AANLkTimIa3jmBPMhWIOcY6DenGpZ2ZYmqwDTWspVx0-u@mail.gmail.com>
    <AANLkTil2gE1niUWCHnsTlQvibhxBh7QYwD0TTWo0rj5c@mail.gmail.com>
    <AANLkTinA2D5iTDGPbflHVzLyAZW-ZewjJkUWWL8FVskr@mail.gmail.com>
    <4C07E07B.9060802@cs.rice.edu>
    <AANLkTimjyPc_AXKP1yaJaF1BN7CAGBeNikVzcp9OCb4P@mail.gmail.com>
    <4C09345F.9040300@cs.rice.edu>
    <AANLkTinmFOZY3OlaoKStxlNIRBt2G2I4ILkQ1P0CjozG@mail.gmail.com>
    <4C0D2BEA.6060103@cs.rice.edu>
    <AANLkTikZxx_30H9geHvZYkYd0sE-wiuZljEd0PAi14ca@mail.gmail.com>
C. Jayachandran wrote:
> On Mon, Jun 7, 2010 at 10:57 PM, Alan Cox <alc@cs.rice.edu> wrote:
>> C. Jayachandran wrote:
>>> On Fri, Jun 4, 2010 at 10:44 PM, Alan Cox <alc@cs.rice.edu> wrote:
>>>> C. Jayachandran wrote:
>>>>> On Thu, Jun 3, 2010 at 10:33 PM, Alan Cox <alc@cs.rice.edu> wrote:
>>>>
>>>> [snip]
>>>>
>>>>>> Add a static counter to pmap_ptpgzone_allocf().  When the counter
>>>>>> reaches some not-too-small prime number, don't call
>>>>>> vm_phys_alloc_contig(); instead, set "m" to NULL and reset the
>>>>>> counter to zero.  Setting "m" to NULL should then trigger the
>>>>>> vm_contig_grow_cache() call.  Make sense?
>>>>>
>>>>> Is this to move the vm_contig_grow_cache() out of
>>>>> pmap_ptpgzone_allocf() and into the function calling uma_zalloc()?
>>>>> I am wondering why the prime number is needed.
>>>>>
>>>>> Another implementation I had thought about was to have a kernel
>>>>> process maintain a pool of direct-mapped pages for MIPS.  It would
>>>>> be woken up if the pool went below a low-water mark - any comments
>>>>> on this?
>>>>
>>>> Here is how the page table page allocation should be done.  It's
>>>> not much harder than the vm_contig_grow_cache() change.
>>>>
>>>> 1. Look at vm/vm_phys.c, in particular, vm_phys_init().  Create a
>>>> new freelist, like VM_FREELIST_ISADMA, for the direct-mapped memory
>>>> on MIPS.  Perhaps call it VM_FREELIST_LOWMEM.  The controlling
>>>> macros should be defined in mips/include/vmparam.h.
>>>>
>>>> 2. Trivially refactor vm_phys_alloc_pages().  Specifically, take
>>>> the body of the outer for loop and make it a new function,
>>>> vm_phys_alloc_freelist(), that takes the variable "flind" as a
>>>> parameter.
>>>>
>>>> 3. Eliminate the UMA zone for page table pages.  In place of
>>>> vm_phys_alloc_contig(), call your new function with
>>>> VM_FREELIST_LOWMEM.  (The vm_contig_grow_cache() remains
>>>> unchanged.)  Go back to calling vm_page_free() to release page
>>>> table pages.
>>>>
>>>> For the record, the changes that you made to start using a zone for
>>>> page table page allocation introduced at least one possible race
>>>> condition between pmap_remove_pages() and pmap_remove_all().
>>>> Specifically, by dropping and reacquiring the page queues and pmap
>>>> locks when you free a page table page, you allow a
>>>> pmap_remove_all() call while pmap_remove_pages() is running to
>>>> leave the variable "npv" in pmap_remove_pages() pointing at a freed
>>>> pv entry.
>>>
>>> My first attempt is attached; it comes up multiuser but crashes if I
>>> stress it a bit (panic: Bad link elm 0xc0078f00 prev->next != elm).
>>> I will look at this tomorrow and see if I can find the cause.
>>
>> Where exactly is this occurring?
>>
>>> In the meantime, it may be worth looking at the patch to see if it
>>> is in line with the approach you had suggested.  I have tried to use
>>> VM_FREELIST_HIGHMEM, which is already there, instead of adding
>>> VM_FREELIST_LOWMEM.
>>
>> Basically, yes.  At first, I was taken aback by defining
>> VM_FREELIST_DEFAULT as other than zero, but it shouldn't break
>> anything.
>
> I can change this to add VM_FREELIST_LOWMEM.  This works okay so far,
> even though I haven't really checked the code to see whether it works
> when there is no memory in HIGHMEM.

I think that it will.  It will just waste a little time looking at
empty queues.
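Step 2 above amounts to roughly the sketch below.  The function name
vm_phys_alloc_freelist_pages() and its prototype are taken from the
draft quoted further down rather than from a committed interface; the
body of the old outer for loop moves into the new function unchanged,
and the old entry point becomes a thin wrapper over it:

vm_page_t
vm_phys_alloc_pages(int pool, int order)
{
        vm_page_t m;
        int flind;

        /*
         * Try each physical-memory freelist in order of preference;
         * the per-freelist search now lives in the new function.
         */
        for (flind = 0; flind < vm_nfreelists; flind++) {
                m = vm_phys_alloc_freelist_pages(flind, pool, order);
                if (m != NULL)
                        return (m);
        }
        return (NULL);
}

The MIPS pmap can then call vm_phys_alloc_freelist_pages() directly
with the index of the freelist covering KSEG0 (direct-mapped) memory,
defined alongside the other VM_FREELIST_* macros in
mips/include/vmparam.h.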
>> The one issue that I see with the patch is that you'll need to
>> release and reacquire the pmap and page queues locks around the call
>> to vm_contig_grow_cache().  However, if you release these locks, you
>> need to restart the pmap_allocpte(), i.e., perform the "goto retry;",
>> because while you slept another processor/thread might have already
>> allocated and installed the page table page that you are trying to
>> allocate into the page table.
>>
>> For what it's worth, this same race condition exists in the current
>> code in subversion using a UMA zone, but not in the code that
>> predates use of a UMA zone.
>
> I have been testing with an updated version of the patch - my -j128
> buildworld works on this.  The sys/vm side changes are simple enough,
> but the page table page allocation part is still not complete.  The
> version I have now is (with gmail whitespace damage):
>
> ---
> static vm_page_t
> pmap_alloc_pte_page(pmap_t pmap, unsigned int index, int wait,
>     vm_offset_t *vap)
> {
>         vm_paddr_t paddr;
>         vm_page_t m;
>         int tries, flags;
>
>         tries = 0;
> retry:
>         mtx_lock(&vm_page_queue_free_mtx);
>         m = vm_phys_alloc_freelist_pages(VM_FREELIST_DEFAULT,
>             VM_FREEPOOL_DEFAULT, 0);
>         if (m == NULL) {
>                 mtx_unlock(&vm_page_queue_free_mtx);
>                 if (tries < ((wait & M_NOWAIT) != 0 ? 1 : 3)) {
>                         vm_contig_grow_cache(tries, 0,
>                             MIPS_KSEG0_LARGEST_PHYS);
>                         tries++;
>                         goto retry;
>                 } else {
>                         printf("[%d] %s wait %x, fail!\n", tries,
>                             __func__, wait);
>                         return (NULL);
>                 }
>         }
>
>         m->pindex = index;
>         m->valid = VM_PAGE_BITS_ALL;
>         atomic_add_int(&cnt.v_wire_count, 1);
>         m->wire_count = 1;
>         if (m->flags & PG_ZERO)
>                 vm_page_zero_count--;
>         else
>                 pmap_zero_page(m);
>         /* Check PG_CACHED before the flags are overwritten below. */
>         if ((m->flags & PG_CACHED) != 0)
>                 printf("Not yet! handle cached page %p flags 0x%x\n",
>                     m, m->flags);
>         else
>                 cnt.v_free_count--;
>         flags = PG_ZERO | PG_UNMANAGED;
>         m->flags = flags;
>         m->oflags = 0;
>         m->act_count = 0;
>         mtx_unlock(&vm_page_queue_free_mtx);
>         return (m);
> }
> ----
>
> I tried to match the vm_page_alloc() code here.  I'm not sure if I
> have to handle the PG_CACHED case, and if it is okay to change the VM
> counters here.

You do.  In the end, I suspect that this function will be added to the
machine-independent layer.

> Also, I will change the code to return NULL all the way up and retry
> the allocation (similar to the old code, which called VM_WAIT and
> then returned NULL).
>
> Probably I should bring the whole code in line with the old code,
> replacing the old vm_page_alloc() call with the function above minus
> the retry, and the VM_WAIT with a new function which does the
> vm_contig_grow_cache().

You may find it easier if you move the call to vm_contig_grow_cache()
to pmap_allocpte().

Alan
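A minimal sketch of that last suggestion, assuming
pmap_alloc_pte_page() simply returns NULL on failure instead of looping
internally, and that pmap_allocpte() is entered with both the pmap lock
and the page queues lock held.  pmap_lookup_pt_page() is a hypothetical
stand-in for the existing "is the page table page already installed?"
check in pmap_allocpte(), and the index computation is purely
illustrative:

static vm_page_t
pmap_allocpte(pmap_t pmap, vm_offset_t va, int flags)
{
        unsigned ptepindex;
        vm_page_t m;
        int tries;

        ptepindex = va >> SEGSHIFT;     /* illustrative index only */
        tries = 0;
retry:
        /*
         * While we slept in vm_contig_grow_cache(), another thread may
         * have allocated and installed this page table page; reuse it.
         */
        if ((m = pmap_lookup_pt_page(pmap, ptepindex)) != NULL) {
                m->wire_count++;
                return (m);
        }
        m = pmap_alloc_pte_page(pmap, ptepindex, flags, NULL);
        if (m == NULL) {
                if ((flags & M_NOWAIT) != 0 || tries >= 3)
                        return (NULL);
                /*
                 * Drop both locks around the call that may sleep, then
                 * restart from the top after reacquiring them.
                 */
                PMAP_UNLOCK(pmap);
                vm_page_unlock_queues();
                vm_contig_grow_cache(tries, 0, MIPS_KSEG0_LARGEST_PHYS);
                vm_page_lock_queues();
                PMAP_LOCK(pmap);
                tries++;
                goto retry;
        }
        return (m);
}

With this shape, nothing sleeps while the free-page mutex, the pmap
lock, or the page queues lock is held, and the restart handles the case
where a competing thread installed the page while the locks were
dropped.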