Date:      Wed, 29 Aug 2012 00:25:14 -0500
From:      Alan Cox <alc@rice.edu>
To:        "Jayachandran C." <c.jayachandran@gmail.com>
Cc:        mips@freebsd.org
Subject:   Re: mips pmap patch
Message-ID:  <503DA7BA.3030102@rice.edu>
In-Reply-To: <CA+7sy7AbGvwu8UMhtOO-vX1b2gdhQQpb3wfmOuci3UNEZ8Z7EQ@mail.gmail.com>
References:  <50228F5C.1000408@rice.edu> <CA+7sy7DxqhGhJt+wE3WW2-j4SxnPweULjYS5GQ=NgMYSrwJHtw@mail.gmail.com> <50269AD4.9050804@rice.edu> <CA+7sy7AZ-s2H6COfvz60N=kxw+WUiUC9diVfWg9aOzWSZKGWRQ@mail.gmail.com> <5029635A.4050209@rice.edu> <CA+7sy7Cnsy7Ag1iG=_Kj04gEXeYp7kZnpACQpD8THvkp0VKdcA@mail.gmail.com> <502D2271.6080105@rice.edu> <CA+7sy7CK=EXu88XKYYXDV1uf3U7eebq3e6rfwgHRhQyFTMv7dQ@mail.gmail.com> <50325DC3.3090201@rice.edu> <CA+7sy7AbGvwu8UMhtOO-vX1b2gdhQQpb3wfmOuci3UNEZ8Z7EQ@mail.gmail.com>

On 08/27/2012 10:24, Jayachandran C. wrote:
> On Mon, Aug 20, 2012 at 9:24 PM, Alan Cox <alc@rice.edu> wrote:
>> On 08/20/2012 05:36, Jayachandran C. wrote:
>>> On Thu, Aug 16, 2012 at 10:10 PM, Alan Cox <alc@rice.edu> wrote:
>>>> On 08/15/2012 17:21, Jayachandran C. wrote:
>>>>> On Tue, Aug 14, 2012 at 1:58 AM, Alan Cox <alc@rice.edu> wrote:
>>>>>> On 08/13/2012 11:37, Jayachandran C. wrote:
>>> [...]
>>>>>>> I could not test for more than an hour on 32-bit due to another
>>>>>>> problem (freelist 1 containing direct-mapped pages runs out of pages
>>>>>>> after about an hour of compile test).  This issue has been there for a
>>>>>>> long time, I am planning to look at it when I get a chance.
>>>>>>>
>>>>>> What exactly happens?  panic?  deadlock?
>>>>> The build slows down to a crawl and hangs when it runs out of pages in
>>>>> the freelist.
>>>>
>>>> I'd like to see the output of "sysctl vm.phys_segs" and "sysctl
>>>> vm.phys_free" from this machine.  Even better would be running "sysctl
>>>> vm.phys_free" every 60 seconds during the buildworld.  Finally, I'd like
>>>> to
>>>> know whether or not either "ps" or "top" shows any threads blocked on the
>>>> "swwrt" wait channel once things slow to a crawl.
>>> I spent some time looking at this issue.  I use a very large kernel
>>> image with a built-in root filesystem, and this takes about 120 MB out
>>> of the direct mapped area.  The remaining pages (~64 MB) are not enough
>>> for the build process.  If I increase free memory in this area, either
>>> by reducing the rootfs size or by adding a few more memory segments to
>>> this area, the build goes through fine.
>>
>> I'm still curious to see what "sysctl vm.phys_segs" says.  It sounds like
>> roughly half of the direct map region is going to DRAM and half to
>> memory-mapped I/O devices.  Is that correct?
> Yes, about half of the direct mapped region in 32-bit is taken by
> flash, PCIe and other memory-mapped I/O.  I also made the problem even
> worse by not reclaiming some bootloader areas in the direct mapped
> region, which reduced the available direct mapped memory.
>
> Here's the output of sysctls:
>
> root@testboard:/root # sysctl vm.phys_segs
> vm.phys_segs:
> SEGMENT 0:
>
> start:     0x887e000
> end:       0xc000000
> domain:    0
> free list: 0x887a407c
>
> SEGMENT 1:
>
> start:     0x1d000000
> end:       0x1fc00000
> domain:    0
> free list: 0x887a407c
>
> SEGMENT 2:
>
> start:     0x20000000
> end:       0xbc0b3000
> domain:    0
> free list: 0x887a3f38
>
> SEGMENT 3:
>
> start:     0xe0000000
> end:       0xfffff000
> domain:    0
> free list: 0x887a3f38
>
> root@testboard:/root # sysctl vm.phys_free
> vm.phys_free:
> FREE LIST 0:
>
>    ORDER (SIZE)  |  NUMBER
>                  |  POOL 0  |  POOL 1  |  POOL 2
> --            -- --      -- --      -- --      --
>     8 (  1024K)  |    2877  |       0  |       0
>     7 (   512K)  |       0  |       1  |       0
>     6 (   256K)  |       1  |       0  |       0
>     5 (   128K)  |       0  |       1  |       0
>     4 (    64K)  |       0  |       1  |       0
>     3 (    32K)  |       0  |       1  |       0
>     2 (    16K)  |       0  |       1  |       0
>     1 (     8K)  |       0  |       0  |       0
>     0 (     4K)  |       0  |       0  |       0
>
> FREE LIST 1:
>
>    ORDER (SIZE)  |  NUMBER
>                  |  POOL 0  |  POOL 1  |  POOL 2
> --            -- --      -- --      -- --      --
>     8 (  1024K)  |      66  |       0  |       0
>     7 (   512K)  |       1  |       1  |       0
>     6 (   256K)  |       0  |       0  |       0
>     5 (   128K)  |       0  |       0  |       0
>     4 (    64K)  |       0  |       1  |       0
>     3 (    32K)  |       0  |       0  |       0
>     2 (    16K)  |       0  |       0  |       0
>     1 (     8K)  |       1  |       1  |       0
>     0 (     4K)  |       0  |       1  |       0
>
>>> I also found that when the build slows down, most of the pages taken
>>> from freelist 1 are allocated by the UMA subsystem, which seems to
>>> keep quite a few pages allocated.
>>
>> At worst, it may be necessary to disable the use of uma_small_alloc() for
>> this machine configuration.  At best, uma_small_alloc() could be revised
>> to opportunistically use pages in the direct map region, but have the
>> ability to fall back to pages that have to be mapped.
> I think this is probably not a bug, but a configuration problem (we
> cannot have such a huge built-in root filesystem when the direct
> mapped area is this small).  Anyway, I have checked in code to
> recover more areas from the bootloader, and this mostly solves the
> issue for me.  The above output was taken before the check-in.
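
For what it's worth, adding up the two segments that feed freelist 1 in
your vm.phys_segs output quantifies the squeeze.  A quick userland
calculation (ordinary C, not kernel code; the start/end values are
copied from your output above):

	#include <stdio.h>

	int
	main(void)
	{
		/* start/end of segments 0 and 1, the ones on freelist 1. */
		unsigned long seg[2][2] = {
			{ 0x887e000UL,  0xc000000UL },
			{ 0x1d000000UL, 0x1fc00000UL },
		};
		unsigned long total = 0;
		int i;

		for (i = 0; i < 2; i++)
			total += seg[i][1] - seg[i][0];
		/* Prints "99 MB": all the direct-mapped RAM left once
		   the kernel and rootfs image have been carved out. */
		printf("%lu MB\n", total >> 20);
		return (0);
	}

That total is consistent with the 66 free 1024K blocks in your FREE
LIST 1 snapshot, and it is not much headroom for UMA.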

I'm afraid that exhaustion of freelist 1 is still highly likely to occur 
under some workloads that require the allocation of a lot of small 
objects in the kernel's heap.
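
If it does recur, the fallback revision that I suggested above might
look roughly like the untested sketch below.  vm_page_alloc_freelist()
and VM_FREELIST_DIRECT are the existing interfaces; the kva path
(kmem_alloc_nofault() plus pmap_qenter()) is just one possible way to
map the fallback page, and the bookkeeping that uma_small_free() would
need to distinguish the two cases, the VM_WAIT retry loop, and M_ZERO
handling are all omitted:

	void *
	uma_small_alloc(uma_zone_t zone, int bytes, u_int8_t *flags, int wait)
	{
		vm_page_t m;
		vm_offset_t va;
		int pflags;

		*flags = UMA_SLAB_PRIV;
		pflags = ((wait & M_NOWAIT) ? VM_ALLOC_INTERRUPT :
		    VM_ALLOC_SYSTEM) | VM_ALLOC_WIRED;

		/* First choice: a page that KSEG0 can reach directly. */
		m = vm_page_alloc_freelist(VM_FREELIST_DIRECT, pflags);
		if (m != NULL)
			return ((void *)MIPS_PHYS_TO_KSEG0(VM_PAGE_TO_PHYS(m)));

		/* Fallback: any page, mapped through a fresh kva page. */
		m = vm_page_alloc(NULL, 0, pflags | VM_ALLOC_NOOBJ);
		if (m == NULL)
			return (NULL);
		va = kmem_alloc_nofault(kernel_map, PAGE_SIZE);
		if (va == 0) {
			/* Unwire-and-free error unwinding elided. */
			return (NULL);
		}
		pmap_qenter(va, &m, 1);
		return ((void *)va);
	}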

Alan



