Date:      Wed, 13 Jun 2012 13:43:01 -0500
From:      Nathan Whitehorn <nwhitehorn@freebsd.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Ian Lepore <freebsd@damnhippie.dyndns.org>, freebsd-hackers@freebsd.org, Wojciech Puchar <wojtek@wojtek.tensor.gdynia.pl>
Subject:   Re: wired memory - again!
Message-ID:  <4FD8DF35.80105@freebsd.org>
In-Reply-To: <20120613182232.GT2337@deviant.kiev.zoral.com.ua>
References:  <alpine.BSF.2.00.1206090920030.84632@wojtek.tensor.gdynia.pl> <1339259223.36051.328.camel@revolution.hippie.lan> <20120609165217.GO85127@deviant.kiev.zoral.com.ua> <alpine.BSF.2.00.1206092244550.9248@wojtek.tensor.gdynia.pl> <1339512694.36051.362.camel@revolution.hippie.lan> <20120612204508.GP2337@deviant.kiev.zoral.com.ua> <1339593249.73426.5.camel@revolution.hippie.lan> <20120613182232.GT2337@deviant.kiev.zoral.com.ua>

On 06/13/12 13:22, Konstantin Belousov wrote:
> On Wed, Jun 13, 2012 at 07:14:09AM -0600, Ian Lepore wrote:
>> On Tue, 2012-06-12 at 23:45 +0300, Konstantin Belousov wrote:
>>> On Tue, Jun 12, 2012 at 08:51:34AM -0600, Ian Lepore wrote:
>>>> On Sat, 2012-06-09 at 22:45 +0200, Wojciech Puchar wrote:
>>>>>> First, all memory allocated by UMA and consequently malloc(9) is
>>>>>> wired. In other words, almost all memory used by kernel is accounted
>>>>>> as wired.
>>>>>>
>>>>> Yes, I understand this. Still, I have found no way to find out what
>>>>> allocated that much.
>>>>>
>>>>>
>>>>>> Second, the buffer cache wires the pages which are inserted into VMIO
>>>>>> buffers. So your observation is basically right, cached buffers means
>>>>> What exactly are "VMIO" buffers? I understand that a page must be
>>>>> wired WHEN doing I/O.
>>>>> But I have too much wired memory even when doing no I/O at all.
>>>> I agree, this is The Big Question for me.  Why does the system keep
>>>> wired writable mappings of the buffers in kva after the IO operations
>>>> are completed?
>>> Read about buffer cache, e.g. in the Design and Implementation of
>>> the FreeBSD OS book.
>>>
>>>> If it did not do so, it would fix the instruction-cache-disabled bug
>>>> that kills performance on VIVT cache architectures (arm and mips) and it
>>>> would reduce the amount of wired memory (that apparently doesn't need to
>>>> be wired, unless I've missed the implications of a previous reply in
>>>> this thread).
>>> I have no idea what bug you are talking about. If my guess is
>>> right, and it refers specifically to the inability of some processors
>>> to correctly handle several mappings of the same physical page at
>>> different virtual addresses, because the cache is tagged by virtual
>>> rather than physical address, then this is a hardware bug, not a
>>> software one.
>>>
>> This bug:
>>
>> http://lists.freebsd.org/pipermail/freebsd-arm/2012-January/003288.html
>>
>> The bug isn't the VIVT cache hardware, it's the fact that the way we
>> handle the requirements of the hardware has the side effect of leaving
>> the instruction cache bit disabled on executable pages because the
>> kernel keeps writable mappings of the pages even after the IO is done.
> Can you point me at the exact code in the arm pmap?
>
> I remember an issue on PPC that Nathan discussed, which sounds somewhat
> similar (but I still do not understand what exactly happens on ARM). On
> PowerPC, the icache needs to be explicitly flushed if writes happen to
> an executable mapping. See r233949 for the current solution. There were
> some followups, but I believe the core change is still valid.
>
>

The general algorithm I used on PPC (which is PIPT, but still has an 
incoherent icache) is based on the following guarantees/observations, 
which seem to be sufficient for FreeBSD to run correctly:
1. Executable kernel memory never contains new data except after a
module load, so kernel icache flushes are only needed in elf_machdep.c,
after a module load.
2. Executable pages are never mapped into userland until the kernel is 
finished writing to them. Thus, userland icache consistency is 
maintained with respect to all kernel operations (executable loading, 
swap, etc.) if icaches are made coherent once at the time that the page 
is entered into its first non-kernel pmap. If your chip has an NX 
feature, this only need be done for the first executable user mapping -- 
otherwise it needs to be done for the first overall mapping to prevent 
information leakage via the icache. I guess for VIVT, "first" may mean 
"every" here.
3. I-cache coherency with respect to userland modifications is the 
responsibility of the user program. All self-modifying code knows, or 
should know, what to do here. Otherwise the only time this comes up is 
in RTLD, which is easily modified to do an icache flush after load.
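
As a rough illustration of point 3, a userland program that writes
instructions into memory can make them visible to instruction fetch
before jumping to them. This is only a sketch, assuming GCC or Clang,
whose __builtin___clear_cache builtin does whatever the target needs.
The helper name publish_code is hypothetical and is not taken from RTLD
or any FreeBSD source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy freshly generated instructions into dst and
 * then ask the compiler runtime to make the instruction cache coherent
 * with those stores. Assumes GCC or Clang.
 */
void publish_code(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    /* Ordinary data stores; these go through the data cache. */
    memcpy(dst, src, len);

    /* Synchronize the icache for [dst, dst + len) as the target requires. */
    __builtin___clear_cache((char *)dst, (char *)dst + len);
}
```

A loader or JIT would call this once after the last write to a code
region and before the first jump into it.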

The general point is that even if the kernel keeps its writable mapping
after the IO is complete, it never actually writes through that mapping
once the page has been mapped into its first user process. It is
therefore safe to keep the page cacheable at all times and do a single
invalidation when it is mapped into the user pmap.
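
The decision in point 2 can be modelled in a few lines of C. This is a
toy model under my own assumptions, not the PPC pmap code: struct page,
user_pmap_enter and icache_sync are hypothetical names, and the "sync"
only bumps a counter so the policy can be checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page bookkeeping; not FreeBSD's struct vm_page. */
struct page {
    bool has_user_mapping;  /* entered into any non-kernel pmap yet? */
    bool has_user_exec;     /* entered into a user pmap as executable? */
    unsigned icache_syncs;  /* times the icache was made coherent */
};

/* Stand-in for the machine-dependent icache synchronization. */
void icache_sync(struct page *pg)
{
    pg->icache_syncs++;
}

/*
 * Enter pg into a user pmap. With NX support, sync only on the first
 * executable user mapping; without it, sync on the first user mapping
 * of any kind, since every mapped page is then effectively executable.
 */
void user_pmap_enter(struct page *pg, bool executable, bool have_nx)
{
    assert(pg != NULL);

    if (have_nx) {
        if (executable && !pg->has_user_exec)
            icache_sync(pg);
    } else if (!pg->has_user_mapping) {
        icache_sync(pg);
    }

    pg->has_user_mapping = true;
    if (executable)
        pg->has_user_exec = true;
}
```

For a VIVT cache, where "first" may have to mean "every", the two
conditions would presumably be dropped and the sync done on each entry.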
-Nathan