Date:      Sat, 21 Sep 2002 07:01:52 +0200
From:      Christian Zander <zander@minion.de>
To:        "M. Warner Losh" <imp@bsdimp.com>
Cc:        dominic_marks@btinternet.com, jamie@jamiesdomain.org.uk, freebsd-hackers@FreeBSD.ORG
Subject:   Problem with Mapping System Memory to User Space (Re: Kernel - Modules and Compiled in)
Message-ID:  <20020921070152.T4788@chronos>
In-Reply-To: <20020921065009.S4788@chronos>
References:  <001001c25d36$a3672be0$83bf83d5@BONG> <20020915203835.GA3497@gallium> <20020921.004207.103236464.imp@bsdimp.com> <20020921065009.S4788@chronos>

On Sat, Sep 21, 2002 at 06:50:09AM +0200, Christian Zander wrote:
> 
> Maybe it helps to get an idea of what memory allocation sizes we
> are talking about for the NVIDIA driver. For every single OpenGL
> client in the system memory case (no AGP), the resource manager
> has to allocate ~1MB (in multiple chunks; 258, 1, 8 pages).
> 
> > Actually, this issues get gross in a hurry, which is why no one
> > has done it. :-(
> > 
> 
> There is a similar interface on Linux, the bigphys patch; it is
> really only useful to set aside larger chunks of contiguous
> physical memory for special-case drivers rather than for daily
> life. Allocating dynamically from a static block of memory set
> aside at boot time would quickly grow into a major pain for this
> specific driver due to the numerous smaller allocations. This
> works well enough for AGP memory since all AGP allocations are
> much larger (258, 2304 pages) and sized in multiples of 1MB.
> 
> Even in the AGP memory case, several pages of DMA memory need to
> be allocated from general system memory.
> 

I'm sorry, this is in reply to another issue I had meant to bring
up on freebsd-hackers, one that is related to memory allocations
for drivers designed similarly to the NVIDIA graphics driver.


For architectural reasons, the NVIDIA driver relies on the ability
to allocate DMA memory in the kernel and to map it into user-space.
While the allocation is no problem, the FreeBSD mmap architecture
imposes limitations on the driver, since its implementation of the
mmap system call is called for individual pages rather than for a
range of pages (as in the Linux implementation).

In order for the driver to be able to recognize any incoming offset
correctly, the mmap() offset passed to the kernel from user-space
must be the base of a range of addresses that is unique across the
system.

With AGP memory, this requirement is easily satisfied (*) since all
(possibly non-contiguous) allocations have a contiguous alias in the
AGP aperture. It is thus possible to provide the AGP base address of
an allocation as the mmap offset.

With general system memory, this is problematic, unless the driver
allocates the memory it requires as a contiguous block of physical
memory; this is feasible for very small memory allocations only.

In order to support large DMA memory buffers, the driver must thus
associate a contiguous range of addresses with the non-contiguous
pages it allocates from system memory. This alias won't be physical
(unlike AGP) and it won't permanently correspond to the pages.

Assuming the driver went ahead and allocated the memory it requires
from kernel virtual memory, this allocation's base address could be
used as an mmap offset; the driver would be able to recognize all of
the pages and retrieve their physical addresses correctly.  This is
the approach taken by the NVIDIA driver at this point.
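
To illustrate the lookup described above, here is a minimal user-space
sketch (not the actual driver code). It assumes a hypothetical
bookkeeping record, struct dma_alloc, that remembers the kernel virtual
base of an allocation and the physical address of each of its pages;
the names offset_to_phys, MODEL_PAGE_SHIFT and MODEL_PAGE_SIZE are made
up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  ((uintptr_t)1 << MODEL_PAGE_SHIFT)

/* Hypothetical record for one DMA allocation (illustration only). */
struct dma_alloc {
    uintptr_t       base;    /* kernel virtual base, used as mmap offset */
    size_t          npages;  /* number of pages in the allocation */
    const uint64_t *phys;    /* physical address of each page */
};

/*
 * Translate an mmap offset into a physical address. Returns 0 if the
 * offset does not fall into any known allocation.
 */
static uint64_t
offset_to_phys(const struct dma_alloc *allocs, size_t nallocs,
    uintptr_t offset)
{
    size_t i;

    for (i = 0; i < nallocs; i++) {
        const struct dma_alloc *a = &allocs[i];

        if (offset >= a->base &&
            offset - a->base < a->npages * MODEL_PAGE_SIZE) {
            uintptr_t delta = offset - a->base;

            /* Pick the page, then add the offset within that page. */
            return a->phys[delta >> MODEL_PAGE_SHIFT] +
                (delta & (MODEL_PAGE_SIZE - 1));
        }
    }
    return 0;
}
```

Since the base addresses are unique kernel virtual addresses, no two
live allocations overlap and the first match is the only match.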

The problem with this lies with the assumption that any given mmap
offset will always correspond to the same physical address; having
retrieved the address for a given offset, dev_pager_getpages will
install a fake page and thus a cached offset-address mapping. If a
process allocates a piece of DMA memory (ioctl), it is returned an
offset corresponding to the allocation (its kernel virtual address)
and will use that as an offset to mmap. The first time around, this
will work as expected. Assuming that the driver might then free and
re-allocate the memory, however, it may happen that it will attempt
to map a set of different physical pages with the same offset.

Based on the fake page installed by dev_pager_getpages, vm_fault is
led to believe that the page in question is resident and returns the
cached, now outdated (incorrect) physical address for the offset.
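
The effect can be reproduced with a small user-space model. This is
only a sketch of the behaviour described above, not kernel code:
struct model_object stands in for the device object holding the fake
pages, struct model_driver for the driver's idea of which physical
page currently backs an offset, and model_fault for the fault path.
All of these names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_SLOTS 8

/* What the driver currently has behind one mmap offset. */
struct model_driver {
    uintptr_t offset;   /* mmap offset (kernel virtual address) */
    uint64_t  phys;     /* physical page currently backing it */
};

/* Stand-in for the fake pages cached on the device object. */
struct model_object {
    uintptr_t offset[MODEL_SLOTS];
    uint64_t  phys[MODEL_SLOTS];
    size_t    used;
};

/*
 * Resolve a fault on the given offset. A cached entry wins; the
 * driver is only consulted when nothing is cached for the offset.
 */
static uint64_t
model_fault(struct model_object *obj, const struct model_driver *drv,
    uintptr_t offset)
{
    size_t i;

    for (i = 0; i < obj->used; i++)
        if (obj->offset[i] == offset)
            return obj->phys[i];    /* "resident": driver not asked */

    if (offset != drv->offset)
        return 0;                   /* driver rejects unknown offset */

    if (obj->used < MODEL_SLOTS) {  /* install the fake page */
        obj->offset[obj->used] = offset;
        obj->phys[obj->used] = drv->phys;
        obj->used++;
    }
    return drv->phys;
}
```

If drv.phys changes after the first fault (the driver freed and
re-allocated the memory and happened to get the same virtual address),
a second model_fault on the same offset still returns the old value.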


The proposed workaround aims to be effective and non-intrusive; the
idea is to extend the msync system call to support invalidation of
cached pages for objects of type OBJT_DEVICE. This solution appears
to work well, but I'm aware that there might be better solutions for
the problem, or concerns with this proposed solution.
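
From user space, this could be used roughly as follows. This is a
sketch under assumptions: it presumes a kernel with the change below
applied, and that the client invalidates the range while it is still
mapped, before unmapping it and letting the driver free the memory.
The helper name release_dma_mapping is hypothetical; msync, munmap and
MS_INVALIDATE are the standard interfaces.

```c
#include <stddef.h>
#include <sys/mman.h>

/*
 * Drop any cached pages behind a device mapping, then unmap it.
 * Returns 0 on success, -1 (with errno set) if either step fails.
 * ASSUMPTION: MS_INVALIDATE only reaches OBJT_DEVICE objects on a
 * kernel carrying the proposed msync change.
 */
static int
release_dma_mapping(void *addr, size_t len)
{
    /* The range must still be mapped for msync to find the object. */
    if (msync(addr, len, MS_INVALIDATE) == -1)
        return -1;
    return munmap(addr, len);
}
```

A later mmap with the same offset would then fault in fresh pages and
ask the driver again for the physical addresses.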


(*) This is true for AGP core logics currently used in i386 systems,
but this may change with future implementations of AGP 3.0. It is not
true on PPC or ia64; AGP chipsets such as the 460GX (Itanium) do not
translate CPU accesses to AGP memory through the aperture.


diff -ru nvidia/sys/4.5/vm/vm_map.c sys_/vm/vm_map.c
--- nvidia/sys/4.5/vm/vm_map.c  Fri Mar  8 09:22:20 2002
+++ sys_/vm/vm_map.c    Wed Jun 12 18:45:45 2002
@@ -1775,14 +1775,17 @@
                OFF_TO_IDX(offset),
                OFF_TO_IDX(offset + size + PAGE_MASK),
                flags);
-           if (invalidate) {
-               /*vm_object_pip_wait(object, "objmcl");*/
-               vm_object_page_remove(object,
-                   OFF_TO_IDX(offset),
-                   OFF_TO_IDX(offset + size + PAGE_MASK),
-                   FALSE);
-           }
            VOP_UNLOCK(object->handle, 0, curproc);
+           vm_object_deallocate(object);
+       }
+       if (object && invalidate &&
+           ((object->type == OBJT_VNODE) ||
+            (object->type == OBJT_DEVICE))) {
+           vm_object_reference(object);
+           vm_object_page_remove(object,
+               OFF_TO_IDX(offset),
+               OFF_TO_IDX(offset + size + PAGE_MASK),
+               FALSE);
            vm_object_deallocate(object);
        }
        start += size;


-- 
christian zander
zander@minion.de
