Date: Sat, 21 Sep 2002 07:01:52 +0200
From: Christian Zander
To: "M. Warner Losh"
Cc: dominic_marks@btinternet.com, jamie@jamiesdomain.org.uk, freebsd-hackers@FreeBSD.ORG
Subject: Problem with Mapping System Memory to User Space (Re: Kernel - Modules and Compiled in)
Message-ID: <20020921070152.T4788@chronos>
In-Reply-To: <20020921065009.S4788@chronos>
References: <001001c25d36$a3672be0$83bf83d5@BONG> <20020915203835.GA3497@gallium> <20020921.004207.103236464.imp@bsdimp.com> <20020921065009.S4788@chronos>

On Sat, Sep 21, 2002 at 06:50:09AM +0200, Christian Zander wrote:
>
> Maybe it helps to get an idea of what memory allocation sizes we
> are talking about for the NVIDIA driver.
> For every single OpenGL client in the system memory case (no AGP),
> the resource manager has to allocate ~1MB (in multiple chunks;
> 258, 1, 8 pages).
>
> > Actually, this issues get gross in a hurry, which is why no one
> > has done it. :-(
> >
>
> There is a similar interface on Linux, the bigphys patch; it is
> really only useful to set aside larger chunks of contiguous
> physical memory for special-case drivers rather than for daily
> life. Allocating dynamically from a static block of memory set
> aside a boot time would quickly grow into a major pain for this
> specific driver due to the numerous smaller allocations. This
> works well enough for AGP memory since all AGP allocations are
> much larger (258, 2304 pages) and sized in multiples of 1MB.
>
> Even in the AGP memory case, several pages of DMA memory need to
> be allocated from general system memory.
>

I'm sorry, this is in reply to another issue I had meant to bring
up on freebsd-hackers, one that is related to memory allocations
for drivers designed similarly to the NVIDIA graphics driver.

For architectural reasons, the NVIDIA driver relies on the ability
to allocate DMA memory in the kernel and to map it into user-space.
While the allocation is no problem, the FreeBSD mmap architecture
imposes limitations on the driver, since the driver's mmap entry
point is called once per page rather than once for a range of pages
(as in the Linux implementation).

In order for the driver to be able to recognize any incoming offset
correctly, the mmap() offset passed to the kernel from user-space
must be the base of a range of addresses that is unique across the
system.

With AGP memory, this requirement is easily satisfied (*) since all
(possibly non-contiguous) allocations have a contiguous alias in
the AGP aperture. It is thus possible to provide the AGP base
address of an allocation as the mmap offset.

With general system memory, this is problematic unless the driver
allocates the memory it requires as a contiguous block of physical
memory; this is feasible for very small memory allocations only.
In order to support large DMA memory buffers, the driver must thus
associate a contiguous range of addresses with the non-contiguous
pages it allocates from system memory. This alias won't be physical
(unlike AGP) and it won't permanently correspond to the pages.

Assuming the driver went ahead and allocated the memory it requires
from kernel virtual memory, this allocation's base address could be
used as an mmap offset; the driver would be able to recognize all
of the pages and retrieve their physical addresses correctly. This
is the approach taken by the NVIDIA driver at this point.

The problem with this lies with the assumption that any given mmap
offset will always correspond to the same physical address; having
retrieved the address for a given offset, dev_pager_getpages will
install a fake page and thus a cached offset-address mapping.

If a process allocates a piece of DMA memory (ioctl), it is handed
an offset corresponding to the allocation (its kernel virtual
address) and will use that as the offset to mmap. The first time
around, this will work as expected. If the driver then frees and
re-allocates the memory, however, it may end up mapping a different
set of physical pages at the same offset. Based on the fake page
installed by dev_pager_getpages, vm_fault is led to believe that
the page in question is resident and returns the cached, now
outdated (incorrect) physical address for the offset.

The proposed workaround aims to be effective and non-intrusive; the
idea is to extend the msync system call to support invalidating
cached pages for objects of type OBJT_DEVICE. This solution appears
to work well, but I'm aware that there might be better solutions
for the problem, or concerns with this proposed solution.
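
To make the failure mode easier to follow, here is a small
user-space sketch in C. It is a toy model under stated assumptions,
not kernel code: struct alloc, struct fake_page, drv_mmap(), fault()
and invalidate() are names made up for this illustration, and the
cache only loosely imitates what dev_pager_getpages and vm_fault do
with fake pages. It shows a per-page offset lookup, the stale result
after a free/re-allocate at the same offset, and how an explicit
invalidation (the role the msync extension would play) clears it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define NPAGES      4    /* pages per toy allocation */
#define CACHE_SLOTS 16

/* Hypothetical driver-side record of one DMA allocation. */
struct alloc {
    uintptr_t base;          /* kernel virtual base, used as mmap offset */
    uint64_t  phys[NPAGES];  /* non-contiguous physical page addresses */
};

/* Hypothetical stand-in for a cached "fake page" (offset -> phys). */
struct fake_page {
    bool      valid;
    uintptr_t offset;
    uint64_t  phys;
};

static struct fake_page cache[CACHE_SLOTS];

/* Toy mmap entry point: called once per page, resolves offset -> phys. */
uint64_t drv_mmap(const struct alloc *a, uintptr_t offset)
{
    size_t idx = (size_t)((offset - a->base) >> PAGE_SHIFT);

    assert(offset >= a->base && idx < NPAGES);
    return a->phys[idx];
}

/* Toy fault path: a cached entry wins, the driver is not asked again. */
uint64_t fault(const struct alloc *a, uintptr_t offset)
{
    struct fake_page *fp = &cache[(offset >> PAGE_SHIFT) % CACHE_SLOTS];

    if (fp->valid && fp->offset == offset)
        return fp->phys;             /* page looks resident */
    fp->valid = true;
    fp->offset = offset;
    fp->phys = drv_mmap(a, offset);  /* first touch: consult the driver */
    return fp->phys;
}

/* Toy analogue of invalidating cached pages for a range of offsets. */
void invalidate(uintptr_t start, size_t npages)
{
    for (size_t i = 0; i < npages; i++) {
        uintptr_t off = start + ((uintptr_t)i << PAGE_SHIFT);
        struct fake_page *fp = &cache[(off >> PAGE_SHIFT) % CACHE_SLOTS];

        if (fp->valid && fp->offset == off)
            fp->valid = false;
    }
}
```

In this model, faulting on an offset of a first allocation, then
faulting on the same offset after a second allocation reuses the
base, still yields the first allocation's physical address; only
after invalidate() over the range does the lookup reach the
driver again.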

(*) This is true for AGP core logics currently used in i386
    systems, but this may change with future implementations of
    AGP 3.0. It is not true on PPC or ia64; AGP chipsets such as
    the 460GX (Itanium) do not translate CPU accesses to AGP memory
    through the aperture.

diff -ru nvidia/sys/4.5/vm/vm_map.c sys_/vm/vm_map.c
--- nvidia/sys/4.5/vm/vm_map.c  Fri Mar  8 09:22:20 2002
+++ sys_/vm/vm_map.c    Wed Jun 12 18:45:45 2002
@@ -1775,14 +1775,17 @@
                        OFF_TO_IDX(offset),
                        OFF_TO_IDX(offset + size + PAGE_MASK),
                        flags);
-               if (invalidate) {
-                   /*vm_object_pip_wait(object, "objmcl");*/
-                   vm_object_page_remove(object,
-                       OFF_TO_IDX(offset),
-                       OFF_TO_IDX(offset + size + PAGE_MASK),
-                       FALSE);
-               }
                VOP_UNLOCK(object->handle, 0, curproc);
+               vm_object_deallocate(object);
+           }
+           if (object && invalidate &&
+              ((object->type == OBJT_VNODE) ||
+               (object->type == OBJT_DEVICE))) {
+               vm_object_reference(object);
+               vm_object_page_remove(object,
+                   OFF_TO_IDX(offset),
+                   OFF_TO_IDX(offset + size + PAGE_MASK),
+                   FALSE);
                vm_object_deallocate(object);
            }
            start += size;

-- 
christian zander
zander@minion.de