Date: Thu, 19 Feb 2015 17:40:58 -0500
From: John Baldwin <john@baldwin.cx>
To: freebsd-arch@freebsd.org
Cc: Alan Cox <alc@freebsd.org>, John-Mark Gurney <jmg@funkthat.com>, Konstantin Belousov <kib@freebsd.org>, "K. Macy" <kmacy@freebsd.org>
Subject: Re: getting NUMA into the tree (userland most interesting for me)
Message-ID: <83795148.GHHzUeRKp6@ralph.baldwin.cx>
In-Reply-To: <CAHM0Q_NXfN-1jrBEOkQPw67fqL8yp9XBq8PUzJAB6nt89=GvrA@mail.gmail.com>
References: <20150219041012.GJ1953@funkthat.com> <CAHM0Q_NXfN-1jrBEOkQPw67fqL8yp9XBq8PUzJAB6nt89=GvrA@mail.gmail.com>

On Thursday, February 19, 2015 01:32:13 PM K. Macy wrote:
> On Wed, Feb 18, 2015 at 8:10 PM, John-Mark Gurney <jmg@funkthat.com> wrote:
> > I would like to help drive getting NUMA into the tree.  Specifically,
> > getting userland allocations to be done from a specified domain.
> >
> > I've looked at the projects/numa tree, but it appears that not much was
> > done to get userland mappings to be NUMA aware.
> >
> > How are we going to do this?  Do people have code to do this?
> >
> > I've looked at how Linux does this, at least from a programming
> > interface.  They use mmap to create the mapping, and then use the call
> > mbind to tell the kernel where to handle the allocations.  Is this
> > what people are thinking?
> >
> > I've checked the wiki status, and the userland section is quite
> > empty.
>
> I personally don't think the infrastructure is far enough along that
> this is near to being an interesting value proposition.  However, that
> said, I do believe that maintaining linux compatibility is important.
> Thus I would be for adding it to the linux compatibility layer and
> exporting it on the FreeBSD API side purely as an SPI until consensus is
> reached.

Yes, I think we have a fair bit to do in the kernel before we are in a
position to export anything truly useful to userland, unfortunately.  The
last time I talked with Jeff about projects/numa (after the first draft of
the wiki page) I came away with the impression that there might be some
things we can pull out of that branch, but that it isn't suitable for
merging upstream directly.  Jeff noted that he and Alan had gone through
several iterations of this already (I believe at least 3 completely
different policy designs), all of which had their own issues.

Outside of the VM I think that we can keep the APIs somewhat stable by
having this opaque policy cookie to pass around that we can redefine the
guts of later.  However, various parts of the VM all have to handle
whatever the policy defines, and while the vm_phys bits and contigmalloc()
might be kind of obvious to implement, higher level VM layers like kmem()
and malloc() are more complicated.  One thing that is in projects/numa is
changes for UMA that we can hopefully reuse much of, but I don't recall
how much (if any) of kmem/malloc is in there.

Also, while vm_phys is one of the first things to do, I know that Alan and
Jeff have pending patches to remove the cache queue (since it is far less
useful than it seems) which simplify vm_phys, making it easier to implement
NUMA policies there, so I'm hoping we can get that in sooner, before having
to start tearing up the VM too much.

This is why the stuff I currently have is targeted at non-VM bits like
interrupts, as getting that correct is lower-hanging fruit that might
provide some gains regardless.  Even once vm_phys is done I think the first
thing to tackle next is contigmalloc, to facilitate static bus_dma
allocations (descriptor rings and such) being local to a device.

-- 
John Baldwin
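
The "opaque policy cookie" idea in the message can be illustrated with a
minimal C sketch.  This is not code from projects/numa or from any FreeBSD
interface: every name below (numa_policy_t, numa_policy_create,
numa_policy_pick_domain, numa_policy_destroy) is hypothetical, and the
"preferred domain" policy is only an assumed example of what the guts might
hold.  The point is just that callers hold a pointer to an incomplete type,
so the struct layout can be redefined later without changing their code.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Public side (hypothetical): consumers outside the VM see only an
 * incomplete type and pass the handle around as a cookie.
 */
struct numa_policy;                          /* layout hidden from callers */
typedef struct numa_policy *numa_policy_t;   /* the opaque cookie */

numa_policy_t	numa_policy_create(int preferred_domain);
int		numa_policy_pick_domain(numa_policy_t p, int ndomains);
void		numa_policy_destroy(numa_policy_t p);

/*
 * Private side (hypothetical): the "guts".  These fields could be replaced
 * by a different policy design without touching any caller.
 */
struct numa_policy {
	int	preferred_domain;
};

numa_policy_t
numa_policy_create(int preferred_domain)
{
	numa_policy_t p;

	p = malloc(sizeof(*p));
	if (p != NULL)
		p->preferred_domain = preferred_domain;
	return (p);
}

/* Return the preferred domain if it is valid, otherwise fall back to 0. */
int
numa_policy_pick_domain(numa_policy_t p, int ndomains)
{
	assert(p != NULL && ndomains > 0);
	if (p->preferred_domain >= 0 && p->preferred_domain < ndomains)
		return (p->preferred_domain);
	return (0);
}

void
numa_policy_destroy(numa_policy_t p)
{
	free(p);
}
```

A caller would create a cookie once, hand it to whichever allocator needs
it, and ask it for a domain; only the private half would change if the
policy design were reworked.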
