Date:      Thu, 19 Feb 2015 17:15:32 -0800
From:      Adrian Chadd <adrian@freebsd.org>
To:        "K. Macy" <kmacy@freebsd.org>
Cc:        John Baldwin <john@baldwin.cx>, Alan Cox <alc@freebsd.org>, John-Mark Gurney <jmg@funkthat.com>, Konstantin Belousov <kib@freebsd.org>, "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
Subject:   Re: getting NUMA into the tree (userland most interesting for me)
Message-ID:  <CAJ-Vmok4peyq95o7%2BT7EkEEVb2ZqU3Y0pd_9kTMyBrxuhvX05w@mail.gmail.com>
In-Reply-To: <CAHM0Q_NdiGUD35Fx3%2B%2B=mtZjHdj9qDTSRCXwgUV4vSCb6z4ATA@mail.gmail.com>
References:  <20150219041012.GJ1953@funkthat.com> <CAHM0Q_NXfN-1jrBEOkQPw67fqL8yp9XBq8PUzJAB6nt89=GvrA@mail.gmail.com> <83795148.GHHzUeRKp6@ralph.baldwin.cx> <CAHM0Q_NdiGUD35Fx3%2B%2B=mtZjHdj9qDTSRCXwgUV4vSCb6z4ATA@mail.gmail.com>

On 19 February 2015 at 14:49, K. Macy <kmacy@freebsd.org> wrote:
>>> I personally don't think the infrastructure is far enough along that
>>> this is near to being an interesting value proposition. However, that
>>> said, I do believe that maintaining linux compatibility is important.
>>> Thus I would be for adding it to the linux compatibility layer and
>>> export it on the FreeBSD API side purely as an SPI until consensus is
>>> reached.
>>
>> Yes, I think we have a fair bit to do in the kernel before we are in a
>> position to export anything truly useful to userland unfortunately.  The last
>> time I talked with Jeff about projects/numa (after the first draft of the wiki
>> page) I came away with the impression that there might be some things we can
>> pull out of that branch, but that it isn't suitable for merging upstream
>> directly.  Jeff noted that he and Alan had gone through several iterations of
>> this already (I believe at least 3 completely different policy designs) all of
>> which had their own issues.
>>
>> Outside of the VM I think that we can keep the APIs somewhat stable by having
>> this opaque policy cookie to pass around that we can redefine the guts of
>> later.  However, various parts of the VM all have to handle whatever the
>> policy defines, and while the vm_phys bits and contigmalloc() might be kind of
>> obvious to implement, higher level VM layers like kmem() and malloc() are more
>> complicated.  One thing that is in projects/numa is changes for UMA that we
>> can hopefully reuse much of, but I don't recall how much (if any) of
>> kmem/malloc is in there.  Also, while vm_phys is one of the first things to
>> do, I know that Alan and Jeff have pending patches to remove the cache queue
>> (since it is far less useful than it seems) which simplify vm_phys making it
>> easier to implement NUMA policies there, so I'm hoping we can get that in
>> sooner before having to start tearing up the VM too much.  This is why the
>> stuff I currently have is targeted at non-VM bits like interrupts as getting that
>> correct is lower-hanging fruit that might provide some gains regardless.  Even
>> once vm_phys is done I think the first thing to tackle next is contigmalloc to
>> facilitate static bus_dma allocations (descriptor rings and such) being local
>> to a device.
>>
>
> Contigmalloc improvements and cache queue removal are in the
> phabricator queue now. They are also prerequisites for per-cpu free
> page caches which are a huge scalability improvement for some
> workloads such as Netflix's.
>
> There is still a fair amount of scalability work (including Jeffr's
> per-domain pagedaemon work) that really needs to happen before we can
> seriously think about a general user-level NUMA interface.

Is there anything wrong with bringing over the low-level allocator
changes from projects/numa, so that the basics are at least there?



-adrian


