Date:      Sat, 7 Mar 2015 17:35:27 -0800
From:      Adrian Chadd <adrian@freebsd.org>
To:        "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
Subject:   A quick dumpster dive through the busdma allocation path..
Message-ID:  <CAJ-Vmo=wJ85Q3U9sLaV=FJ=8iSAWqBL-XanAUZZKVhsArhyn3g@mail.gmail.com>

Hi,


On a lark, I spent today doing a NUMA dumpster dive through the busdma
allocation path. The intent was to tag a dmat with a NUMA domain early
on (say, in PCI), so devices would inherit the domain in the tags they
create, and busdma allocations could then come from that domain.
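
To make that concrete, here's roughly the shape I have in mind. None of
this is committed anywhere; the 'domain' member and the setter below are
made-up names for illustration only:

    /*
     * Sketch only: the 'domain' member and bus_dma_tag_set_domain() are
     * illustrative names, not existing KPI.
     */
    struct bus_dma_tag_common {
            /* ... existing fields ... */
            int     domain;         /* NUMA domain hint; -1 == don't care */
    };

    /* In the PCI attach path, seed the parent dma tag from the bus: */
    int domain;

    if (bus_get_domain(dev, &domain) != 0)
            domain = -1;
    bus_dma_tag_set_domain(bus_get_dma_tag(dev), domain);

    /*
     * bus_dma_tag_create() would then copy parent->domain into the child
     * tag, so drivers get the right domain without doing anything special.
     */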

Here's how it looks so far, dirty as it is:

* I've grabbed the vm phys first touch allocator stuff from -9, and
shoehorned it into -head;
* I've added iterators to the vm_phys code, so there's some concept of
configurable policies;
* and there's an iterator init function that can take a specific
domain, rather than PCPU_GET(domain) (rough sketch of the iterator bits below).
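
The iterator bits look roughly like this; the struct layout and the
names are just what's in my local patch and are very much subject to
change:

    /* Sketch of the iterator idea; names are from my local patch. */
    struct vm_domain_iterator {
            int     policy;         /* first-touch, round-robin, fixed, ... */
            int     domain;         /* fixed domain, or -1 */
            int     cursor;         /* next domain to hand out (round-robin) */
    };

    /* Use the thread/system default policy. */
    void    vm_domain_iterator_init(struct vm_domain_iterator *vi);

    /* Walk domains starting at (and preferring) the given domain. */
    void    vm_domain_iterator_init_domain(struct vm_domain_iterator *vi,
                int domain);

    /* Returns the next domain to try, or -1 once they're exhausted. */
    int     vm_domain_iterator_next(struct vm_domain_iterator *vi);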

That works well enough for first-touch and round-robin userland page
allocation. It'd be easy to extend it so each proc/thread had its own
NUMA allocation policy, but I haven't done that (a rough sketch of what
I mean is below). I've run memory bandwidth / math benchmarks and
abused pcm.x to check whether the allocations are working, and the
first-touch allocator is indeed doing what's expected.
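
For the per-proc/thread policy bit, the extension I'm picturing is
roughly this (again, the field and constant names are made up):

    /* A policy blob that could hang off struct proc / struct thread. */
    struct vm_domain_policy {
            int     policy;         /* VM_POLICY_FIRST_TOUCH, _ROUND_ROBIN, _FIXED */
            int     domain;         /* only meaningful for a fixed policy */
    };

    /*
     * vm_domain_iterator_init() would then consult the current thread's
     * policy (falling back to a global default) instead of always using
     * the system-wide behaviour.
     */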

But I'm much more interested in device allocation in the kernel for
${WORK}, so I wanted to give that a whirl and see what the minimum
amount of work to support it would be.

* the vm_phys routines now have _domain() versions that take a domain
id, or -1 for "system/thread default";
* the kmem_alloc routines now have _domain() versions that take a
domain id, or -1;
* malloc() has a _domain() version that takes a domain id or -1
(sketch of these new entry points below);
* busdma for x86 has a 'domain' tag that I'm populating as part of
PCI, based on bus_get_domain(). That's just a total hack, but hey, it
worked well enough for testing.
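
For reference, the new entry points look roughly like this. Treat the
signatures as a sketch of my local patch, not anything that exists in
the tree:

    /*
     * Sketch of the _domain() variants; domain == -1 means "no
     * preference", i.e. fall back to the existing behaviour.
     */
    vm_page_t       vm_page_alloc_contig_domain(vm_object_t object,
                        vm_pindex_t pindex, int domain, int req, u_long npages,
                        vm_paddr_t low, vm_paddr_t high, u_long alignment,
                        vm_paddr_t boundary, vm_memattr_t memattr);

    vm_offset_t     kmem_alloc_contig_domain(struct vmem *vmem, int domain,
                        vm_size_t size, int flags, vm_paddr_t low,
                        vm_paddr_t high, u_long alignment,
                        vm_paddr_t boundary, vm_memattr_t memattr);

    void            *malloc_domain(unsigned long size,
                        struct malloc_type *type, int domain, int flags);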

I've plumbed the domain id down through UMA far enough for large page
allocation, and that all worked fine.
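
The UMA plumbing is just threading the domain through to the backend
allocator. Paraphrasing my local change (kmem_malloc_domain() here is my
_domain() flavour of kmem_malloc(), same naming scheme as the list
above):

    /* The backend allocator hook grows a domain argument... */
    typedef void *(*uma_alloc)(uma_zone_t zone, vm_size_t size, int domain,
        uint8_t *pflag, int wait);

    /* ...and e.g. the page_alloc() backend just passes it along: */
    static void *
    page_alloc(uma_zone_t zone, vm_size_t bytes, int domain, uint8_t *pflag,
        int wait)
    {

            *pflag = UMA_SLAB_KMEM;
            return ((void *)kmem_malloc_domain(kmem_arena, domain, bytes,
                wait));
    }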

However, I hit a roadblock here:

t5nex0: alloc_ring: numa domain: 1; alloc len: 65536
bounce_bus_dmamem_alloc: dmat domain: 1
bounce_bus_dmamem_alloc:  kmem_alloc_contig_domain
vm_page_alloc_contig_domain: called; domain=1

.. so that's okay.

Then vm_page_alloc_contig_domain() calls vm_reserv_alloc_contig(), and
that returns pages from an existing reservation region, so
vm_page_alloc_contig_domain() never gets as far as calling
vm_phys_alloc_contig_domain() to get at the physical memory itself.
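
Paraphrasing the shape of the problem (this is a simplification of the
vm_page_alloc_contig() path, not a literal quote of the code):

    #if VM_NRESERVLEVEL > 0
            /*
             * The reservation layer knows nothing about domains, so this
             * happily hands back pages from whatever domain the existing
             * reservation lives in...
             */
            m = vm_reserv_alloc_contig(object, pindex, npages, low, high,
                alignment, boundary);
            if (m == NULL)
    #endif
                    /*
                     * ...and the domain is only honoured down here, which
                     * we never reach once a reservation covers the object.
                     */
                    m = vm_phys_alloc_contig_domain(domain, npages, low,
                        high, alignment, boundary);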

That's where I'm stuck. Right now it seems that, since there's no
domain awareness in the vm_reserv code, it just returns an existing
region from whatever domain that particular region happened to come from.

So, I'm done with the dive for now. It looks like the VM reservation
code may need to learn about the existence of domains? Or would it be
cleaner/possible to have multiple kmem_objects / kernel_objects, one
per domain?
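
If teaching the reservation code about domains is the way to go, the
minimal (ugly) version might be something like the below. vm_page_domain()
is a stand-in for whatever maps a page back to its domain, and this
ignores the detail that the rejected pages would need handing back to
the reservation:

    /*
     * Hand-waving sketch, not a patch: reject a reservation hit that
     * lands in the wrong domain and fall back to the domain-aware
     * physical allocator instead.
     */
    m = vm_reserv_alloc_contig(object, pindex, npages, low, high,
        alignment, boundary);
    if (m != NULL && domain != -1 && vm_page_domain(m) != domain) {
            /* Wrong domain; pretend the reservation had nothing for us. */
            m = NULL;
    }
    if (m == NULL)
            m = vm_phys_alloc_contig_domain(domain, npages, low, high,
                alignment, boundary);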

Thanks,


-adrian


