Date:      Thu, 25 Jul 2024 14:47:16 -0500
From:      Jake Freeland <jake@technologyfriends.net>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: FreeBSD hugepages
Message-ID:  <35da66f9-b913-45ea-90f4-16a2fa072848@technologyfriends.net>
In-Reply-To: <ZqKhP0aR0fb_f6XE@kib.kiev.ua>
References:  <1ced4290-4a31-4218-8611-63a44c307e87@technologyfriends.net> <ZqKhP0aR0fb_f6XE@kib.kiev.ua>

On 7/25/24 14:02, Konstantin Belousov wrote:
> On Thu, Jul 25, 2024 at 01:46:17PM -0500, Jake Freeland wrote:
>> Hi there,
>>
>> I have been steadily working on bringing Data Plane Development Kit (DPDK)
>> on FreeBSD up to date with the Linux version. The most significant hurdle so
>> far has been supporting concurrent DPDK processes, each with their own
>> contiguous memory regions.
>>
>> These contiguous regions are used by DPDK as a heap for allocating DMA
>> buffers and other miscellaneous resources. Retrieving the underlying memory
>> and mapping these regions is currently different on Linux and FreeBSD:
>>
>> On Linux, hugepages are fetched from the kernel's pre-allocated hugepage
>> pool and are mapped into virtual address space on DPDK initialization. Since
>> the hugepages exist in a pool, multiple processes can reserve their own
>> hugepages and operate concurrently.
>>
>> On FreeBSD, DPDK uses an in-house contigmem kernel module that reserves a
>> large contiguous region of memory on load. During DPDK initialization, the
>> entire region is mapped into virtual address space. This leaves no memory
>> for another independent DPDK process, so only one process can operate at a
>> time.
>>
>> I could modify the DPDK contigmem module to mimic Linux's hugepages, but I
>> thought it would be better to integrate and upstream a hugepage-like
>> interface directly in the FreeBSD kernel source. I am writing this email to
>> see if anyone has any advice on the matter. I did not see any previous
>> attempts at this in Phabricator or the commit log, but it is possible that I
>> missed it. I have read about transparent superpage promotion, but that seems
>> like a different mechanism altogether.
>>
>> At a quick glance, the implementation seems straightforward: read some
>> loader tunables, allocate persistent hugepages at boot time, and create a
>> pseudo filesystem that supports creating and mapping hugepages. I could be
>> underestimating the magnitude of this task, but that is why I'm asking for
>> thoughts and advice :)
>>
>> For reference, here is Linux's documentation on hugepages:
>> https://docs.kernel.org/admin-guide/mm/hugetlbpage.html
> Are POSIX shm largepage objects enough? (They were developed to support
> DPDK.)  Look for shm_create_largepage(3).
Yes, shm_create_largepage(2) looks promising, but I would like the
ability to allocate these largepages at boot time, when memory
fragmentation is at a minimum. Perhaps a couple of sysctl tunables could
be added to the vm.largepages node to specify a page size and allocate
some number of pages at boot?
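
For readers unfamiliar with the existing interface, here is a minimal
sketch of how a process might obtain largepage-backed memory today with
shm_create_largepage, as documented in shm_open(2). The helper names
(pick_psind, round_up, map_largepages) and the fixed array size of 8 are
assumptions made for illustration; they are not part of DPDK or FreeBSD.
The FreeBSD-specific part is guarded so the helpers still compile
elsewhere.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: index of the smallest supported page size that
 * is >= want, or -1 if none is large enough. */
static int
pick_psind(const size_t *sizes, int n, size_t want)
{
	for (int i = 0; i < n; i++)
		if (sizes[i] >= want)
			return (i);
	return (-1);
}

/* Hypothetical helper: round len up to a multiple of pagesz. */
static size_t
round_up(size_t len, size_t pagesz)
{
	assert(pagesz != 0);
	return ((len + pagesz - 1) / pagesz * pagesz);
}

#ifdef __FreeBSD__
/* Sketch: map at least len bytes backed by pages of at least
 * want_pagesz bytes. Returns NULL on any failure. */
static void *
map_largepages(size_t len, size_t want_pagesz)
{
	size_t sizes[8];	/* assumed to be enough entries */
	int n, psind, fd;
	void *p;

	n = getpagesizes(sizes, 8);
	if (n <= 0)
		return (NULL);
	psind = pick_psind(sizes, n, want_pagesz);
	if (psind <= 0)		/* index 0 is the base page size */
		return (NULL);

	fd = shm_create_largepage(SHM_ANON, O_RDWR, psind,
	    SHM_LARGEPAGE_ALLOC_DEFAULT, 0600);
	if (fd < 0)
		return (NULL);

	/* The object size must be a multiple of the large page size. */
	len = round_up(len, sizes[psind]);
	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return (NULL);
	}
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	return (p == MAP_FAILED ? NULL : p);
}
#endif
```

In this sketch the large pages are allocated when the object is sized
with ftruncate, which is the run-time allocation that boot-time
reservation would be meant to avoid.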

It seems Linux had an interface similar to shm_create_largepage(2) back 
in v2.5, but they removed it in favor of their hugetlbfs filesystem. It 
would be nice to stay close to the file-backed Linux interface to 
maximize code sharing in userspace. It looks like the foundation for 
hugepages is there, but the interface for allocation and access needs to 
be extended.
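
For comparison, a rough sketch of the file-backed Linux pattern referred
to above: a file is created on a hugetlbfs mount and mapped with mmap.
The mount point /dev/hugepages, the file name, and the helper names
(hugefile_path, map_hugefile) are assumptions chosen for illustration;
actual mount points depend on system configuration.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write "<mount>/<name>" into buf.
 * Returns 0 on success, -1 if the result would not fit. */
static int
hugefile_path(char *buf, size_t buflen, const char *mount, const char *name)
{
	int n;

	assert(buf != NULL && buflen > 0);
	n = snprintf(buf, buflen, "%s/%s", mount, name);
	return ((n < 0 || (size_t)n >= buflen) ? -1 : 0);
}

/* Sketch: map len bytes of a file that lives on a hugetlbfs mount.
 * len is assumed to be a multiple of the huge page size.
 * Returns NULL on any failure. */
static void *
map_hugefile(const char *path, size_t len)
{
	void *p;
	int fd;

	fd = open(path, O_CREAT | O_RDWR, 0600);
	if (fd < 0)
		return (NULL);
	/* On hugetlbfs the mapping itself reserves pages from the pool. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	return (p == MAP_FAILED ? NULL : p);
}
```

A second process that opens and maps the same path shares the same
pages, while a process that uses a different file name gets its own
pages from the pool.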

Jake Freeland


