Date:      Mon, 1 Jan 2018 16:14:41 +0000
From:      Charlie Smurthwaite <charlie@atech.media>
To:        Vincenzo Maffione <v.maffione@gmail.com>
Cc:        "freebsd-net@freebsd.org" <net@freebsd.org>
Subject:   Re: Linux netmap memory allocation
Message-ID:  <6c5de1ed-0545-31b3-d0e2-4258fa4ccf1c@atech.media>
In-Reply-To: <CA%2B_eA9hthoig%2B_UZQNZhM-aBndM44f0wz-NKqWUoYpBA8Ss0jQ@mail.gmail.com>
References:  <7b85fc73-9cc8-0a60-5264-d26f47af5eae@atech.media> <CA%2B_eA9hthoig%2B_UZQNZhM-aBndM44f0wz-NKqWUoYpBA8Ss0jQ@mail.gmail.com>

Hi,

Thank you for your reply. I was able to resolve this.

1) I do indeed open one FD per NIC.
2) I no longer specify nr_arg1, nr_arg2 or nr_arg3. Instead I just 
verify that all NICs return with identical nr_arg2 so that the memory is 
shared between them.
3) I now properly initialize my memory; my failure to do so was causing 
me a lot of confusion.
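
A minimal sketch of the check in point 2, assuming the nr_arg2 value returned 
by each NIOCREGIF call has been collected into an array. The helper name 
all_ports_share_memory is hypothetical and not part of netmap, and reading 
"initialize my memory" in point 3 as zeroing struct nmreq before filling it 
in is likewise an assumption:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of netmap): returns 1 if every port
 * reported the same memory region id, 0 otherwise.
 *
 * mem_ids[i] is assumed to hold req.nr_arg2 as returned by
 * ioctl(fd, NIOCREGIF, &req) for NIC i, where req was zeroed
 * (e.g. with memset) before nr_name, nr_version and nr_flags were set.
 */
static int all_ports_share_memory(const uint16_t *mem_ids, size_t count)
{
    assert(mem_ids != NULL && count > 0);

    /* Compare every id against the first one. */
    for (size_t i = 1; i < count; i++) {
        if (mem_ids[i] != mem_ids[0])
            return 0;
    }
    return 1;
}
```

If the helper returns 0, at least one port sits in a different memory region, 
so buffers cannot be swapped between those rings without copying.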

The resulting memory space is large enough for all the NICs, and 
everything works perfectly with zero-copy forwarding, great!
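
The size check can be sketched as follows, using the figures from the 
original calculation quoted below (32 NICs, 2 rings each, 256 slots per ring, 
2048-byte buffers). The function names are hypothetical, and the result is 
only a lower bound, since the shared region also holds ring and interface 
metadata:

```c
#include <stdint.h>

/*
 * Hypothetical helper: bytes of packet buffers needed for the given
 * layout. This counts buffers only, not ring headers or other metadata.
 */
static uint64_t required_buffer_bytes(uint64_t nics, uint64_t rings_per_nic,
                                      uint64_t slots_per_ring,
                                      uint64_t buf_size)
{
    return nics * rings_per_nic * slots_per_ring * buf_size;
}

/*
 * Hypothetical helper: returns 1 if the reported region size
 * (req.nr_memsize after NIOCREGIF) covers the buffer requirement.
 */
static int memsize_is_sufficient(uint64_t memsize, uint64_t required)
{
    return memsize >= required;
}
```

With the numbers from this thread, the requirement is 33554432 bytes (32 MiB): 
a reported memsize of 343019520 passes the check, while 4206859 does not.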

The only thing I am still having trouble with is the ability to 
simultaneously trigger a TX and an RX sync on all NICs. I have tried 
select, poll, and epoll, and in all cases, RX rings are updated but TX 
rings are not and TX packets are not pushed out (this occurs using both 
native and emulated netmap modes). I notice the documentation says "Note 
that on epoll and kqueue, NETMAP_NO_TX_POLL and NETMAP_DO_RX_POLL only 
have an effect when some event is posted for the file descriptor.", but 
the behaviour seems the same on poll and select as well as epoll. 
Perhaps this is a Linux-specific implementation detail?

I have also found that all of these mechanisms seem to incur a very high 
cost in terms of CPU time (making them no more efficient than busy 
waiting at 1Mpps+). My current approach is as follows, but I feel like 
there should be a better option:

     for (int n = 0; n < NIC_COUNT; n++) {
       // usleep(10); // More CPU time seems to be saved with a careful
       //             // sleep than with select/poll/epoll
       ioctl(fds[n], NIOCTXSYNC);
       ioctl(fds[n], NIOCRXSYNC);
       rxring = rxrings[n];
       while (!nm_ring_empty(rxring)) {
         // Forward any packets waiting in this NIC's RX ring to the
         // appropriate TX ring
       }
     }
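
One way to make the "careful sleep" above less of a fixed guess is an idle 
backoff: sleep only when a full pass over the NICs forwarded nothing, and 
lengthen the sleep while the rings stay empty. This is a sketch under that 
assumption, not something netmap provides; next_sleep_us and its parameters 
are hypothetical names:

```c
#include <stdint.h>

/*
 * Hypothetical idle backoff: given the previous sleep (cur_us) and the
 * number of packets forwarded in the last pass over all NICs, return
 * how long to sleep before the next pass, in microseconds.
 * Assumes 0 < min_us <= max_us.
 */
static uint32_t next_sleep_us(uint32_t cur_us, unsigned forwarded,
                              uint32_t min_us, uint32_t max_us)
{
    if (forwarded > 0)
        return 0;               /* traffic seen: sync again immediately */
    if (cur_us == 0)
        return min_us;          /* first idle pass: shortest sleep */
    if (cur_us >= max_us / 2)
        return max_us;          /* doubling would reach or pass the cap */
    return cur_us * 2;          /* still idle: back off further */
}
```

In the loop above, the return value would be fed back in as cur_us on the 
next pass, with usleep() called only when it is non-zero. The trade-off is 
added latency of up to max_us for the first packet after an idle period.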

Thanks again,

Charlie


On 01/01/18 15:40, Vincenzo Maffione wrote:
> Hi,
>   If you have 32 NICs you should open 32 netmap file descriptors (and 
> you should not specify 64 in nr_arg1 or 256 in nr_arg3; these are for 
> different use cases). Also, since you want to do zero-copy you must not 
> specify a separate memory area (nr_arg2), but use the same one.
> You may want to use the high level API nm_open() 
> https://github.com/luigirizzo/netmap/blob/master/sys/net/netmap_user.h#L307
>
> You may also want to look at the netmap tutorial to get a better idea 
> of how the API works (https://github.com/vmaffione/netmap-tutorial).
>
> Cheers,
>   Vincenzo
>
> 2017-12-28 18:34 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:
>
>     Hi,
>
>     I'm just starting to use netmap and it is my intention to do zero-copy
>     forwarding of frames between a large number of NICs. I am using Intel
>     i350 (igb) on Linux. I therefore require a large memory area for rings
>     and buffers.
>
>     My calculation:
>     32 NICs * 2 rings (TX+RX) * 256 frames * 2048 bytes = 32MB
>
>     I am currently having a problem (or perhaps just a misunderstanding)
>     regarding allocation of this memory. I am attempting to use the
>     following code:
>
>     void thread_main(int thread_id) {
>       struct nmreq req; // A struct for the netmap request
>       int fd;           // File descriptor for netmap socket
>       void * mem;       // Pointer to allocated memory area
>
>       fd = open("/dev/netmap", 0);     // Open a generic netmap socket
>       strcpy(req.nr_name, "enp8s0f0"); // Copy NIC name into request
>       req.nr_version = NETMAP_API;     // Set version number
>       req.nr_flags = NR_REG_ONE_NIC;   // We will be using a single hw ring
>
>       // Select ring 0, disable TX on poll
>       req.nr_ringid = NETMAP_NO_TX_POLL | NETMAP_HW_RING | 0;
>
>       // Ask for 64 additional rings to be allocated (32 * (TX+RX))
>       req.nr_arg1 = 64;
>
>       // Allocate a separate memory area for each thread
>       req.nr_arg2 = 10 + thread_id;
>
>       // Ask for additional buffers (256 per ring)
>       req.nr_arg3 = 64*256;
>
>       // Initialize port
>       ioctl(fd, NIOCREGIF, &req);
>
>       // Check the allocated memory size
>       printf("memsize: %u\n", req.nr_memsize);
>       // Check the allocated memory area
>       printf("nr_arg2: %u\n", req.nr_arg2);
>     }
>
>     The output is as follows:
>
>     memsize: 4206859
>     nr_arg2: 10
>
>     This is far short of the amount of memory I am hoping to be allocated.
>     Am I doing something wrong, or is this simply an indication that the
>     driver is unwilling to allocate more than 4MB?
>
>     A secondary (related) problem is that if I don't set arg1, arg2 and
>     arg3 in my code (i.e. they will be zero), then I get varying output
>     (it varies between each of the following):
>
>     memsize: 4206843
>     nr_arg2: 0
>
>     memsize: 343019520
>     nr_arg2: 1
>
>     Any pointers would be appreciated. Thanks!
>
>     Charlie
>
>
>
>
>
>
> -- 
> Vincenzo Maffione



