Date: Mon, 1 Jan 2018 22:05:53 +0100
From: Vincenzo Maffione <v.maffione@gmail.com>
To: Charlie Smurthwaite <charlie@atech.media>
Cc: "freebsd-net@freebsd.org" <net@freebsd.org>
Subject: Re: Linux netmap memory allocation
Message-ID: <CA%2B_eA9hxQuej8L3SdY%2BhgpnDH3tccgsqOBtw1S=RkvURxu=Ktg@mail.gmail.com>
In-Reply-To: <6c5de1ed-0545-31b3-d0e2-4258fa4ccf1c@atech.media>
References: <7b85fc73-9cc8-0a60-5264-d26f47af5eae@atech.media>
 <CA%2B_eA9hthoig%2B_UZQNZhM-aBndM44f0wz-NKqWUoYpBA8Ss0jQ@mail.gmail.com>
 <6c5de1ed-0545-31b3-d0e2-4258fa4ccf1c@atech.media>

2018-01-01 17:14 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:

> Hi,
>
> Thank you for your reply. I was able to resolve this.
>
> 1) I do indeed open one FD per NIC.
> 2) I no longer specify nr_arg1, nr_arg2 nor nr_arg3. Instead I just verify
>    that all NICs return with identical nr_arg2 so that the memory is shared
>    between them.
> 3) I properly initialized my memory; my failure to do so was causing me a
>    lot of confusion.
>
> The resulting memory space is large enough for all the NICs, and
> everything works perfectly with zero-copy forwarding, great!
>
> The only thing I am still having trouble with is the ability to
> simultaneously trigger a TX and an RX sync on all NICs. I have tried
> select, poll, and epoll, and in all cases, RX rings are updated but TX
> rings are not and TX packets are not pushed out (this occurs using both
> native and emulated netmap modes). I notice the documentation says "Note
> that on epoll and kqueue, NETMAP_NO_TX_POLL and NETMAP_DO_RX_POLL only have
> an effect when some event is posted for the file descriptor.", but the
> behaviour seems the same on poll and select as well as epoll. Perhaps this
> is a Linux-specific implementation detail?
>
> I have also found that all of these mechanisms seem to incur a very high
> cost in terms of CPU time (making them no more efficient than busy waiting
> at 1Mpps+). My current approach is as follows, but I feel like there should
> be a better option:
>
> for(int n=0; n<NIC_COUNT; n++) {
>     // usleep(10); // More CPU time seems to be saved with a careful sleep than with select/poll/epoll
>     ioctl(fds[n], NIOCTXSYNC);
>     ioctl(fds[n], NIOCRXSYNC);
>     rxring = rxrings[n];
>     while (!nm_ring_empty(rxring)) {
>         // Forward any packets waiting in this NIC's RX ring to the appropriate TX ring
>     }
> }
>

If you are using poll() or select() you should not use ioctl(NIOC*XSYNC),
as the txsync/rxsync operations are automatically performed within the
poll()/select() syscall (at least assuming you did not specify
NETMAP_NO_TX_POLL).
Also, whether netmap calls or does not call txsync/rxsync on certain rings
depends on the parameters passed to nm_open().
Make sure you check for nm_ring_space(txring) when forwarding.

Cheers,
  Vincenzo

> Thanks again,
>
> Charlie
>
>
> On 01/01/18 15:40, Vincenzo Maffione wrote:
>
> Hi,
>   If you have 32 NICs you should open 32 netmap file descriptors (and you
> should not specify 64 in nr_arg1 or 256 in nr_arg3, this is for different
> use cases). Also, since you want to do zero-copy you must not specify a
> separate memory area (nr_arg2), but use the same one.
> You may want to use the high level API nm_open()
> https://github.com/luigirizzo/netmap/blob/master/sys/net/netmap_user.h#L307
>
> You may also want to look at the netmap tutorial to get a better idea of
> how the API works (https://github.com/vmaffione/netmap-tutorial).
>
> Cheers,
>   Vincenzo
>
> 2017-12-28 18:34 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:
>
>> Hi,
>>
>> I'm just starting to use netmap and it is my intention to do zero-copy
>> forwarding of frames between a large number of NICs. I am using Intel
>> i350 (igb) on Linux. I therefore require a large memory area for rings
>> and buffers.
>>
>> My calculation:
>> 32 NICs * 2 rings (TX+RX) * 256 frames * 2048 bytes = 32MB
>>
>> I am currently having a problem (or perhaps just a misunderstanding)
>> regarding allocation of this memory. I am attempting to use the
>> following code:
>>
>> void thread_main(int thread_id) {
>>     struct nmreq req;  // A struct for the netmap request
>>     int fd;            // File descriptor for netmap socket
>>     void * mem;        // Pointer to allocated memory area
>>
>>     fd = open("/dev/netmap", 0);      // Open a generic netmap socket
>>     strcpy(req.nr_name, "enp8s0f0");  // Copy NIC name into request
>>     req.nr_version = NETMAP_API;      // Set version number
>>     req.nr_flags = NR_REG_ONE_NIC;    // We will be using a single hw ring
>>
>>     // Select ring 0, disable TX on poll
>>     req.nr_ringid = NETMAP_NO_TX_POLL | NETMAP_HW_RING | 0;
>>
>>     // Ask for 64 additional rings to be allocated (32 * (TX+RX))
>>     req.nr_arg1 = 64;
>>
>>     // Allocate a separate memory area for each thread
>>     req.nr_arg2 = 10 + thread_id;
>>
>>     // Ask for additional buffers (256 per ring)
>>     req.nr_arg3 = 64*256;
>>
>>     // Initialize port
>>     ioctl(fd, NIOCREGIF, &req);
>>
>>     // Check the allocated memory size
>>     printf("memsize: %u\n", req.nr_memsize);
>>     // Check the allocated memory area
>>     printf("nr_arg2: %u\n", req.nr_arg2);
>> }
>>
>> The output is as follows:
>>
>> memsize: 4206859
>> nr_arg2: 10
>>
>> This is far short of the amount of memory I am hoping to be allocated.
>> Am I doing something wrong, or is this simply an indication that the
>> driver is unwilling to allocate more than 4MB?
>>
>> A secondary (related) problem is that if I don't set arg1,arg2,arg3 in
>> my code (i.e. they will be zero), then I get varying output (it varies
>> between each of the following):
>>
>> memsize: 4206843
>> nr_arg2: 0
>>
>> memsize: 343019520
>> nr_arg2: 1
>>
>> Any pointers would be appreciated. Thanks!
>>
>> Charlie
>>
>>
>> Charlie Smurthwaite
>> Technical Director
>>
>> tel. email. charlie@atech.media web. https://atech.media
>>
>> This e-mail has been sent by aTech Media Limited (or one of its
>> associated group companies, Dial 9 Communications Limited or Viaduct
>> Hosting Limited). Its contents are confidential therefore if you have
>> received this message in error, we would appreciate it if you could let
>> us know and delete the message. aTech Media Limited is a UK limited
>> company, registration number 5523199. Dial 9 Communications Limited is a
>> UK limited company, registration number 7740921. Viaduct Hosting Limited
>> is a UK limited company, registration number 8514362. All companies are
>> registered at Unit 9 Winchester Place, North Street, Poole, Dorset,
>> BH15 1NX.
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>
>
>
> --
> Vincenzo Maffione
>
>

--
Vincenzo Maffione
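
The two suggestions in the reply above (let a single poll() call drive the
syncs for every descriptor, and never forward more than the TX ring can
take) can be sketched roughly as follows. This is only an illustration
under stated assumptions, not code from the thread: arm_pollfds(),
forward_budget() and wait_for_traffic() are hypothetical helper names, the
descriptors are assumed to be netmap file descriptors that were already
registered, and requesting POLLOUT only when a backlog exists is an
assumption borrowed from common poll() usage. The block uses only <poll.h>
so that it compiles without the netmap headers.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/*
 * Hypothetical helper: prepare one pollfd entry per NIC descriptor.
 * POLLIN is always requested; POLLOUT is requested only for NICs that
 * still have packets waiting to be sent (tx_backlog[i] != 0).
 * Assumption: with netmap, and without NETMAP_NO_TX_POLL, the TX sync
 * is performed inside poll(), as described in the reply.
 */
static void arm_pollfds(struct pollfd *pfds, const int *fds,
                        const int *tx_backlog, size_t n)
{
    assert(pfds != NULL && fds != NULL && tx_backlog != NULL);
    for (size_t i = 0; i < n; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        if (tx_backlog[i])
            pfds[i].events |= POLLOUT;
        pfds[i].revents = 0;
    }
}

/*
 * Hypothetical helper: number of packets that may be moved in one burst.
 * In a real forwarder rx_avail would be the number of slots waiting in
 * the RX ring and tx_space would come from nm_ring_space(txring).
 */
static unsigned int forward_budget(unsigned int rx_avail,
                                   unsigned int tx_space)
{
    return rx_avail < tx_space ? rx_avail : tx_space;
}

/*
 * Hypothetical helper: one blocking wait covering all NICs. It replaces
 * the per-descriptor ioctl(NIOCTXSYNC)/ioctl(NIOCRXSYNC) pair.
 * Returns what poll() returns: ready count, 0 on timeout, -1 on error.
 */
static int wait_for_traffic(struct pollfd *pfds, size_t n, int timeout_ms)
{
    return poll(pfds, (nfds_t)n, timeout_ms);
}
```

In the forwarding loop quoted above, the two ioctl() calls would go away:
the loop would call arm_pollfds() and wait_for_traffic() once per
iteration, then walk the entries whose revents has POLLIN set and move at
most forward_budget() packets from each RX ring to its TX ring.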