Date: Wed, 3 Jan 2018 09:48:00 +0100
From: Vincenzo Maffione <v.maffione@gmail.com>
To: Charlie Smurthwaite <charlie@atech.media>
Cc: "freebsd-net@freebsd.org" <net@freebsd.org>
Subject: Re: Linux netmap memory allocation
Message-ID: <CA%2B_eA9g5HxE9VVFEsKW-yXAtr_8-_qSQMpyaRLNUy0zApOXydw@mail.gmail.com>
In-Reply-To: <f3f94485-2f71-26d0-5a81-10e3166d3538@atech.media>
References: <7b85fc73-9cc8-0a60-5264-d26f47af5eae@atech.media>
 <CA%2B_eA9hthoig%2B_UZQNZhM-aBndM44f0wz-NKqWUoYpBA8Ss0jQ@mail.gmail.com>
 <6c5de1ed-0545-31b3-d0e2-4258fa4ccf1c@atech.media>
 <CA%2B_eA9hxQuej8L3SdY%2BhgpnDH3tccgsqOBtw1S=RkvURxu=Ktg@mail.gmail.com>
 <da1e5904-30c8-b06b-6e7f-0bf26fc99a17@atech.media>
 <CA%2B_eA9hs-GUCRH%2B5FAs1SPyR8S8GFndq_ScgDAmJ8njgOsQBCQ@mail.gmail.com>
 <f3f94485-2f71-26d0-5a81-10e3166d3538@atech.media>

Hi Charlie,

2018-01-03 0:07 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:

> Hi Vincenzo,
>
>
>> I am using poll(), and I am not specifying NETMAP_NO_TX_POLL, and have
>> found that sometimes frames are sent only when the TX buffer is full, and
>> sometimes they are not sent at all. They are never sent as expected on
>> every invocation of poll(). If I run ioctl(NIOCTXSYNC) manually, everything
>> works correctly. I assume I have simply missed something from my nmreq.
>>
>
> I don't think you have missed anything within nmreq. I see that you are
> waiting for POLLIN only (and this is right in your router case), so poll()
> will actually invoke txsync on interface #i only when netmap intercepts an
> RX or TX interrupt on interface #i. This means that packets may stall for
> a long time in the TX rings if you don't call ioctl(TXSYNC). The manual is
> not wrong, however. You can look at the apps/bridge/bridge.c example to
> understand where this "poll automatically calls txsync" thing is useful.
>
> Thank you for the clarification. I have now altered my code to call TXSYNC
> after each iteration, but only if I have modified the TX ring for that
> interface. This seems to work perfectly. The patch can be seen at
> https://github.com/catphish/netmap-router/commit/2961ab16f14a8b2a2561c9d73f73857e523cc177
>

I see, it looks good.

>
>
>> You also mentioned: "whether netmap calls or does not call txsync/rxsync
>> on certain rings depends on the parameters passed to nm_open()". I do not
>> use the nm_open helper method, but I am extremely interested to know what
>> parameters would affect this behaviour, as this would seem very relevant to
>> my problem.
>>
>
> Yes, we do not normally use the low level interface (ioctl(REGIF)),
> because it's just simpler to use the nm_open() interface. Within the first
> parameter of nm_open() you can specify to open just one RX/TX ring pair,
> e.g. with "enp1f0s1-3".
> Then you usually want to mmap() just once (as you
> do in your program); with nm_open(), you do that with the NM_OPEN_NO_MMAP
> flag.
>
> I did look at nm_open, and even read the source of nm_open to discover how
> to implement the shared memory, but (for no good reason) I preferred to set
> up the interface manually.
>

That's ok.

>
>> If you are interested or if it helps explain my question, my complete
>> code (hopefully well commented but far from complete) can be found here:
>> https://github.com/catphish/netmap-router/blob/58a9b957c19b0a012088c491bd58bc3161a56ff1/router.c
>>
>> Specifically, if the ioctl call at line 92 is removed, the code does not
>> work (packets are not transmitted, or are only transmitted when the buffer
>> is full; which of these 2 behaviours occurs seems to be random). However, I
>> would expect it to work because I do not specify NETMAP_NO_TX_POLL, and I
>> would therefore hope that the poll() call on line 80 would have the same
>> effect.
>>
>
> Yes, that depends on when netmap_poll() is called by the kernel, which in
> turn depends on when something is ready for receive on the file descriptor.
> Looking at your program, I think you need to call ioctl(TXSYNC), at least
> because you don't want to introduce artificial/unbounded latency. However,
> since these calls are expensive, you could use them only when necessary
> (e.g. when nm_ring_space(txring) == 0 or when you actually forwarded
> some packets on txring).
>
> Per the patch above I now call TXSYNC on an interface only after pushing a
> batch of packets to it and this seems to work perfectly, at least with a
> good balance between performance and latency. If nm_ring_space(txring) == 0
> I just drop frames until the next batch. I don't TXSYNC part way through a
> batch, it hasn't yet seemed necessary, but I may need to look into this
> later.
>

Right, there are some heuristics you can try. Calling TXSYNC if you find
nm_ring_space(txring) == 0 while forwarding is a common one, as you suggest.
It can be beneficial or not, depending on your machine, NIC and workload,
so one should just try.

>
> I'm running this on a 6-core 2.8GHz Xeon with a 4-port i350-T4 NIC. I
> thought I'd just post some stats of the performance I observe using my code
> (excluding the routing table lookup as this isn't relevant to netmap). Not
> really looking for any advice here, just thought I'd share my results.
>
> All examples are with 1.488Mpps (1 x 1Gbps) input and no packet loss
> observed:
> 1 thread - CPU usage = 100%, batch size = 4
> 2 thread - CPU usage = 54% (27% x 2), batch size = 12
> 4 thread - CPU usage = 98% (25% x 4), batch size = 8
> 6 thread - CPU usage = 124% (21% x 6), batch size = 8
>
> And again with 2.976Mpps (2 x 1Gbps) input and no packet loss observed:
> 1 thread - CPU usage = 100%, batch size = 12
> 2 thread - CPU usage = 68% (34% x 2), batch size = 21
> 4 thread - CPU usage = 100% (25% x 4), batch size = 17
> 6 thread - CPU usage = 105% (18% x 6), batch size = 16
>
> These results seem excellent and demonstrate that netmap is scaling as
> expected with both threads and packet volume. The higher thread count will
> be more beneficial when I am doing more processing on each packet.
>

Yes, as you can see the batch size is very beneficial to CPU utilization
and packet rate, because poll/ioctl are kind of expensive. You could try to
achieve larger batches to possibly get better results. If you don't mind
adding a controlled latency you could experiment with adding something like
"usleep(30)" in your forwarding loop: this should lead to larger batches.

>
>
>> I hope this all makes sense, and again, I hope I have simply missed
>> something from the nmreq I pass to NIOCREGIF.
>>
>> It is worth mentioning that with the exception of this problem /
>> confusion, I am getting extremely good results from this code and netmap in
>> general.
>>
>
> That's nice to hear :)
> Your program looks simple enough that we could even add it to the examples
> (as an example of routing logic).
>
> I'd be very happy to contribute to the documentation in any way that may
> be helpful. I have added a permissive licence to my Github repository just
> in case my code is of use to anyone else. It is currently somewhat
> incomplete as an IPv4 router as it doesn't update MAC addresses on frames
> before forwarding them, and because the interface names are hardcoded, but
> when it's more complete I'd be very happy for it to be contributed to the
> examples. Of course anyone is free to use my code for any purpose too.
>
> Thanks for all your assistance! I'm happy enough with this that I will
> move on to looking at my IP routing code.
>

Ok, thanks!

Vincenzo

>
> Charlie
>
>
>
> *Charlie Smurthwaite*
> Technical Director
>
> *tel.* *email.* charlie@atech.media *web.* https://atech.media
>
> *This e-mail has been sent by aTech Media Limited (or one of its
> associated group companies, Dial 9 Communications Limited or Viaduct Hosting
> Limited). Its contents are confidential therefore if you have received this
> message in error, we would appreciate it if you could let us know and
> delete the message. aTech Media Limited is a UK limited company,
> registration number 5523199. Dial 9 Communications Limited is a UK limited
> company, registration number 7740921. Viaduct Hosting Limited is a UK
> limited company, registration number 8514362. All companies are registered
> at Unit 9 Winchester Place, North Street, Poole, Dorset, BH15 1NX.*
>

--
Vincenzo Maffione
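
The TXSYNC policy discussed in this thread (sync a TX ring only if the current batch touched it, and optionally sync early when the ring is found full instead of dropping) can be sketched as a small self-contained model. This is only an illustration under stated assumptions: it does not use the netmap API at all, every toy_* name is hypothetical, and the pretend sync drains the whole ring at once, which a real NIC does not do.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_RINGS 8

/* Toy stand-in for a netmap TX ring; NOT the real struct netmap_ring. */
struct toy_tx_ring {
    unsigned num_slots;   /* ring capacity */
    unsigned queued;      /* slots filled by the app but not yet synced */
    unsigned sent;        /* packets handed to the pretend NIC */
    unsigned dropped;     /* packets dropped because the ring was full */
    unsigned sync_calls;  /* how many times the pretend TXSYNC ran */
};

/* Free slots; plays the role nm_ring_space() plays in the thread. */
static unsigned toy_ring_space(const struct toy_tx_ring *r)
{
    return r->num_slots - r->queued;
}

/*
 * Stand-in for ioctl(fd, NIOCTXSYNC). Assumption: it drains the whole
 * ring immediately; real hardware frees slots as transmissions complete.
 */
static void toy_txsync(struct toy_tx_ring *r)
{
    r->sent += r->queued;
    r->queued = 0;
    r->sync_calls++;
}

/*
 * Forward one batch. dest[i] is the index of the output ring for packet i.
 * If sync_when_full is nonzero, a full ring is synced mid-batch; otherwise
 * the packet is dropped. After the batch, only rings that were modified
 * ("dirty") are synced, so untouched interfaces cost no sync call.
 */
static void toy_forward_batch(struct toy_tx_ring *rings, size_t nrings,
                              const size_t *dest, size_t batch,
                              int sync_when_full)
{
    int dirty[TOY_MAX_RINGS] = {0};
    size_t i;

    assert(nrings <= TOY_MAX_RINGS);

    for (i = 0; i < batch; i++) {
        struct toy_tx_ring *r;

        assert(dest[i] < nrings);
        r = &rings[dest[i]];

        if (toy_ring_space(r) == 0 && sync_when_full)
            toy_txsync(r);          /* heuristic: flush to make room */
        if (toy_ring_space(r) == 0) {
            r->dropped++;           /* still no room: drop the packet */
            continue;
        }
        r->queued++;
        dirty[dest[i]] = 1;
    }

    for (i = 0; i < nrings; i++)
        if (dirty[i])
            toy_txsync(&rings[i]);  /* one sync per modified ring */
}
```

In this model, sending six packets to a 4-slot ring with sync_when_full set to 0 costs one sync and drops two packets, while setting it to 1 costs two syncs and drops none. Whether the extra sync is worth it on real hardware depends on the machine, NIC and workload, as noted above.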