From: Vincenzo Maffione <v.maffione@gmail.com>
Date: Mon, 1 Jan 2018 22:05:53 +0100
Subject: Re: Linux netmap memory allocation
To: Charlie Smurthwaite
Cc: "freebsd-net@freebsd.org"
List-Id: Networking and TCP/IP with FreeBSD

2018-01-01 17:14 GMT+01:00 Charlie Smurthwaite:

> Hi,
>
> Thank you for your reply. I was able to resolve this.
>
> 1) I do indeed open one FD per NIC
> 2) I no longer specify nr_arg1, nr_arg2 nor nr_arg3.
>    Instead I just verify that all NICs return with identical nr_arg2 so
>    that the memory is shared between them.
> 3) I properly initialized my memory, my failure to do so was causing me a
>    lot of confusion,
>
> The resulting memory space is large enough for all the NICs, and
> everything works perfectly with zero-copy forwarding, great!
>
> The only thing I am still having trouble with is the ability to
> simultaneously trigger a TX and an RX sync on all NICs. I have tried
> select, poll, and epoll, and in all cases, RX rings are updated but TX
> rings are not and TX packets are not pushed out (this occurs using both
> native and emulated netmap modes). I notice the documentation says "Note
> that on epoll and kqueue, NETMAP_NO_TX_POLL and NETMAP_DO_RX_POLL only have
> an effect when some event is posted for the file descriptor.", but the
> behaviour seems the same on poll and select as well as epoll, perhaps this
> is a linux-specific implementation detail?
>
> I have also found that all of these mechanisms seem to incur a very high
> cost in terms of CPU time (making them no more efficient than busy waiting
> at 1Mpps+). My current approach is as follows, but I feel like there should
> be a better option:
>
> for(int n=0; n<...; n++) {
>     // usleep(10); // More CPU time seems to be saved with a careful
>     //             // sleep than with select/poll/epoll
>     ioctl(fds[n], NIOCTXSYNC);
>     ioctl(fds[n], NIOCRXSYNC);
>     rxring = rxrings[n];
>     while (!nm_ring_empty(rxring)) {
>         // Forward any packets waiting in this NIC's RX ring to the
>         // appropriate TX ring
>     }
> }
>

If you are using poll() or select() you should not use ioctl(NIOC*XSYNC), as
the txsync/rxsync operations are automatically performed within the
poll()/select() syscall (at least assuming you did not specify
NETMAP_NO_TX_POLL).
Also, whether netmap calls or does not call txsync/rxsync on certain rings
depends on the parameters passed to nm_open().
Make sure you check for nm_ring_space(txring) when forwarding.
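
To illustrate the last point, here is a minimal sketch of a forwarding step that never moves more packets than the TX ring has room for. It is only a model: toy_slot, toy_ring, toy_ring_space(), toy_ring_next() and toy_forward() are hypothetical stand-ins written for this example, not part of the netmap API. The space calculation is meant to mirror what nm_ring_space() computes (tail minus cur, wrapped to the ring size), and the buffer-index swap stands in for zero-copy forwarding.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SLOTS 8

/* Hypothetical stand-in for a netmap slot (NOT the real struct). */
struct toy_slot {
    uint32_t buf_idx;   /* index of the packet buffer owned by this slot */
    uint16_t len;       /* packet length in bytes */
};

/* Hypothetical stand-in for a netmap ring (NOT the real struct). */
struct toy_ring {
    uint32_t num_slots; /* ring size */
    uint32_t cur;       /* next slot the application will use */
    uint32_t tail;      /* first slot the application must not touch */
    struct toy_slot slot[TOY_SLOTS];
};

/* Slots usable by the application: packets waiting on an RX ring,
 * free slots on a TX ring. */
static uint32_t toy_ring_space(const struct toy_ring *r)
{
    int ret = (int)r->tail - (int)r->cur;
    if (ret < 0)
        ret += (int)r->num_slots;
    return (uint32_t)ret;
}

/* Index following i, wrapping at the end of the ring. */
static uint32_t toy_ring_next(const struct toy_ring *r, uint32_t i)
{
    return (i + 1 == r->num_slots) ? 0 : i + 1;
}

/* Forward packets from rx to tx by swapping buffer indices, so no
 * payload is copied. The count is clamped to the free TX space.
 * Returns the number of packets forwarded. */
static uint32_t toy_forward(struct toy_ring *rx, struct toy_ring *tx)
{
    assert(rx->num_slots > 0 && tx->num_slots > 0);

    uint32_t n = toy_ring_space(rx);     /* packets waiting on RX */
    uint32_t room = toy_ring_space(tx);  /* free slots on TX */
    if (n > room)
        n = room;                        /* never overrun the TX ring */

    for (uint32_t k = 0; k < n; k++) {
        struct toy_slot *rs = &rx->slot[rx->cur];
        struct toy_slot *ts = &tx->slot[tx->cur];
        uint32_t tmp = ts->buf_idx;      /* swap buffers between slots */

        ts->buf_idx = rs->buf_idx;
        rs->buf_idx = tmp;
        ts->len = rs->len;
        rx->cur = toy_ring_next(rx, rx->cur);
        tx->cur = toy_ring_next(tx, tx->cur);
    }
    return n;
}
```

Packets that do not fit stay in the RX ring and can be forwarded after the next sync frees TX slots. With real netmap rings the same pattern would, as far as I know, also need NS_BUF_CHANGED set on both slots and head advanced together with cur, as the bridge example shipped with netmap does.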

Cheers,
  Vincenzo

> Thanks again,
>
> Charlie
>
>
> On 01/01/18 15:40, Vincenzo Maffione wrote:
>
> Hi,
>   If you have 32 NICs you should open 32 netmap file descriptors (and you
> should not specify 64 in nr_arg1 or 256 in nr_arg3, this is for different
> use cases). Also, since you want to do zero-copy you must not specify a
> separate memory area (nr_arg2), but use the same one.
> You may want to use the high level API nm_open()
> https://github.com/luigirizzo/netmap/blob/master/sys/net/netmap_user.h#L307
>
> You may also want to look at the netmap tutorial to get a better idea of
> how the API works (https://github.com/vmaffione/netmap-tutorial).
>
> Cheers,
>   Vincenzo
>
> 2017-12-28 18:34 GMT+01:00 Charlie Smurthwaite:
>
>> Hi,
>>
>> I'm just starting to use netmap and it is my intention to do zero-copy
>> forwarding of frames between a large number of NICs. I am using Intel
>> i350 (igb) on Linux. I therefore require a large memory area for rings
>> and buffers.
>>
>> My calculation:
>> 32 NICs * 2 rings (TX+RX) * 256 frames * 2048 bytes = 32MB
>>
>> I am currently having a problem (or perhaps just a misunderstanding)
>> regarding allocation of this memory.
>> I am attempting to use the following code:
>>
>> void thread_main(int thread_id) {
>>     struct nmreq req;   // A struct for the netmap request
>>     int fd;             // File descriptor for netmap socket
>>     void * mem;         // Pointer to allocated memory area
>>
>>     fd = open("/dev/netmap", 0);       // Open a generic netmap socket
>>     strcpy(req.nr_name, "enp8s0f0");   // Copy NIC name into request
>>     req.nr_version = NETMAP_API;       // Set version number
>>     req.nr_flags = NR_REG_ONE_NIC;     // We will be using a single hw ring
>>
>>     // Select ring 0, disable TX on poll
>>     req.nr_ringid = NETMAP_NO_TX_POLL | NETMAP_HW_RING | 0;
>>
>>     // Ask for 64 additional rings to be allocated (32 * (TX+RX))
>>     req.nr_arg1 = 64;
>>
>>     // Allocate a separate memory area for each thread
>>     req.nr_arg2 = 10 + thread_id;
>>
>>     // Ask for additional buffers (256 per ring)
>>     req.nr_arg3 = 64*256;
>>
>>     // Initialize port
>>     ioctl(fd, NIOCREGIF, &req);
>>
>>     // Check the allocated memory size
>>     printf("memsize: %u\n", req.nr_memsize);
>>     // Check the allocated memory area
>>     printf("nr_arg2: %u\n", req.nr_arg2);
>> }
>>
>> The output is as follows:
>>
>> memsize: 4206859
>> nr_arg2: 10
>>
>> This is far short of the amount of memory I am hoping to be allocated.
>> Am I doing something wrong, or is this simply an indication that the
>> driver is unwilling to allocate more than 4MB?
>>
>> A secondary (related) problem is that if I don't set arg1,arg2,arg3 in
>> my code (ie they will be zero), then I get varying output (it varies
>> between each of the following):
>>
>> memsize: 4206843
>> nr_arg2: 0
>>
>> memsize: 343019520
>> nr_arg2: 1
>>
>> Any pointers would be appreciated. Thanks!
>>
>> Charlie
>>
>>
>> Charlie Smurthwaite
>> Technical Director
>>
>> tel. email. charlie@atech.media web. https://atech.media
>>
>> This e-mail has been sent by aTech Media Limited (or one of its
>> associated group companies, Dial 9 Communications Limited or Viaduct
>> Hosting Limited).
>> Its contents are confidential therefore if you have received this
>> message in error, we would appreciate it if you could let us know and
>> delete the message. aTech Media Limited is a UK limited company,
>> registration number 5523199. Dial 9 Communications Limited is a UK limited
>> company, registration number 7740921. Viaduct Hosting Limited is a UK
>> limited company, registration number 8514362. All companies are registered
>> at Unit 9 Winchester Place, North Street, Poole, Dorset, BH15 1NX.
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>
> --
> Vincenzo Maffione
>

-- 
Vincenzo Maffione