From: Vincenzo Maffione
Date: Mon, 27 Nov 2017 09:33:06 +0100
Subject: Re: swaping ring slots between NIC ring and Host ring does not always success
To: Xiaoye Sun
Cc: Luigi Rizzo, "freebsd-net@freebsd.org", Victor Detoni, Pavel Odintsov, Giuseppe Lettieri

Hi,

If you think it's a bug, can you please open an issue on GitHub
(https://github.com/luigirizzo/netmap/issues)?

2017-11-24 22:11 GMT+01:00 Xiaoye Sun:

> Hi Vincenzo,
>
> Let me clarify my problem. (Please ignore the previous incomplete email.)
>
> I have a program, which is an extension of bridge.c
> https://github.com/luigirizzo/netmap/blob/master/apps/bridge/bridge.c
> The only difference is that my program also generates customized packets
> and sends them to the NIC directly.
> These customized packets have increasing sequence numbers.
> So, this program not only sends these customized packets but also forwards
> packets between the NIC and the host stack using zerocopy.
> The program uses only one NIC queue and there is only one thread.
>
> I think the problem is that there is a chance that netmap does not update
> the pointer to the buffer even when NS_BUF_CHANGED is set (buf_idx is
> changed).
>

Can you disable zerocopy in bridge.c to see if the problem goes away? That
would be useful information.

>
> Let's say the NIC TX ring has 4096 slots. The customized packet with
> sequence number 16 is filled in the buffer of slot 2057.
> The customized packets keep filling the slots until the next available
> slot is 2056.
>

Do you mean that your program fills the TX ring slots 2057,2058...2054,2055
with custom packets? This would mean you filled all the available slots,
since one slot is left empty.

> Now the customized packet with sequence number 4111 is filled into slot 2056.
>

You cannot fill slot 2056 if 2055 has not been NIOCTXSYNC'd. Aren't you
using the nm_ring_empty() and nm_ring_space() functions to check for
available space in the TX ring (assuming you update ring->head/ring->cur
before calling those functions)?

Cheers,
  Vincenzo

> Then the netmap program is notified that there is a packet from the host
> stack to be sent to the NIC.
> The netmap program swaps the buf_idx between slot 2057 and the
> corresponding slot in the host RX ring and sets the NS_BUF_CHANGED flag of
> both slots.
> Then the netmap program fills sequence 4112 into slot 2058.
> However, the buffer swap does not seem to succeed, so the original content
> of slot 2057 (sequence 16) is sent out.
> As a result, the receiver sees sequence 16 twice
> (16,17...4110,4111,16,4112,4113).
>
> So I think the root of the problem is that the buffer pointer is not always
> successfully/timely updated even after the NS_BUF_CHANGED flag is set and
> the buf_idx is updated.
>
> Best,
> Xiaoye
>
>
>
> On Wed, Nov 22, 2017 at 7:39 AM, Vincenzo Maffione wrote:
>
>> Hi,
>>
>> 2017-11-21 7:51 GMT+01:00 Xiaoye Sun:
>>
>>> Hi,
>>>
>>> Recently I found another problem with netmap. I think this new problem
>>> could be related to the problems in this thread, so I am just posting
>>> the new problem here.
>>>
>>> In my setup, I have a sender program having a netmap ring (a pair of
>>> RX/TX rings) for the NIC and a ring for the host stack.
>>> The sender program puts customized packets (each packet has a unique
>>> sequence number, and the sender sends the packets in increasing
>>> sequence-number order) into the NIC TX ring directly and also forwards
>>> the packets from the host RX ring to the NIC TX ring using "zerocopy"
>>> by swapping the buffer indices.
>>> However, the receiver sees duplicated customized packets. For example, in
>>> the case where the ring size is 32 (32 slots in a ring), the order of the
>>> sequence numbers the receiver sees is
>>> 1,2,3,4,5,...,68,69,*70*,71,72,73,...,99,100,*70*,101,102,103,... .
>>> An interesting thing I found is that the "gap" between the two
>>> duplicated packets (70 in the example) is always a number very close to
>>> the ring size, 32 in this example. In my experiment, I use a ring with
>>> 4096 slots and the gap is always more than 4090 and close to 4096. I
>>> verified that this duplication happens due to the sender, not the
>>> receiver. Assuming my sender's implementation is correct, this
>>> duplication may happen in netmap and the NIC driver (ixgbe).
>>>
>>
>> Netmap itself doesn't do any duplication, nor does it look at the packets.
>> It just passes ring->cur/ring->head down to the ixgbe driver (after
>> validation).
>> The ixgbe driver datapath is bypassed and replaced with a netmap-enabled
>> datapath (see
>> https://github.com/luigirizzo/netmap/blob/master/LINUX/ixgbe_netmap_linux.h#L294-L461);
>> no duplication should happen there, as each netmap slot (1 TX packet) is
>> used only once.
>>
>>>
>>>
>>> Thinking back to the original problem in this post, I think these
>>> problems may be related. It seems to me that there could be multiple
>>> threads pulling the packets from the NIC TX ring (or the thread moved to
>>> other CPUs when the problem occurs), and these threads may run on
>>> different cores, so that the outdated content in the buffer may be sent
>>> out when new content is written to the buffer.
>>>
>>>
>> There are no such threads pulling from the NIC TX ring. Your application
>> directly puts new packets to be transmitted in the netmap buffers
>> referenced in the netmap TX ring. When you then call NIOCTXSYNC or poll(),
>> all the new TX buffers (i.e. all the ones from the previous value of
>> ring->head (included) to the new value of ring->head (excluded))
>> are moved to the NIC TX ring. This happens in the context of your
>> application thread; no worker threads are used. Then the NIC hardware
>> starts the transmission.
>>
>>
>>> I am wondering if there is a way to pin the NIC driver of the netmap
>>> module to a specific core. Or is there a way to find the root of such a
>>> problem?
>>>
>>
>> The only threads are the ones of your application.
>> Maybe your problem comes from concurrent accesses to the netmap TX ring
>> from different threads? Only one thread at a given time should update a
>> netmap TX/RX ring. Otherwise the behaviour is unspecified.
>>
>> Cheers,
>>   Vincenzo
>>
>>
>>>
>>> Best,
>>> Xiaoye
>>>
>>>
>> --
>> Vincenzo Maffione
>>
>
>

--
Vincenzo Maffione
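
Archive note: the ring arithmetic and the zero-copy swap discussed in this
thread can be sketched in plain C. This is only a toy model under stated
assumptions, not netmap's code: struct toy_ring, struct toy_slot,
TOY_BUF_CHANGED, toy_ring_space(), toy_ring_next(), toy_claim_tx_slot() and
toy_zerocopy_swap() are hypothetical names that stand in for netmap's ring
and slot structures, NS_BUF_CHANGED and nm_ring_space(). The sketch assumes
a TX ring where the application may only fill the slots from cur up to (but
not including) tail, and that tail only advances after a sync.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for netmap's NS_BUF_CHANGED slot flag. */
#define TOY_BUF_CHANGED 0x0001u

/* Toy slot: just the fields the thread talks about. */
struct toy_slot {
    uint32_t buf_idx;   /* index of the packet buffer this slot refers to */
    uint16_t len;       /* length of the packet in that buffer */
    uint16_t flags;     /* TOY_BUF_CHANGED tells the kernel buf_idx changed */
};

/* Toy ring: the application owns the slots in [head, tail). */
struct toy_ring {
    uint32_t num_slots;
    uint32_t head;
    uint32_t cur;
    uint32_t tail;
};

/* Number of slots the application may still fill: (tail - cur) mod num_slots. */
static uint32_t toy_ring_space(const struct toy_ring *r)
{
    assert(r->num_slots > 0 && r->cur < r->num_slots && r->tail < r->num_slots);
    int32_t n = (int32_t)r->tail - (int32_t)r->cur;
    if (n < 0)
        n += (int32_t)r->num_slots;
    return (uint32_t)n;
}

/* Index following i, wrapping at the end of the ring. */
static uint32_t toy_ring_next(const struct toy_ring *r, uint32_t i)
{
    return (i + 1 == r->num_slots) ? 0 : i + 1;
}

/* Claim the next free TX slot, or return -1 when a sync is needed first. */
static int64_t toy_claim_tx_slot(struct toy_ring *r)
{
    if (toy_ring_space(r) == 0)
        return -1;
    uint32_t i = r->cur;
    r->head = r->cur = toy_ring_next(r, i);
    return (int64_t)i;
}

/* Zero-copy forward: exchange the buffers of an RX slot and a TX slot. */
static void toy_zerocopy_swap(struct toy_slot *rx, struct toy_slot *tx)
{
    uint32_t tmp = tx->buf_idx;
    tx->buf_idx = rx->buf_idx;
    rx->buf_idx = tmp;
    tx->len = rx->len;
    /* Both slots now point at a different buffer, so both are flagged. */
    tx->flags |= TOY_BUF_CHANGED;
    rx->flags |= TOY_BUF_CHANGED;
}
```

With num_slots = 4096, cur = 2057 and tail = 2056, toy_ring_space() reports
4095 free slots; after claiming 2057,2058...2054,2055 it reports 0, and
toy_claim_tx_slot() refuses slot 2056 until a sync moves tail forward. That
is the "one slot is left empty" situation described above.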