From: Xiaoye Sun
Date: Wed, 29 Nov 2017 21:55:02 -0600
Subject: Re: swaping ring slots between NIC ring and Host ring does not always success
To: Vincenzo Maffione
Cc: Luigi Rizzo, "freebsd-net@freebsd.org", Victor Detoni, Pavel Odintsov, Giuseppe Lettieri

On Mon, Nov 27, 2017 at 2:33 AM, Vincenzo Maffione wrote:

> Hi,
>
> If you think it's a bug, can you please open an issue on GitHub
> (https://github.com/luigirizzo/netmap/issues)?
>
> 2017-11-24 22:11 GMT+01:00 Xiaoye Sun:
>
>> Hi Vincenzo,
>>
>> Let me clarify my problem. (Please ignore the previous incomplete email.)
>>
>> I have a program, which is an extension of bridge.c
>> https://github.com/luigirizzo/netmap/blob/master/apps/bridge/bridge.c
>> The only difference is that my program also generates customized packets
>> sent to the NIC directly.
>> These customized packets have increasing sequence numbers.
>> So, this program not only sends these customized packets but also
>> forwards packets between the NIC and the host stack using zerocopy.
>> The program only takes one NIC queue and there is only one thread.
>>
>> I think the problem is that there is a chance that netmap does not
>> update the pointer to the buffer even when NS_BUF_CHANGED is set
>> (buf_idx is changed).
>>
>
> Can you disable zerocopy in bridge.c to see if the problem goes away? This
> would be useful information.
>
> Disabling zerocopy makes this problem disappear.
>
>> Let's say the NIC TX ring has 4096 slots. The customized packet with
>> sequence 16 is filled in the buffer of slot 2057.
>> The customized packets keep filling the slots until the next available
>> slot is 2056.
>>
>
> Do you mean that your program fills the TX ring slots
> 2057,2058...2054,2055 with custom packets? This would mean you filled all
> the available slots, since one slot is left empty.
>
>
>> Now the customized packet with sequence 4111 is filled into slot 2056.
>>
>
> You cannot fill slot 2056 if 2055 has not been NIOCTXSYNC'd. Aren't
> you using the nm_ring_empty() and nm_ring_space() functions to check
> for available space in the TX ring (assuming you update
> ring->head/ring->cur before calling those functions)?
>
> The slots are not filled all at once. NIOCTXSYNC is called after at most
> 512 slots have been filled. I always use nm_ring_space() to check the
> number of remaining slots in the ring.

> Cheers,
> Vincenzo
>
>
>> Then the netmap program is notified that there is a packet from the host
>> stack to be sent to the NIC.
>> The netmap program swaps the buf_idx between slot 2057 and the
>> corresponding slot in the host RX ring and sets the NS_BUF_CHANGED flag
>> on both slots.
>> Then the netmap program fills sequence 4112 into slot 2058.
>> However, the buffer swap does not seem to succeed, so the original content
>> of slot 2057 (sequence 16) is sent out.
>> As a result, the receiver sees sequence 16 twice
>> (16,17...4110,4111,16,4112,4113).
>>
>> So I think the root of the problem is that the buffer pointer is not
>> always successfully/timely updated even after the NS_BUF_CHANGED flag is
>> set and the buf_idx is updated.
>>
>> Best,
>> Xiaoye
>>
>>
>>
>> On Wed, Nov 22, 2017 at 7:39 AM, Vincenzo Maffione
>> wrote:
>>
>>> Hi,
>>>
>>> 2017-11-21 7:51 GMT+01:00 Xiaoye Sun:
>>>
>>>> Hi,
>>>>
>>>> Recently I found another problem with netmap. I think this new problem
>>>> could be related to the problems in this thread, so I am posting the
>>>> new problem here.
>>>>
>>>> In my setup, I have a sender program that has a netmap ring (a pair of
>>>> RX/TX rings) for the NIC and a ring for the host stack. The sender
>>>> program puts customized packets (each packet has a unique sequence
>>>> number and the sender sends the packets in increasing sequence-number
>>>> order) into the NIC TX ring directly and also forwards the packets from
>>>> the host RX ring to the NIC TX ring using "zerocopy" by swapping the
>>>> buffer indices.
>>>> However, the receiver sees duplicated customized packets. For example,
>>>> in the case where the ring size is 32 (32 slots in a ring), the order
>>>> of the sequence numbers the receiver sees is
>>>> 1,2,3,4,5,...,68,69,*70*,71,72,73,...,99,100,*70*,101,102,103,... .
>>>> An interesting thing I found is that the "gap" between the two
>>>> duplicated packets (70 in the example) is always a number very close
>>>> to the ring size, 32 in this example. In my experiment, I use a ring
>>>> with 4096 slots and the gap is always more than 4090 and close to
>>>> 4096. I verified that this duplication happens at the sender, not the
>>>> receiver. Assuming my sender's implementation is correct, this
>>>> duplication may happen in netmap and the NIC driver (ixgbe).
>>>>
>>>
>>> Netmap itself doesn't do any duplication, nor does it look at the
>>> packets. It just passes ring->cur/ring->head down to the ixgbe driver
>>> (after validation).
>>> The ixgbe driver datapath is bypassed and replaced with a netmap-enabled
>>> datapath (see
>>> https://github.com/luigirizzo/netmap/blob/master/LINUX/ixgbe_netmap_linux.h#L294-L461);
>>> no duplication should happen there, as each netmap slot (1 TX packet)
>>> is used only once.
>>>
>>>>
>>>>
>>>> Thinking back to the original problem in this post, I think these
>>>> problems may be related. It seems to me that there could be multiple
>>>> threads pulling the packets from the NIC TX ring (or the thread moved
>>>> to another CPU when the problem occurs), and these threads may run on
>>>> different cores, so that the outdated content in the buffer may be
>>>> sent out when new content is written to the buffer.
>>>>
>>>>
>>> There are no such threads pulling from the NIC TX ring. Your application
>>> directly puts new packets to be transmitted in the netmap buffers
>>> referenced in the netmap TX ring. When you then call NIOCTXSYNC or
>>> poll(), all the new TX buffers (i.e. all the ones from the previous
>>> value of ring->head (included) to the new value of ring->head
>>> (excluded)) are moved to the NIC TX ring. This happens in the context
>>> of your application thread; no worker threads are used. Then the NIC
>>> hardware starts the transmission.
>>>
>>>
>>>> I am wondering if there is a way to pin the NIC driver of the netmap
>>>> module to a specific core. Or is there a way to find the root of such
>>>> a problem?
>>>>
>>>
>>> The only threads are the ones of your application.
>>> Maybe your problem comes from concurrent accesses to the netmap TX ring
>>> from different threads? Only one thread at a given time should update a
>>> netmap TX/RX ring. Otherwise the behaviour is unspecified.
>>>
>>> Cheers,
>>> Vincenzo
>>>
>>>
>>>>
>>>> Best,
>>>> Xiaoye
>>>>
>>>>
>>> --
>>> Vincenzo Maffione
>>>
>>
>>
>
>
> --
> Vincenzo Maffione
>
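
For reference, the zerocopy swap and the free-slot arithmetic discussed in this thread can be modeled outside netmap roughly as follows. This is only a sketch under stated assumptions: the `model_*` names, the small buffer pool, and the flag value are hypothetical stand-ins for netmap's slot structure, buffer lookup, and NS_BUF_CHANGED, not the real API. It shows the swap order used when forwarding a host RX slot to a NIC TX slot, and why a 4096-slot ring with one slot left empty offers 4095 usable slots between 2057 and 2056.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the real definitions live in netmap's headers. */
#define MODEL_BUF_CHANGED 0x0001u   /* plays the role of NS_BUF_CHANGED */
#define MODEL_NUM_BUFS    8
#define MODEL_BUF_SIZE    64

struct model_slot {
    uint32_t buf_idx;   /* index of the buffer this slot currently references */
    uint16_t len;       /* packet length in bytes */
    uint16_t flags;     /* MODEL_BUF_CHANGED when buf_idx was modified */
};

/* A tiny shared buffer pool, indexed by buf_idx. */
static char model_pool[MODEL_NUM_BUFS][MODEL_BUF_SIZE];

/* Resolve the buffer a slot references right now. The result must not be
 * cached across a swap, because the slot's buf_idx changes. */
static char *model_buf(const struct model_slot *s)
{
    return model_pool[s->buf_idx];
}

/* Zerocopy forward: the TX slot takes over the RX slot's buffer and length,
 * the RX slot receives the TX slot's old buffer, and both slots are flagged
 * so that the next sync can notice the changed buffer index. */
static void model_zerocopy_swap(struct model_slot *rs, struct model_slot *ts)
{
    uint32_t tmp = ts->buf_idx;

    ts->len = rs->len;
    ts->buf_idx = rs->buf_idx;
    rs->buf_idx = tmp;
    ts->flags |= MODEL_BUF_CHANGED;
    rs->flags |= MODEL_BUF_CHANGED;
}

/* Number of slots available from cur up to (not including) tail,
 * accounting for wraparound at num_slots. */
static uint32_t model_ring_space(uint32_t cur, uint32_t tail, uint32_t num_slots)
{
    int32_t n = (int32_t)tail - (int32_t)cur;

    if (n < 0)
        n += (int32_t)num_slots;
    return (uint32_t)n;
}
```

In this model, any code that writes a custom packet into a TX slot has to call model_buf() on the slot after the swap; writing through a pointer obtained before the swap would touch the buffer that now belongs to the RX slot. Whether something similar explains the duplicate reported above is not established in this thread.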