From: Xiaoye Sun
Date: Fri, 24 Nov 2017 14:49:57 -0600
Subject: Re: swaping ring slots between NIC ring and Host ring does not always success
To: Vincenzo Maffione
Cc: Luigi Rizzo, "freebsd-net@freebsd.org", Victor Detoni, Pavel Odintsov

Hi Vincenzo,

Thanks for your reply. Let me clarify my problem.
I have a program, which is an extension of bridge.c
https://github.com/luigirizzo/netmap/blob/788f25dcc48dfec2e481573277b662968f690042/LINUX/ixgbe_netmap_linux.h#L377

On Wed, Nov 22, 2017 at 7:39 AM, Vincenzo Maffione wrote:

> Hi,
>
> 2017-11-21 7:51 GMT+01:00 Xiaoye Sun:
>
>> Hi,
>>
>> Recently I found another problem with netmap. I think this new problem
>> could be related to the problems in this thread, so I am posting the new
>> problem here.
>>
>> In my setup, I have a sender program that has a netmap ring (a pair of
>> RX/TX rings) for the NIC and a ring for the host stack.
>> The sender program puts customized packets (each packet has a unique
>> sequence number, and the sender sends the packets in increasing
>> sequence-number order) on the NIC TX ring directly, and also forwards
>> the packets from the host RX ring to the NIC TX ring using "zerocopy"
>> by swapping the buffer indices.
>> However, the receiver sees duplicated customized packets. For example,
>> in the case where the ring size is 32 (32 slots in a ring), the order of
>> the sequence numbers the receiver sees is
>> 1,2,3,4,5,...,68,69,*70*,71,72,73,...,99,100,*70*,101,102,103,... .
>> An interesting thing I found is that the "gap" between the two
>> duplicated packets (70 in the example) is always a number very close to
>> the ring size, 32 in this example. In my experiment, I use a ring with
>> 4096 slots and the gap is always more than 4090 and close to 4096. I
>> verified that this duplication happens due to the sender, not the
>> receiver. Assuming my sender's implementation is correct, this
>> duplication may happen in netmap and the NIC driver (ixgbe).
>>
>
> Netmap itself doesn't do any duplication, nor does it look at the
> packets. It just passes ring->cur/ring->head down to the ixgbe driver
> (after validation).
> The ixgbe driver datapath is bypassed and replaced with a netmap-enabled
> datapath (see
> https://github.com/luigirizzo/netmap/blob/master/LINUX/ixgbe_netmap_linux.h#L294-L461);
> no duplication should happen there, as each netmap slot (1 TX packet) is
> used only once.
>
>>
>> Thinking back to the original problem in this post, I think these
>> problems may be related. It seems to me that there could be multiple
>> threads pulling the packets from the NIC TX ring (or the thread moved to
>> other CPUs when the problem occurs) and these threads may run on
>> different cores, so that the outdated content in the buffer may be sent
>> out when new content is written to the buffer.
>>
>
> There are no such threads pulling from the NIC TX ring. Your application
> directly puts new packets to be transmitted in the netmap buffers
> referenced in the netmap TX ring. When you then call NIOCTXSYNC or
> poll(), all the new TX buffers (that is, all the ones from the previous
> value of ring->head (included) to the new value of ring->head (excluded))
> are moved to the NIC TX ring. This happens in the context of your
> application thread; no worker threads are used. Then the NIC hardware
> starts the transmission.
>
>> I am wondering if there is a way to pin the NIC driver of the netmap
>> module to a specific core. Or is there a way to find the root of such a
>> problem?
>
> The only threads are the ones of your application.
> Maybe your problem comes from concurrent accesses to the netmap TX ring
> from different threads? Only one thread at a given time should update a
> netmap TX/RX ring. Otherwise the behaviour is unspecified.
>
> Cheers,
> Vincenzo
>
>>
>> Best,
>> Xiaoye
>>
>
> --
> Vincenzo Maffione
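
For reference, the two mechanisms discussed in this thread (the "zerocopy" swap of buffer indices, and the range of slots from the previous ring->head up to but excluding the new ring->head that a TX sync hands to the NIC) can be sketched in plain C. This is only a minimal model under stated assumptions, not netmap code: struct model_slot, MODEL_BUF_CHANGED, model_swap_slots(), model_ring_next() and model_slots_to_sync() are hypothetical names invented for illustration. They stand in for the real struct netmap_slot and the NS_BUF_CHANGED flag, which the bridge.c example sets on both slots after swapping buf_idx.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a netmap slot. The real struct netmap_slot
 * is defined by netmap and has more fields than this model. */
struct model_slot {
    uint32_t buf_idx;  /* index of the packet buffer this slot refers to */
    uint16_t len;      /* length of the packet in that buffer, in bytes */
    uint16_t flags;    /* per-slot flags */
};

/* Model flag playing the role of netmap's NS_BUF_CHANGED: it marks a slot
 * whose buffer index was replaced by the application. */
#define MODEL_BUF_CHANGED 0x0001u

/* Zero-copy forward: exchange the buffers of an RX slot and a TX slot.
 * The TX slot takes over the received packet; the RX slot gets the TX
 * slot's old buffer back, so no buffer is referenced by two slots. */
static void model_swap_slots(struct model_slot *rx, struct model_slot *tx)
{
    uint32_t tmp = tx->buf_idx;

    tx->buf_idx = rx->buf_idx;
    rx->buf_idx = tmp;
    tx->len = rx->len;
    /* Both slots now point at a different buffer than before. */
    tx->flags |= MODEL_BUF_CHANGED;
    rx->flags |= MODEL_BUF_CHANGED;
}

/* Index of the slot following i, wrapping at the end of the ring. */
static uint32_t model_ring_next(uint32_t i, uint32_t num_slots)
{
    return (i + 1 == num_slots) ? 0 : i + 1;
}

/* Count the slots in [old_head, new_head), i.e. the ones a TX sync would
 * pass to the NIC: old_head is included, new_head is excluded. */
static uint32_t model_slots_to_sync(uint32_t old_head, uint32_t new_head,
                                    uint32_t num_slots)
{
    uint32_t n = 0;

    assert(num_slots > 0 && old_head < num_slots && new_head < num_slots);
    for (uint32_t i = old_head; i != new_head;
         i = model_ring_next(i, num_slots))
        n++;
    return n;
}
```

In this model, advancing head from 30 to 2 on a 32-slot ring covers slots 30, 31, 0 and 1. The swap keeps every buffer index owned by exactly one slot, which is the invariant behind "each netmap slot is used only once" in the reply above. Only one thread at a time should run this kind of update on a given ring.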