Date:      Thu, 4 Feb 2016 18:04:13 -0600
From:      Xiaoye Sun <Xiaoye.Sun@rice.edu>
To:        Victor Detoni <victordetoni@gmail.com>
Cc:        Luigi Rizzo <rizzo@iet.unipi.it>, Pavel Odintsov <pavel.odintsov@gmail.com>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject:   Re: swaping ring slots between NIC ring and Host ring does not always success
Message-ID:  <CAJnByzgjEEAzmWZu7BsSWHXmpjUtZcqXFGN8umCqmvgME1Jv%2BA@mail.gmail.com>
In-Reply-To: <CANpwN=uHk-VwOoFz7NaPE9A-0B=MAapqxJ-uyCBtn=oMdacYnw@mail.gmail.com>
References:  <CAJnByzj6Dj3vouZ2NbxqvCV-2-7TVtTR4FaWKuCFaaRN2X%2ByAA@mail.gmail.com> <CALgsdbd3XuE3wMYp4ey%2B1aer%2BHSVNojLYoVqwqTBPAXXdf9i%2BQ@mail.gmail.com> <CAJnByzirLXdCe-kwHV2s_E6ytGJG0Dth=0Ms12RrEk7FK_%2B8Og@mail.gmail.com> <CA%2BhQ2%2BgMWY0eabjHGw0=PJCAkS-wO=RBrN5brSbaqWc3_AOYoQ@mail.gmail.com> <CAJnByziBS8o6LtmpUrUu5xtRUd008Z2hnCsp=WVFv35r2J0rHw@mail.gmail.com> <CA%2BhQ2%2Bim9nFfYnqDS2HgRbAzdf5D0iaLCmCYhfXQVVRMouUFuw@mail.gmail.com> <CAJnByzht-qfDcm8oEg1aSRyVBZ1ygPvc2eMuoyJcq4geueTZ0Q@mail.gmail.com> <CA%2BhQ2%2BiERgWJ=cdFB-cByfT3r11T1kKr-5HiuCYZY-rxbjf=XA@mail.gmail.com> <CAJnByziDzdR2C6DcSRNPtrWACLq0XFpe4X1Ek9yXtFP9ivqWQw@mail.gmail.com> <CA%2BhQ2%2BhjnuGo1xKgc8CQ7gP35tiaZG7%2BroZBmX8aBgb8qWnLgg@mail.gmail.com> <CAJnByzh-VrRZeYdpkRFtCUGEN_arFBkemcN7byb51XV6UPswyg@mail.gmail.com> <CA%2BhQ2%2BiMw3kxjpcZy77vgOEsfk2UY0-farh9C8RKXZHMU7D8kw@mail.gmail.com> <CAJnByzgsuNBhdfPJsGrrHcU79xjK%2Bdq2RENgUkbZcehFm8MUxg@mail.gmail.com> <CAJnByzgNZ9YsYd7tBgYxiQPvuS_VZbhZNGvsPS-0apCDga7XFA@mail.gmail.com> <CANpwN=uHk-VwOoFz7NaPE9A-0B=MAapqxJ-uyCBtn=oMdacYnw@mail.gmail.com>

Yes, all the interfaces are up. Are you able to receive ARP requests when
the interfaces are down?

On Thursday, February 4, 2016, Victor Detoni <victordetoni@gmail.com> wrote:

> Are both interfaces up? Like ifconfig... up
>
> I had the same problem and I solved it with the command above.
>
> On Thursday, February 4, 2016, Xiaoye Sun <Xiaoye.Sun@rice.edu> wrote:
>
>> Hi Luigi,
>>
>> Thanks for your explanation.
>>
>> I used three machines to do this experiment. They are directly connected.
>>
>> [(machine1) eth1]---[eth2 (machine2) eth3]---[eth4 (machine3)].
>>
>> First, I tried to run bridge.c on machine2 using the command *bridge -i
>> netmap:eth2 -i netmap:eth3*. (Neither the sender, the receiver, nor XYZ
>> was running on machine1 or machine3.)
>>
>> As I understand it, in this setup machine2 is transparent to machine1 and
>> machine3, since it forwards packets from its eth2 to eth3 and vice versa
>> without modifying them.
>>
>> I tried to ping machine3 from machine1 using a command like *ping
>> 10.11.10.3*. However, it still does not succeed.
>> This is because before machine1 sends the ping message to machine3, it
>> first sends an ARP request to get the MAC address of machine3.
>> machine3 gets that ARP request and sends the reply back (I used tcpdump to
>> verify that machine3 gets the ARP request and sends out the ARP reply).
>> However, machine1 does not get the ARP reply.
>>
>> I checked, and the bridge forwards packets in only one direction at a
>> time. It gets the ARP request but doesn't see the ARP reply
>> (*pkt_queued* always returns 0 for one NIC...).
>>
>> This behavior looks very strange to me. Do you think there is a
>> compatibility issue between netmap and the OS I am using? Is there a
>> verified Linux distribution (and version) that is known to work well
>> with netmap?
>>
>> The OS I use is 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1 (2015-05-24)
>> x86_64 GNU/Linux.
>> Linux kernel version is *3.16.0-4-amd64*
>>
>>
>> Thanks!
>> Xiaoye
>>
>>
>>
>>
>>
>>
>> On Wed, Feb 3, 2016 at 2:12 AM, Luigi Rizzo <rizzo@iet.unipi.it> wrote:
>>
>> > On Tue, Feb 2, 2016 at 10:48 PM, Xiaoye Sun <Xiaoye.Sun@rice.edu> wrote:
>> > >
>> > >
>> > > On Mon, Feb 1, 2016 at 11:34 PM, Luigi Rizzo <rizzo@iet.unipi.it> wrote:
>> > >>
>> > >> On Tue, Feb 2, 2016 at 6:23 AM, Xiaoye Sun <Xiaoye.Sun@rice.edu> wrote:
>> > >> > Hi Luigi,
>> > >> >
>> > >> > I have to clarify the *jumping issue* with the slot indexes.
>> > >> > In the bridge.c program, the slot index never jumps; it increases
>> > >> > sequentially.
>> > >> > In the receiver.c program, the UDP packet seq jumps, and I showed the
>> > >> > slot index that each UDP packet uses. So the slot index jumps together
>> > >> > with the UDP seq (at the receiver program only).
>> > >>
>> > >> So let me understand, is the "slot" some information written
>> > >> in the packet by bridge.c (referring to the rx or tx slot,
>> > >> I am not sure) and then read and printed by receiver.c
>> > >> (which gets the packet through recvfrom so there isn't
>> > >> really any slot index) ?
>> > >>
>> > > It works the other way around:
>> > > bridge.c checks the seq numbers of the UDP packets in the netmap slots
>> > > (in the NIC rx ring) before the swap; then it records the seq number,
>> > > the slot number (both rx and tx; the tx indexes were not shown in the
>> > > previous email since they all look correct) and the buf_idx (rx and tx).
>> > > bridge.c does not change anything in the buffer, and it knows the slot
>> > > and buf_idx that a packet uses. Please refer to the added code in the
>> > > *process_rings* function:
>> > > http://www.owlnet.rice.edu/~xs6/bridge.c
>> > > receiver.c checks only the seq numbers and prints them out in the order
>> > > it receives them.
>> > > With this information, I manually matched the seq numbers I got from
>> > > receiver.c with the seq numbers I got from bridge.c. So we know the seq
>> > > order the receiver sees and which slot a packet uses when bridge.c swaps
>> > > the buf_idxs.
>> > >
>> > >> Do you see any ordering inversion when the receiver
>> > >> gets packets through the NETMAP API (e.g. using bridge.c
>> > >> instead of receiver.c) ?
>> > >>
>> > > There is no ordering inversion seen by bridge.c. As I said in the
>> > > previous paragraph, bridge.c checks the seq number, and I did not see
>> > > any order inversion in THIS simple experiment. (In my multicast
>> > > protocol, mentioned in the first email, there is ordering inversion.
>> > > But let us solve the simple bridge.c problem first. I think they are
>> > > two relatively independent issues.)
>> >
>> > Sorry there was a misunderstanding.
>> > I wanted you to check the following setup:
>> >
>> > [1: send.c] ->- [2: bridge.c] ->- [3: XYZ]
>> >
>> > where in XYZ you replace your receiver.c with some
>> > netmap-based receiver (it could be pkt-gen in rx mode,
>> > or possibly even another instance of bridge.c where
>> > you connect the output port to a vale switch so
>> > traffic is dropped), and then in XYZ print the content
>> > of the packets.
>> >
>> > From your previous report we know that node 2: sees packets
>> > in order, and node 3: sees packets out of order.
>> > However, if the problem were due to bridge.c sending
>> > the old buffer and not the new one, you'd see not only
>> > reordering but also replication of packets.
>> >
>> > The fact that you see only the reordering in 3: makes
>> > me think that the problem is in that node, and it could
>> > be the network stack in 3: that does something strange.
>> > So if you can run something netmap based in 3: and make
>> > sure there is only one queue to read from, we could
>> > at least figure out what is going on.
>> >
>> > cheers
>> > luigi
>> >
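For readers following the thread, here is a minimal standalone sketch of the zero-copy step being discussed: the bridge exchanges the buffer indexes of an rx slot and a tx slot and marks both slots as changed. The struct, the function, and MOCK_BUF_CHANGED are hypothetical stand-ins (the real flag in netmap is NS_BUF_CHANGED); this is not the code from bridge.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for netmap's NS_BUF_CHANGED slot flag. */
#define MOCK_BUF_CHANGED 0x1

/* Hypothetical, simplified slot: only the fields discussed in the thread. */
struct mock_slot {
    uint32_t buf_idx;   /* index of the packet buffer owned by this slot */
    uint16_t len;       /* length of the packet in that buffer */
    uint16_t flags;     /* per-slot flags */
};

/*
 * Move a packet from rx slot rs to tx slot ts without copying payload:
 * the two slots exchange buffers, the tx slot takes the packet length,
 * and both are flagged so the owner of each ring reloads the buffer.
 */
static void
mock_swap_slots(struct mock_slot *rs, struct mock_slot *ts)
{
    assert(rs != NULL && ts != NULL);

    uint32_t tmp = ts->buf_idx;
    ts->buf_idx = rs->buf_idx;
    rs->buf_idx = tmp;
    ts->len = rs->len;
    ts->flags |= MOCK_BUF_CHANGED;
    rs->flags |= MOCK_BUF_CHANGED;
}
```

If the tx side kept using its old buffer after such a swap, the stale contents would go out again, which is why a swap problem would be expected to show duplicated packets and not only reordering.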
>> >
>> > is that
>> > >
>> > >>
>> > >> Are you using native netmap drivers or the emulated mode ?
>> > >> You can check that by playing with the "admode" sysctl entry
>> > >> (or sysfs on linux) - try setting to 1 and 2 and see if
>> > >> the behaviour changes.
>> > >>
>> > >>      dev.netmap.admode: 0
>> > >>              Controls the use of native or emulated adapter mode.
>> > >>              0 uses the best available option,
>> > >>              1 forces native and fails if not available,
>> > >>              2 forces emulated hence never fails.
>> > >>
>> > > I was using admode 0. I changed the admode to 1 and 2 using a command
>> > > like *echo 1 > /sys/module/netmap/parameters/admode* and restarted the
>> > > bridge program. The behavior stays the same.
>> > >
>> > >>
>> > >> cheers
>> > >> luigi
>> > >>
>> > >> >
>> > >> > There is really one ring (tx and rx) for the NIC and one ring (tx and
>> > >> > rx) for the host.
>> > >> > I also suspected that there might be multiple tx rings for the host.
>> > >> > It would look as if the bridge program swaps packets to multiple host
>> > >> > rings and the UDP recv program drains packets from these rings. But
>> > >> > this is not the case here.
>> > >> >
>> > >> > The bridge program prints a line like this:
>> > >> > *515.277263 main [277] Ready to go, eth3 0x1/1 <-> eth3 0x0/1.*
>> > >> > This is printed by the following line in the original program:
>> > >> > *D("Ready to go, %s 0x%x/%d <-> %s 0x%x/%d.", pa->req.nr_name,
>> > >> > pa->first_rx_ring, pa->req.nr_rx_rings, pb->req.nr_name,
>> > >> > pb->first_rx_ring, pb->req.nr_rx_rings);*
>> > >> >
>> > >> > I think this shows that there is really one NIC ring and one HOST ring.
>> > >> >
>> > >> > Is there another way to verify the number of rings that netmap has?
>> > >> >
>> > >> > Thanks!
>> > >> > Xiaoye
>> > >> >
>> > >> > On Mon, Feb 1, 2016 at 10:48 PM, Luigi Rizzo <rizzo@iet.unipi.it> wrote:
>> > >> >>
>> > >> >> Hi,
>> > >> >> there must be something wrong with your setup, because
>> > >> >> slot indexes must be sequential and in your case they
>> > >> >> are not (see the jump from 295 to 474 and then
>> > >> >> back from 485 to 296, and the numerous interleavings
>> > >> >> that you are seeing later).
>> > >> >>
>> > >> >> I have no idea of the cause but typically this pattern
>> > >> >> is what you see when there are multiple input rings and
>> > >> >> not just one.
>> > >> >>
>> > >> >> Cheers
>> > >> >> Luigi
>> > >> >>
>> > >> >>
>> > >> >>
>> > >> >>
>> > >> >> On Tue, Feb 2, 2016 at 12:24 AM, Xiaoye Sun <Xiaoye.Sun@rice.edu> wrote:
>> > >> >> > Hi Luigi,
>> > >> >> >
>> > >> >> > Thanks for the detailed advice.
>> > >> >> >
>> > >> >> > With more detailed experiments, I actually found that the UDP
>> > >> >> > sender/receiver packet reorder issue *might* be unrelated to the
>> > >> >> > original issue I posted. However, I think we should solve the UDP
>> > >> >> > sender/receiver issue first.
>> > >> >> > I ran the experiment with a more detailed log. Here are my findings.
>> > >> >> >
>> > >> >> > 1. I am running a netmap version available since about Oct 13 from
>> > >> >> > github (https://github.com/luigirizzo/netmap). So I think this is
>> > >> >> > not the one related to the buffer allocation issue. I tried running
>> > >> >> > the newest version; however, that version causes a problem when I
>> > >> >> > exit the bridge program (something like a kernel error which makes
>> > >> >> > the OS crash).
>> > >> >> >
>> > >> >> > 2 & 3. I changed receiver.c and bridge.c so that I can get more
>> > >> >> > information (a more detailed log).
>> > >> >> > The reorder happens multiple times (about 10 times) within a second.
>> > >> >> > Here is one example trace collected from the above two programs
>> > >> >> > (remember that we have the UDP sender running on one machine; the
>> > >> >> > netmap bridge and the UDP receiver are running on another machine).
>> > >> >> > There is only one pair of rings, each with 512 slots (511 slots
>> > >> >> > usable), on the receiver machine.
>> > >> >> >
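A side note on the "512 slots (511 usable)" figure: in a circular ring where an empty ring is encoded as cur == tail, at most num_slots - 1 slots can be available at once. The sketch below is a standalone mock of that index arithmetic. The function names are hypothetical; I believe netmap's user header offers helpers along these lines (nm_ring_space, nm_ring_next), but this is not the library code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of slots available between cur and tail on a circular ring.
 * cur == tail means "nothing available", so the result never exceeds
 * num_slots - 1 (511 for a 512-slot ring).
 */
static uint32_t
mock_ring_space(uint32_t cur, uint32_t tail, uint32_t num_slots)
{
    assert(num_slots > 0 && cur < num_slots && tail < num_slots);

    int64_t space = (int64_t)tail - (int64_t)cur;
    if (space < 0)
        space += num_slots;   /* tail has wrapped around past the end */
    return (uint32_t)space;
}

/* Index of the slot after i, wrapping back to 0 at the end of the ring. */
static uint32_t
mock_ring_next(uint32_t i, uint32_t num_slots)
{
    assert(num_slots > 0 && i < num_slots);
    return (i + 1 == num_slots) ? 0 : i + 1;
}
```

With a single ring, successive packets should therefore land in slots i, i+1, i+2, and so on, wrapping at 512, which is the property the slot column of the trace is checked against.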
>> > >> >> > =================== packet trace collected from receiver.c ===================
>> > >> >> > ===== together with the slot and buf_idx of the corresponding netmap ring slots ======
>> > >> >> > [seq]   [slot]   [buf_idx]
>> > >> >> > 8208   294    1833
>> > >> >> > 8209   295    1834
>> > >> >> > 8388   474    2013
>> > >> >> > ... (packet received in order)
>> > >> >> > 8398   484    2023
>> > >> >> > 8399   485    2024
>> > >> >> > 8210   296    1835
>> > >> >> > 8211   297    1836
>> > >> >> > ... (packet received in order)
>> > >> >> > ...
>> > >> >> > 8222   308    1847
>> > >> >> > 8400   486    2025
>> > >> >> > 8223   309    1848
>> > >> >> > 8401   487    2026
>> > >> >> > 8224   310    1849
>> > >> >> > 8402   488    2027
>> > >> >> > 8225   311    1850
>> > >> >> > 8403   489    2028
>> > >> >> > 8226   312    1851
>> > >> >> > 8404   450    2029
>> > >> >> > 8227   313    1852
>> > >> >> > 8228   314    1853
>> > >> >> > ===================================================================
>> > >> >> > As we can see, the UDP receiver got packet 8210 after it got 8399,
>> > >> >> > which is the first reorder. Then the receiver got 8211 to 8222
>> > >> >> > sequentially. Then it got packets 8223-8227 and 8400-8404
>> > >> >> > interleaved.
>> > >> >> >
>> > >> >> >
>> > >> >> > ==================== event order seen by netmap bridge ==================
>> > >> >> > get 8209
>> > >> >> > poll called
>> > >> >> > get 8210
>> > >> >> > ...
>> > >> >> > ...
>> > >> >> > get 8228
>> > >> >> > poll called
>> > >> >> > get 8229
>> > >> >> > ...
>> > >> >> > ...
>> > >> >> > get 8383
>> > >> >> > poll called
>> > >> >> > get 8384
>> > >> >> > ...
>> > >> >> > get 8387
>> > >> >> > poll called
>> > >> >> > get 8388
>> > >> >> > ...
>> > >> >> > get 8393
>> > >> >> > poll called
>> > >> >> > get 8394
>> > >> >> > ...
>> > >> >> > get 8399
>> > >> >> > poll called
>> > >> >> > get 8400
>> > >> >> > ...
>> > >> >> > get 8404
>> > >> >> > poll called
>> > >> >> > get 8405
>> > >> >> > ===================================================================
>> > >> >> > As we can see from the event ordering seen by bridge.c, all the
>> > >> >> > packets are received in order, which means that the reorder happens
>> > >> >> > when the bridge code swaps the buf_idx between the NIC ring (slot)
>> > >> >> > and the host ring (slot).
>> > >> >> > The reordered seqs usually appear right before or after the poll
>> > >> >> > function call.
>> > >> >> > Best,
>> > >> >> > Xiaoye
>> > >> >> >
>> > >> >> >
>> > >> >> >
>> > >> >> >
>> > >> >> >
>> > >> >> >
>> > >> >> >
>> > >> >> >
>> > >> >> > On Fri, Jan 29, 2016 at 4:27 PM, Luigi Rizzo <rizzo@iet.unipi.it> wrote:
>> > >> >> >>
>> > >> >> >> On Fri, Jan 29, 2016 at 2:12 PM, Xiaoye Sun <Xiaoye.Sun@rice.edu> wrote:
>> > >> >> >> > Hi Luigi,
>> > >> >> >> >
>> > >> >> >> > Thanks for your advice.
>> > >> >> >> > I forgot to mention that I use the command "ethtool -L eth1
>> > >> >> >> > combined 1" to set the number of rings of the NIC to 1. The host
>> > >> >> >> > also has only one ring.
>> > >> >> >> > I understand the situation where the first tx ring is full, so the
>> > >> >> >> > bridge will swap the packets to the second tx ring and then the
>> > >> >> >> > host/NIC might drain either ring. But this is not the case in the
>> > >> >> >> > experiment.
>> > >> >> >>
>> > >> >> >> ok good to know that.
>> > >> >> >>
>> > >> >> >> So if we have ruled out multiqueue and iommu, let's look at
>> > >> >> >> the internal allocator and at bridge.c
>> > >> >> >>
>> > >> >> >> 1. are you running the most recent version of netmap ?
>> > >> >> >>    Some older version (probably 1-2 years ago) had a bug
>> > >> >> >>    in the buffer allocator and some buffers were allocated
>> > >> >> >>    twice.
>> > >> >> >>
>> > >> >> >> 2. can you tweak your receiver.c to report some more info
>> > >> >> >>    on how often you get out of sequence packets, how much
>> > >> >> >>    out of sequence they are ?
>> > >> >> >>    Also it would be useful to report gaps on the increasing side
>> > >> >> >>    (i.e. new_seq != old_seq +1 )
>> > >> >> >>
>> > >> >> >> 3. can you tweak bridge.c so that it writes into the packet
>> > >> >> >>    the netmap buffer indexes and slots on the rx and tx side,
>> > >> >> >>    so when you detect a sequence error we can figure out
>> > >> >> >>    where it is happening.
>> > >> >> >>    Ideally you could also add the sequence number detection
>> > >> >> >>    code in bridge.c so we can check whether the errors appear
>> > >> >> >>    on the input or output sides.
>> > >> >> >>
>> > >> >> >> cheers
>> > >> >> >> luigi
>> > >> >> >>
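A minimal sketch of the check suggested in point 2 above, assuming the receiver keeps the sequence numbers in an array in arrival order. It classifies each transition as in order (new_seq == old_seq + 1), a gap on the increasing side, or a step backward. The struct and function names are hypothetical, and sequence-number wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counters for the transitions between consecutive packets. */
struct seq_stats {
    unsigned in_order;         /* new_seq == old_seq + 1 */
    unsigned forward;          /* new_seq >  old_seq + 1 (gap on the increasing side) */
    unsigned backward;         /* new_seq <= old_seq (late or repeated packet) */
    uint32_t max_forward_gap;  /* largest number of seqs skipped in one forward jump */
};

/* Classify every transition in seqs[0..n-1]; wraparound is not handled. */
static struct seq_stats
seq_classify(const uint32_t *seqs, size_t n)
{
    struct seq_stats st = {0, 0, 0, 0};

    assert(seqs != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        uint32_t prev = seqs[i - 1];
        uint32_t cur = seqs[i];

        if (cur == prev + 1) {
            st.in_order++;
        } else if (cur > prev) {
            uint32_t gap = cur - prev - 1;
            st.forward++;
            if (gap > st.max_forward_gap)
                st.max_forward_gap = gap;
        } else {
            st.backward++;
        }
    }
    return st;
}
```

Fed the twelve consecutive sequence numbers from the receiver trace quoted earlier in this thread (8222, 8400, 8223, ..., 8404, 8227, 8228), this counts 5 forward jumps of 177 skipped packets each, 5 steps backward, and 1 in-order transition, which matches the interleaving of two in-order runs described there.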
>> > >> >> >
>> > >> >>
>> > >> >>
>> > >> >>
>> > >> >> --
>> > >> >> -----------------------------------------+-------------------------------
>> > >> >>  Prof. Luigi RIZZO, rizzo@iet.unipi.it  . Dip. di Ing. dell'Informazione
>> > >> >>  http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
>> > >> >>  TEL      +39-050-2217533               . via Diotisalvi 2
>> > >> >>  Mobile   +39-338-6809875               . 56122 PISA (Italy)
>> > >> >> -----------------------------------------+-------------------------------
>> > >> >>
>> > >> >
>> > >>
>> > >>
>> > >>
>> > >
>> >
>> >
>> >
>> >
>> >
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>


