Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 26 Mar 2020 21:02:13 +0100
From:      Vincenzo Maffione <vmaffione@freebsd.org>
To:        Alexandre Snarskii <snar@snar.spb.ru>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: netmap/ixl and crc addition..
Message-ID:  <CA%2B_eA9he0epk6e86_MsXGrmSr2ZVPc0B__-koFyKGtoaG9eBPg@mail.gmail.com>
In-Reply-To: <20200326102008.GA54860@staff.retn.net>
References:  <20200324123721.GA26248@staff.retn.net> <20200324140839.GB26248@staff.retn.net> <CA%2B_eA9jTbqiso_UB0D2jmRVXJXmMMkxCjnH_92FL4GW7SLMOow@mail.gmail.com> <20200326102008.GA54860@staff.retn.net>

next in thread | previous in thread | raw e-mail | index | archive | help
Il giorno gio 26 mar 2020 alle ore 11:39 Alexandre Snarskii <
snar@snar.spb.ru> ha scritto:

> On Wed, Mar 25, 2020 at 11:31:30PM +0100, Vincenzo Maffione wrote:
> > Hi Alexandre,
> >   Thanks. Your patch looks good to me. I assume you have tested it?
>
> Sure, this patch is already handles production traffic.
>
> > I will commit that to stable/11.
>
> Thanks.
>

> >
> > The issue you report on stable/12 is more worrisome. The 'no space in TX
> ring'
> > condition (head==cur==tail) is ok per-se: on a subsequent poll() wakeup
> (e.g.
> > TX interrupt) or explicit ioctl(NIOCTXSYNC) you should see tail moving
> forward,
> > therefore freeing some space to be used in the ring.
>
> Unfortunately, further NIOCTXSYNC did not freed queue.
>
> Let me explain my observations in details: this issue was reproduceable
> even under really light load (say, hundreds pps per queue). As the load
> was ligth, I was able to insert debugging output before and after any
> calls
> to ioctl and poll, code is like:
>
> while(1) {
>         if (nm_tx_pending(txring)) {
>                 printf("%s: pre-txsync cur: %d, head: %d, tail: %d\n",
>                         thread_name, txring->cur, txring->head,
> txring->tail);
>                 if (ioctl(pa->fd, NIOCTXSYNC) == -1) {
>                         log event
>                 };
>                 printf("%s: post-txsync cur: %d, haad: %d, tail: %d\n",
>                         thread_name, txring->cur, txring->head,
> txring->tail);
>         };
>         // setup poll descriptors with POLLIN only
>         printf("%s: pre-poll cur: %d, head: %d, tail: %d\n",
>                 thread_name, txring->cur, txring->head, txring->tail);
>     ret = poll(fds, 2, 1000);
>         printf("%s: post-poll cur: %d, head: %d, tail: %d\n",
>                 thread_name, txring->cur, txring->head, txring->tail);
>
>
Ok, but I have some preliminary questions and notes:
 - You show the code that syncs the tx ring to collect completed
transmissions (and thus advance tail). However, I do not see the code that
submits new packets to be transmitted (and thus advances head/tail). Where
is it? If this is happening in a separate thread, that leads to undefined
behaviour, e.g. ring resets (although you can't crash the kernel).


> Logs in normal situation (only single thread shown):
>
> Notice:nz(ixl0:2): pre-txsync cur: 649, head: 649, tail: 647
> Notice:nz(ixl0:2): post-txsync cur: 649, head: 649, tail: 648
> Notice:nz(ixl0:2): pre-poll cur: 649, head: 649, tail: 648
> Notice:nz(ixl0:2): post-poll cur: 649, head: 649, tail: 648
>
> in pre-txsync: cur = head = tail + 2 (sometimes tail + 3), queue
> is nearly empty.
> in post-txsync: cur = head = tail + 1, queue is flushed.
>
> however, after some time, txcsync call advanced tail not to head - 1,
> but to head:
>
> Notice:nz(ixl0:2): pre-txsync cur: 654, head: 654, tail: 652
> Notice:nz(ixl0:2): post-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): pre-poll cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-poll cur: 654, head: 654, tail: 654
>

I see. This is wrong and must not happen. Once you get here, the tx ring
becomes unusable, so we may as well ignore what happens afterwards.

>
> and following txcsyncs was not able to free (already empty) queue:
>
> Notice:nz(ixl0:2): pre-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): pre-poll cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-poll cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): pre-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): pre-poll cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-poll cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): pre-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-txsync cur: 654, head: 654, tail: 654
>
> application entered busy wait.. (in my case most packets just zero-copied
> from rxring to txring, so if there are no space in txring, packets are not
> consumed from rx).
>
> > However, the ring_reinit means that something is going wrong: either your
> > application is using the TX ring incorrectly, or there is a bug in the
> netmap
>
> ring_reinit was indeed caused by my incorrect use of txring: I
> tried to ignore situation head == tail and inject packet anyway..
>

Yes, ring reinit actually resets the ring variables to a sane value.

In case your issue is not caused by unsafe access to the tx ring from
multiple threads (see above), it would help to
know what value do you have in sysctl hw.ixl.enable_head_writeback, and
also try if the problem is still
there if you enable or disable that parameter (which is relevant for how
tail is advanced).


>
> > iflib code. Since FreeBSD 12, netmap support is provided by iflib, while
> before
> > netmap support was provided directly by the ixl driver.
> > In any case, it would probably help if you could provide some more
> detailed
> > info (how to reproduce the problem).
> >
> > Cheers,
> >   Vincenzo
> >
>
>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CA%2B_eA9he0epk6e86_MsXGrmSr2ZVPc0B__-koFyKGtoaG9eBPg>