Date:      Fri, 30 Mar 2018 04:21:56 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-bugs@FreeBSD.org
Subject:   [Bug 227100] [epair] epair interface stops working when it reaches the hardware queue limit
Message-ID:  <bug-227100-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227100

            Bug ID: 227100
           Summary: [epair] epair interface stops working when it reaches
                    the hardware queue limit
           Product: Base System
           Version: CURRENT
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: reshadpatuck1@gmail.com

Created attachment 191964
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=191964&action=edit
the output of netstat and dtrace

When the epair interface reaches the hardware queue limit, epairs stop
transferring data.

This bug refers to the following mailing list conversation:
https://lists.freebsd.org/pipermail/freebsd-net/2018-March/050077.html

So far, using the patch and the if_epair.c source file attached to this bug,
we can tell that the error occurs in this block of code:

```
        if ((epair_dpcpu->epair_drv_flags & IFF_DRV_OACTIVE) != 0) {
                /*
                 * Our hardware queue is full, try to fall back
                 * queuing to the ifq but do not call ifp->if_start.
                 * Either we are lucky or the packet is gone.
                 */
                IFQ_ENQUEUE(&ifp->if_snd, m, error);
                if (!error)
                        (void)epair_add_ifp_for_draining(ifp);

                SDT_PROBE3(if_epair, transmit, epair_transmit_locked, enqueued,
                                ifp, m, error);
                return (error);
        }
```

At this point the value of 'error' is 55 (ENOBUFS on FreeBSD).
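
To illustrate why the fallback path can start failing on every packet, here is
a minimal userland sketch of the full-queue behaviour of IFQ_ENQUEUE(). It is
not the kernel code: 'struct toy_ifq', 'toy_ifq_enqueue' and
'toy_transmit_burst' are hypothetical names made up for this example, and the
assumption that nothing drains the queue while IFF_DRV_OACTIVE stays set is
only a guess at what is happening here.

```
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a struct ifqueue: it only counts packets. */
struct toy_ifq {
        size_t len;     /* packets currently queued */
        size_t maxlen;  /* queue limit */
        size_t drops;   /* packets dropped because the queue was full */
};

/*
 * Model of IFQ_ENQUEUE(): if the queue is at its limit the packet is
 * dropped and ENOBUFS is returned, otherwise the packet is queued.
 */
static int
toy_ifq_enqueue(struct toy_ifq *q)
{
        assert(q != NULL);
        if (q->len >= q->maxlen) {
                q->drops++;
                return (ENOBUFS);
        }
        q->len++;
        return (0);
}

/*
 * Model of repeated transmits on the fallback path. Assumption: nothing
 * dequeues in between, so once the queue is full every attempt fails.
 * Returns the number of failed attempts.
 */
static size_t
toy_transmit_burst(struct toy_ifq *q, size_t npackets)
{
        size_t failed = 0;

        for (size_t i = 0; i < npackets; i++) {
                if (toy_ifq_enqueue(q) != 0)
                        failed++;
        }
        return (failed);
}
```

With a limit of 2, the first two enqueues succeed and every later one returns
ENOBUFS until something removes packets from the queue. This would also be
consistent with a smaller queue limit making the failure show up sooner.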

Setting 'net.link.epair.netisr_maxqlen' to a very small value makes this occur
faster.

This issue seems to be happening in the wild only on one of my servers.
Other servers under more load in different environments do not seem to exhibit
this behaviour.

@Kristof please chime in if I have missed something out.

Attached:
- commands.txt
- epair-sdt-diff.patch
- epair_transmit_locked:enqueued-error-code.d
- if_epair.c

-- 
You are receiving this mail because:
You are the assignee for the bug.


