Date:      Tue, 17 May 2016 16:36:18 -0700
From:      hiren panchasara <hiren@strugglingcoder.info>
To:        transport@FreeBSD.org
Cc:        glebius@FreeBSD.org, pkelsey@FreeBSD.org, lstewart@FreeBSD.org, killing@multiplay.co.uk
Subject:   Re: Abrupt reset sent instead of retransmitting a lost packet
Message-ID:  <20160517233618.GS44085@strugglingcoder.info>
In-Reply-To: <20160513173633.GG44085@strugglingcoder.info>

On 05/13/16 at 10:36P, hiren panchasara wrote:
> https://people.freebsd.org/~hiren/pcaps/tcp_weird_reset.txt
> Something we saw in the wild on 10.2ish systems (server and client both).
>
> The most interesting thing can be seen at the end of the file.
>
> 3298737767:3298739215 gets lost, client tells us about it via a bunch of
> dupacks with SACK info. It SACKs all the outstanding data but this one
> missing packet. We (server) never retransmits that missing
> packet but rather decide to send a Reset after 0.312582ms. Which somehow
> causes client to pause for 75secs. (which might be another issue and not
> particularly important for this discussion.)
>
> What could cause this behavior of sending a reset instead of
> retransmitting a lost packet?

Turns out I am seeing a lot of "discarded due to memory problems" in
'netstat -sp tcp', and net.inet.tcp.reass.overflows is also increasing
rapidly.

This is happening in a very low RTT env (in the range of 0.20ms) and
about 1G of b/w.
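
For a sense of scale, the bandwidth-delay product of such a path can be
worked out as below. This is only a rough sketch: bdp_bytes() is a
hypothetical helper (not a kernel function), and it assumes "1G" means
1 Gbit/s and takes the RTT as 0.20ms (200 microseconds).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the stack): bandwidth-delay product
 * in bytes, given a link rate in bits per second and an RTT in
 * microseconds.
 */
static uint64_t
bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_usec)
{
	/* Bits in flight during one RTT, converted to bytes. */
	return (bits_per_sec * rtt_usec / 1000000 / 8);
}
```

Under those assumptions, bdp_bytes(1000000000ULL, 200) comes to 25000
bytes, i.e. roughly 17 segments of 1448 bytes (the size of the lost
segment in the trace quoted above).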

So it seems the following code is where the reassembly queue is overflowing
(I've also confirmed with tcp debug that I am seeing this message):

In tcp_reass()

        if ((th->th_seq != tp->rcv_nxt || !TCPS_HAVEESTABLISHED(tp->t_state)) &&
            tp->t_segqlen >= (so->so_rcv.sb_hiwat / tp->t_maxseg) + 1) {
                V_tcp_reass_overflows++;
                TCPSTAT_INC(tcps_rcvmemdrop);
                m_freem(m);
                *tlenp = 0;
                if ((s = tcp_log_addrs(&tp->t_inpcb->inp_inc, th, NULL, NULL))) {
                        log(LOG_DEBUG, "%s; %s: queue limit reached, "
                            "segment dropped\n", s, __func__);
                        free(s, M_TCPLOG);
                }
                return (0);
        }

I know this is somewhat older (stable/10) code, but I think the problem
still remains.

This is the gist of the issue:
tp->t_segqlen >= (so->so_rcv.sb_hiwat / tp->t_maxseg) + 1 evaluates to
true, which makes us drop packets on the floor.
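
To make the arithmetic concrete, here is a minimal stand-alone sketch of
that limit. The function names and the plain integer types are
hypothetical (this is not the kernel code), and the 1448-byte segment
size is taken from the lost segment in the trace, so treat the numbers
as illustrative.

```c
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the limit in the check above:
 * the per-connection reassembly queue limit, in segments, given the
 * receive buffer high-water mark and the maximum segment size.
 */
static uint32_t
reass_queue_limit(uint32_t sb_hiwat, uint32_t t_maxseg)
{
	/* Integer division, then one extra segment, as in the check. */
	return (sb_hiwat / t_maxseg + 1);
}

/* Mirrors only the "t_segqlen >= limit" half of the condition. */
static int
reass_queue_full(uint32_t t_segqlen, uint32_t sb_hiwat, uint32_t t_maxseg)
{
	return (t_segqlen >= reass_queue_limit(sb_hiwat, t_maxseg));
}
```

With a 65536-byte receive buffer and 1448-byte segments the limit works
out to 46 segments; with 131072 bytes it is 91. Any out-of-order segment
arriving once the queue holds that many is dropped.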

I've tried to restore default behavior with:
net.inet.tcp.recvbuf_max: 131072
net.inet.tcp.recvbuf_inc: 16384
net.inet.tcp.sendbuf_max: 131072
net.inet.tcp.sendbuf_inc: 16384
net.inet.tcp.sendbuf_auto: 1
net.inet.tcp.sendspace: 65536
net.inet.tcp.recvspace: 65536

net.inet.tcp.reass.overflows: 156440623
net.inet.tcp.reass.cursegments: 91
net.inet.tcp.reass.maxsegments: 557900

And the app is *not* setting SO_SNDBUF or SO_RCVBUF, so SB_AUTOSIZE
stays in effect.
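
For anyone wanting to check this from the application side, a small
sketch along these lines reads the current receive buffer size without
setting it. current_rcvbuf() is a hypothetical helper name; the only
call it relies on is the standard getsockopt() with SO_RCVBUF, and it
never calls setsockopt().

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: return the socket's current receive buffer
 * size in bytes, or -1 on error. Read-only: it queries SO_RCVBUF and
 * does not change any socket option.
 */
static int
current_rcvbuf(int fd)
{
	int size = 0;
	socklen_t len = sizeof(size);

	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) == -1)
		return (-1);
	return (size);
}
```

Calling it periodically on a connected socket should show whether the
buffer is actually growing over the life of the connection.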

I was assuming the usual auto-sizing would kick in and do the right
thing so that we don't run into this issue, but something is amiss.

I am seeing a bunch of connections to inter-colo hosts with high Recv-Q
(close to recvbuf_max) from 'netstat -an'.

I found an old issue which seems similar:
https://lists.freebsd.org/pipermail/freebsd-net/2011-August/029491.html

I am cc'ing a few folks who've touched this code or may have some idea.
Any help is appreciated.

Cheers,
Hiren



