From owner-svn-src-all@freebsd.org Mon May 30 03:31:39 2016 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1A704B53AC9; Mon, 30 May 2016 03:31:39 +0000 (UTC) (envelope-from sephe@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B501218B0; Mon, 30 May 2016 03:31:38 +0000 (UTC) (envelope-from sephe@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id u4U3VbWg075257; Mon, 30 May 2016 03:31:37 GMT (envelope-from sephe@FreeBSD.org) Received: (from sephe@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id u4U3Vbqh075256; Mon, 30 May 2016 03:31:37 GMT (envelope-from sephe@FreeBSD.org) Message-Id: <201605300331.u4U3Vbqh075256@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: sephe set sender to sephe@FreeBSD.org using -f From: Sepherosa Ziehau Date: Mon, 30 May 2016 03:31:37 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r300981 - head/sys/netinet X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 May 2016 03:31:39 -0000 Author: sephe Date: Mon May 30 03:31:37 2016 New Revision: 300981 URL: https://svnweb.freebsd.org/changeset/base/300981 Log: tcp: Don't prematurely drop receiving-only connections If the connection was persistent and receiving-only, several (12) sporadic device insufficient buffers would cause the connection be dropped prematurely: Upon ENOBUFS in tcp_output() for an ACK, retransmission timer is started. No one will stop this retransmission timer for receiving- only connection, so the retransmission timer promises to expire and t_rxtshift is promised to be increased. And t_rxtshift will not be reset to 0, since no RTT measurement will be done for receiving-only connection. If this receiving-only connection lived long enough (e.g. >350sec, given the RTO starts from 200ms), and it suffered 12 sporadic device insufficient buffers, i.e. t_rxtshift >= 12, this receiving-only connection would be dropped prematurely by the retransmission timer. We now assert that for data segments, SYNs or FINs either rexmit or persist timer was wired upon ENOBUFS. And don't set rexmit timer for other cases, i.e. ENOBUFS upon ACKs. Discussed with: lstewart, hiren, jtl, Mike Karels MFC after: 3 weeks Sponsored by: Microsoft OSTC Differential Revision: https://reviews.freebsd.org/D5872 Modified: head/sys/netinet/tcp_output.c Modified: head/sys/netinet/tcp_output.c ============================================================================== --- head/sys/netinet/tcp_output.c Mon May 30 02:09:19 2016 (r300980) +++ head/sys/netinet/tcp_output.c Mon May 30 03:31:37 2016 (r300981) @@ -130,6 +130,16 @@ SYSCTL_INT(_net_inet_tcp, OID_AUTO, send &VNET_NAME(tcp_autosndbuf_max), 0, "Max size of automatic send buffer"); +/* + * Make sure that either retransmit or persist timer is set for SYN, FIN and + * non-ACK. + */ +#define TCP_XMIT_TIMER_ASSERT(tp, len, th_flags) \ + KASSERT(((len) == 0 && ((th_flags) & (TH_SYN | TH_FIN)) == 0) ||\ + tcp_timer_active((tp), TT_REXMT) || \ + tcp_timer_active((tp), TT_PERSIST), \ + ("neither rexmt nor persist timer is set")) + static void inline hhook_run_tcp_est_out(struct tcpcb *tp, struct tcphdr *th, struct tcpopt *to, long len, int tso); @@ -1545,9 +1555,7 @@ timer: tp->t_softerror = error; return (error); case ENOBUFS: - if (!tcp_timer_active(tp, TT_REXMT) && - !tcp_timer_active(tp, TT_PERSIST)) - tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur); + TCP_XMIT_TIMER_ASSERT(tp, len, flags); tp->snd_cwnd = tp->t_maxseg; return (0); case EMSGSIZE: