From: Andre Oppermann
Date: Tue, 22 Oct 2013 18:24:35 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r256920 - head/sys/netinet
Message-Id: <201310221824.r9MIOZJt072252@svn.freebsd.org>

Author: andre
Date: Tue Oct 22 18:24:34 2013
New Revision: 256920
URL: http://svnweb.freebsd.org/changeset/base/256920

Log:
  The TCP delayed ACK logic isn't aware of LRO passing up large aggregated
  segments and thinks it received only one segment.
  This causes it to delay the ACK for 100ms to wait for another segment,
  which may never come because all the data was received already.

  Doing delayed ACK for LRO segments is bogus for two reasons:
   a) it pushes us further away from acking every other packet;
   b) it introduces additional delay in responding to the sender.
  The latter is especially bad because it is in the nature of LRO to
  aggregate all segments of a burst, with no more coming until an ACK is
  sent back.

  Change the delayed ACK logic to detect LRO segments by their being larger
  than the MSS for this connection, and issue an immediate ACK for them to
  keep the ACK clock ticking without interruption.

  Reported by:	julian, cperciva
  Tested by:	cperciva
  Reviewed by:	lstewart
  MFC after:	3 days

Modified:
  head/sys/netinet/tcp_input.c

Modified: head/sys/netinet/tcp_input.c
==============================================================================
--- head/sys/netinet/tcp_input.c	Tue Oct 22 18:14:06 2013	(r256919)
+++ head/sys/netinet/tcp_input.c	Tue Oct 22 18:24:34 2013	(r256920)
@@ -508,10 +508,13 @@ do {								\
  *	  the ack that opens up a 0-sized window and
  *		- delayed acks are enabled or
  *		- this is a half-synchronized T/TCP connection.
+ *	- the segment size is not larger than the MSS and LRO wasn't used
+ *	  for this segment.
  */
-#define DELAY_ACK(tp)							\
+#define DELAY_ACK(tp, tlen)						\
 	((!tcp_timer_active(tp, TT_DELACK) &&				\
 	    (tp->t_flags & TF_RXWIN0SENT) == 0) &&			\
+	    (tlen <= tp->t_maxopd) &&					\
 	    (V_tcp_delack_enabled || (tp->t_flags & TF_NEEDSYN)))
 
 /*
@@ -1863,7 +1866,7 @@ tcp_do_segment(struct mbuf *m, struct tc
 			}
 			/* NB: sorwakeup_locked() does an implicit unlock. */
 			sorwakeup_locked(so);
-			if (DELAY_ACK(tp)) {
+			if (DELAY_ACK(tp, tlen)) {
 				tp->t_flags |= TF_DELACK;
 			} else {
 				tp->t_flags |= TF_ACKNOW;
@@ -1954,7 +1957,7 @@ tcp_do_segment(struct mbuf *m, struct tc
 			 * If there's data, delay ACK; if there's also a FIN
 			 * ACKNOW will be turned on later.
 			 */
-			if (DELAY_ACK(tp) && tlen != 0)
+			if (DELAY_ACK(tp, tlen) && tlen != 0)
 				tcp_timer_activate(tp, TT_DELACK,
 				    tcp_delacktime);
 			else
@@ -2926,7 +2929,7 @@ dodata:							/* XXX */
 		if (th->th_seq == tp->rcv_nxt &&
 		    LIST_EMPTY(&tp->t_segq) &&
 		    TCPS_HAVEESTABLISHED(tp->t_state)) {
-			if (DELAY_ACK(tp))
+			if (DELAY_ACK(tp, tlen))
 				tp->t_flags |= TF_DELACK;
 			else
 				tp->t_flags |= TF_ACKNOW;
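
As a rough illustration of the decision after this change, the following
user-space C sketch mirrors the conditions of the patched DELAY_ACK macro.
It is not kernel code: struct conn_sketch, its fields, the delack_enabled
variable, and delay_ack_sketch() are hypothetical stand-ins for the tcpcb
state and the sysctl used in tcp_input.c. Only the shape of the boolean
expression is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct tcpcb. */
struct conn_sketch {
	bool delack_timer_active; /* models tcp_timer_active(tp, TT_DELACK) */
	bool rxwin0_sent;         /* models (t_flags & TF_RXWIN0SENT) != 0 */
	bool need_syn;            /* models (t_flags & TF_NEEDSYN) != 0 */
	int  maxopd;              /* models t_maxopd */
};

/* Hypothetical stand-in for V_tcp_delack_enabled. */
static bool delack_enabled = true;

/*
 * Return true if the ACK for a segment of tlen bytes may be delayed.
 * A segment larger than maxopd is assumed to be an LRO aggregate and
 * is never delayed, matching the condition added by this commit.
 */
static bool
delay_ack_sketch(const struct conn_sketch *tp, int tlen)
{
	assert(tp != NULL);
	return (!tp->delack_timer_active && !tp->rxwin0_sent &&
	    tlen <= tp->maxopd &&
	    (delack_enabled || tp->need_syn));
}
```

Under these assumptions, with maxopd set to 1460 a 1460-byte segment is
still eligible for a delayed ACK, while a 2920-byte aggregated segment is
not and would take the TF_ACKNOW path.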