From owner-svn-src-all@FreeBSD.ORG Fri Aug 27 10:15:46 2010 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3636E10656A4; Fri, 27 Aug 2010 10:15:46 +0000 (UTC) (envelope-from andre@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c]) by mx1.freebsd.org (Postfix) with ESMTP id 246678FC12; Fri, 27 Aug 2010 10:15:46 +0000 (UTC) Received: from svn.freebsd.org (localhost [127.0.0.1]) by svn.freebsd.org (8.14.3/8.14.3) with ESMTP id o7RAFkAm072133; Fri, 27 Aug 2010 10:15:46 GMT (envelope-from andre@svn.freebsd.org) Received: (from andre@localhost) by svn.freebsd.org (8.14.3/8.14.3/Submit) id o7RAFknT072130; Fri, 27 Aug 2010 10:15:46 GMT (envelope-from andre@svn.freebsd.org) Message-Id: <201008271015.o7RAFknT072130@svn.freebsd.org> From: Andre Oppermann Date: Fri, 27 Aug 2010 10:15:46 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: Subject: svn commit: r211870 - stable/8/sys/netinet X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Aug 2010 10:15:46 -0000 Author: andre Date: Fri Aug 27 10:15:45 2010 New Revision: 211870 URL: http://svn.freebsd.org/changeset/base/211870 Log: MFC r211464: If a TCP connection has been idle for one retransmit timeout or more it must reset its congestion window back to the initial window. RFC3390 has increased the initial window from 1 segment to up to 4 segments. The initial window increase of RFC3390 wasn't reflected into the restart window which remained at its original defaults of 4 segments for local and 1 segment for all other connections. Both values are controllable through sysctl net.inet.tcp.local_slowstart_flightsize and net.inet.tcp.slowstart_flightsize. The increase helps TCP's slow start algorithm to open up the congestion window much faster. Reviewed by: lstewart MFC after: 1 week Modified: stable/8/sys/netinet/tcp_output.c stable/8/sys/netinet/tcp_var.h Directory Properties: stable/8/sys/ (props changed) stable/8/sys/amd64/include/xen/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) stable/8/sys/contrib/dev/acpica/ (props changed) stable/8/sys/contrib/pf/ (props changed) stable/8/sys/dev/xen/xenpci/ (props changed) Modified: stable/8/sys/netinet/tcp_output.c ============================================================================== --- stable/8/sys/netinet/tcp_output.c Fri Aug 27 09:59:51 2010 (r211869) +++ stable/8/sys/netinet/tcp_output.c Fri Aug 27 10:15:45 2010 (r211870) @@ -140,7 +140,7 @@ tcp_output(struct tcpcb *tp) { struct socket *so = tp->t_inpcb->inp_socket; long len, recwin, sendwin; - int off, flags, error; + int off, flags, error, rw; struct mbuf *m; struct ip *ip = NULL; struct ipovly *ipov = NULL; @@ -176,23 +176,34 @@ tcp_output(struct tcpcb *tp) idle = (tp->t_flags & TF_LASTIDLE) || (tp->snd_max == tp->snd_una); if (idle && ticks - tp->t_rcvtime >= tp->t_rxtcur) { /* - * We have been idle for "a while" and no acks are - * expected to clock out any data we send -- - * slow start to get ack "clock" running again. + * If we've been idle for more than one retransmit + * timeout the old congestion window is no longer + * current and we have to reduce it to the restart + * window before we can transmit again. * - * Set the slow-start flight size depending on whether - * this is a local network or not. + * The restart window is the initial window or the last + * CWND, whichever is smaller. + * + * This is done to prevent us from flooding the path with + * a full CWND at wirespeed, overloading router and switch + * buffers along the way. + * + * See RFC5681 Section 4.1. "Restarting Idle Connections". */ - int ss = V_ss_fltsz; + if (V_tcp_do_rfc3390) + rw = min(4 * tp->t_maxseg, + max(2 * tp->t_maxseg, 4380)); #ifdef INET6 - if (isipv6) { - if (in6_localaddr(&tp->t_inpcb->in6p_faddr)) - ss = V_ss_fltsz_local; - } else -#endif /* INET6 */ - if (in_localaddr(tp->t_inpcb->inp_faddr)) - ss = V_ss_fltsz_local; - tp->snd_cwnd = tp->t_maxseg * ss; + else if ((isipv6 ? in6_localaddr(&tp->t_inpcb->in6p_faddr) : + in_localaddr(tp->t_inpcb->inp_faddr))) +#else + else if (in_localaddr(tp->t_inpcb->inp_faddr)) +#endif + rw = V_ss_fltsz_local * tp->t_maxseg; + else + rw = V_ss_fltsz * tp->t_maxseg; + + tp->snd_cwnd = min(rw, tp->snd_cwnd); } tp->t_flags &= ~TF_LASTIDLE; if (idle) { Modified: stable/8/sys/netinet/tcp_var.h ============================================================================== --- stable/8/sys/netinet/tcp_var.h Fri Aug 27 09:59:51 2010 (r211869) +++ stable/8/sys/netinet/tcp_var.h Fri Aug 27 10:15:45 2010 (r211870) @@ -556,6 +556,7 @@ extern int tcp_log_in_vain; VNET_DECLARE(int, tcp_mssdflt); /* XXX */ VNET_DECLARE(int, tcp_minmss); VNET_DECLARE(int, tcp_delack_enabled); +VNET_DECLARE(int, tcp_do_rfc3390); VNET_DECLARE(int, tcp_do_newreno); VNET_DECLARE(int, path_mtu_discovery); VNET_DECLARE(int, ss_fltsz); @@ -566,6 +567,7 @@ VNET_DECLARE(int, ss_fltsz_local); #define V_tcp_mssdflt VNET(tcp_mssdflt) #define V_tcp_minmss VNET(tcp_minmss) #define V_tcp_delack_enabled VNET(tcp_delack_enabled) +#define V_tcp_do_rfc3390 VNET(tcp_do_rfc3390) #define V_tcp_do_newreno VNET(tcp_do_newreno) #define V_path_mtu_discovery VNET(path_mtu_discovery) #define V_ss_fltsz VNET(ss_fltsz)