From owner-svn-src-stable@freebsd.org Mon Aug 6 17:41:54 2018 Return-Path: Delivered-To: svn-src-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C24A1105F948; Mon, 6 Aug 2018 17:41:54 +0000 (UTC) (envelope-from jtl@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 775238116A; Mon, 6 Aug 2018 17:41:54 +0000 (UTC) (envelope-from jtl@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 5975C10414; Mon, 6 Aug 2018 17:41:54 +0000 (UTC) (envelope-from jtl@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w76Hfs6F013564; Mon, 6 Aug 2018 17:41:54 GMT (envelope-from jtl@FreeBSD.org) Received: (from jtl@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w76Hfs19013563; Mon, 6 Aug 2018 17:41:54 GMT (envelope-from jtl@FreeBSD.org) Message-Id: <201808061741.w76Hfs19013563@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: jtl set sender to jtl@FreeBSD.org using -f From: "Jonathan T. Looney" Date: Mon, 6 Aug 2018 17:41:54 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r337385 - in stable/11: share/man/man4 sys/netinet X-SVN-Group: stable-11 X-SVN-Commit-Author: jtl X-SVN-Commit-Paths: in stable/11: share/man/man4 sys/netinet X-SVN-Commit-Revision: 337385 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 06 Aug 2018 17:41:55 -0000 Author: jtl Date: Mon Aug 6 17:41:53 2018 New Revision: 337385 URL: https://svnweb.freebsd.org/changeset/base/337385 Log: MFC r337384: Address concerns about CPU usage while doing TCP reassembly. Currently, the per-queue limit is a function of the receive buffer size and the MSS. In certain cases (such as connections with large receive buffers), the per-queue segment limit can be quite large. Because we process segments as a linked list, large queues may not perform acceptably. The better long-term solution is to make the queue more efficient. But, in the short-term, we can provide a way for a system administrator to set the maximum queue size. We set the default queue limit to 100. This is an effort to balance performance with a sane resource limit. Depending on their environment, goals, etc., an administrator may choose to modify this limit in either direction. Reviewed by: jhb Approved by: so Security: FreeBSD-SA-18:08.tcp Security: CVE-2018-6922 Modified: stable/11/share/man/man4/tcp.4 stable/11/sys/netinet/tcp_reass.c Directory Properties: stable/11/ (props changed) Modified: stable/11/share/man/man4/tcp.4 ============================================================================== --- stable/11/share/man/man4/tcp.4 Mon Aug 6 17:36:57 2018 (r337384) +++ stable/11/share/man/man4/tcp.4 Mon Aug 6 17:41:53 2018 (r337385) @@ -445,6 +445,20 @@ no reseeding will occur. Reseeding should not be necessary, and will break .Dv TIME_WAIT recycling for a few minutes. +.It Va reass.cursegments +The current total number of segments present in all reassembly queues. +.It Va reass.maxsegments +The maximum limit on the total number of segments across all reassembly +queues. +The limit can be adjusted as a tunable. +.It Va reass.maxqueuelen +The maximum number of segments allowed in each reassembly queue. +By default, the system chooses a limit based on each TCP connection's +receive buffer size and maximum segment size (MSS). +The actual limit applied to a session's reassembly queue will be the lower of +the system-calculated automatic limit and the user-specified +.Va reass.maxqueuelen +limit. .It Va rexmit_min , rexmit_slop Adjust the retransmit timer calculation for .Tn TCP . Modified: stable/11/sys/netinet/tcp_reass.c ============================================================================== --- stable/11/sys/netinet/tcp_reass.c Mon Aug 6 17:36:57 2018 (r337384) +++ stable/11/sys/netinet/tcp_reass.c Mon Aug 6 17:41:53 2018 (r337385) @@ -89,6 +89,11 @@ SYSCTL_UMA_CUR(_net_inet_tcp_reass, OID_AUTO, cursegme &tcp_reass_zone, "Global number of TCP Segments currently in Reassembly Queue"); +static u_int tcp_reass_maxqueuelen = 100; +SYSCTL_UINT(_net_inet_tcp_reass, OID_AUTO, maxqueuelen, CTLFLAG_RWTUN, + &tcp_reass_maxqueuelen, 0, + "Maximum number of TCP Segments per Reassembly Queue"); + /* Initialize TCP reassembly queue */ static void tcp_reass_zone_change(void *tag) @@ -168,6 +173,10 @@ tcp_reass(struct tcpcb *tp, struct tcphdr *th, int *tl * socket receive buffer determines our advertised window and grows * automatically when socket buffer autotuning is enabled. Use it as the * basis for our queue limit. + * + * However, allow the user to specify a ceiling for the number of + * segments in each queue. + * * Always let the missing segment through which caused this queue. * NB: Access to the socket buffer is left intentionally unlocked as we * can tolerate stale information here. @@ -178,7 +187,8 @@ tcp_reass(struct tcpcb *tp, struct tcphdr *th, int *tl * is understood. */ if ((th->th_seq != tp->rcv_nxt || !TCPS_HAVEESTABLISHED(tp->t_state)) && - tp->t_segqlen >= (so->so_rcv.sb_hiwat / tp->t_maxseg) + 1) { + tp->t_segqlen >= min((so->so_rcv.sb_hiwat / tp->t_maxseg) + 1, + tcp_reass_maxqueuelen)) { TCPSTAT_INC(tcps_rcvreassfull); *tlenp = 0; if ((s = tcp_log_addrs(&tp->t_inpcb->inp_inc, th, NULL, NULL))) {