From owner-svn-src-all@FreeBSD.ORG Fri Oct 7 16:39:03 2011
Return-Path:
Delivered-To: svn-src-all@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9E084106566C;
	Fri, 7 Oct 2011 16:39:03 +0000 (UTC)
	(envelope-from andre@FreeBSD.org)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c])
	by mx1.freebsd.org (Postfix) with ESMTP id 83D898FC18;
	Fri, 7 Oct 2011 16:39:03 +0000 (UTC)
Received: from svn.freebsd.org (localhost [127.0.0.1])
	by svn.freebsd.org (8.14.4/8.14.4) with ESMTP id p97Gd3VP019130;
	Fri, 7 Oct 2011 16:39:03 GMT (envelope-from andre@svn.freebsd.org)
Received: (from andre@localhost)
	by svn.freebsd.org (8.14.4/8.14.4/Submit) id p97Gd3t4019128;
	Fri, 7 Oct 2011 16:39:03 GMT (envelope-from andre@svn.freebsd.org)
Message-Id: <201110071639.p97Gd3t4019128@svn.freebsd.org>
From: Andre Oppermann
Date: Fri, 7 Oct 2011 16:39:03 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
	svn-src-head@freebsd.org
X-SVN-Group: head
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Cc:
Subject: svn commit: r226113 - head/sys/netinet
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)"
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 07 Oct 2011 16:39:03 -0000

Author: andre
Date: Fri Oct 7 16:39:03 2011
New Revision: 226113
URL: http://svn.freebsd.org/changeset/base/226113

Log:
  Prevent TCP sessions from stalling indefinitely in reassembly
  when reaching the zone limit of reassembly queue entries.

  When the zone limit was reached, not even the missing segment
  that would complete the sequence space could be processed,
  preventing the TCP session from ever making any further
  progress.

  Solve this deadlock by using a temporary on-stack queue entry
  for the missing segment, which is dequeued again immediately
  by delivering the contiguous sequence space to the socket.

  Add logging under net.inet.tcp.log_debug for reassembly queue
  issues.

  Reviewed by:	lsteward (previous version)
  Tested by:	Steven Hartland
  MFC after:	3 days

Modified:
  head/sys/netinet/tcp_reass.c

Modified: head/sys/netinet/tcp_reass.c
==============================================================================
--- head/sys/netinet/tcp_reass.c	Fri Oct 7 16:09:44 2011	(r226112)
+++ head/sys/netinet/tcp_reass.c	Fri Oct 7 16:39:03 2011	(r226113)
@@ -177,7 +177,9 @@ tcp_reass(struct tcpcb *tp, struct tcphd
 	struct tseg_qent *nq;
 	struct tseg_qent *te = NULL;
 	struct socket *so = tp->t_inpcb->inp_socket;
+	char *s = NULL;
 	int flags;
+	struct tseg_qent tqs;
 
 	INP_WLOCK_ASSERT(tp->t_inpcb);
 
@@ -215,19 +217,40 @@ tcp_reass(struct tcpcb *tp, struct tcphd
 		TCPSTAT_INC(tcps_rcvmemdrop);
 		m_freem(m);
 		*tlenp = 0;
+		if ((s = tcp_log_addrs(&tp->t_inpcb->inp_inc, th, NULL, NULL))) {
+			log(LOG_DEBUG, "%s; %s: queue limit reached, "
+			    "segment dropped\n", s, __func__);
+			free(s, M_TCPLOG);
+		}
 		return (0);
 	}
 
 	/*
 	 * Allocate a new queue entry. If we can't, or hit the zone limit
 	 * just drop the pkt.
+	 *
+	 * Use a temporary structure on the stack for the missing segment
+	 * when the zone is exhausted. Otherwise we may get stuck.
 	 */
 	te = uma_zalloc(V_tcp_reass_zone, M_NOWAIT);
-	if (te == NULL) {
+	if (te == NULL && th->th_seq != tp->rcv_nxt) {
 		TCPSTAT_INC(tcps_rcvmemdrop);
 		m_freem(m);
 		*tlenp = 0;
+		if ((s = tcp_log_addrs(&tp->t_inpcb->inp_inc, th, NULL, NULL))) {
+			log(LOG_DEBUG, "%s; %s: global zone limit reached, "
+			    "segment dropped\n", s, __func__);
+			free(s, M_TCPLOG);
+		}
 		return (0);
+	} else if (th->th_seq == tp->rcv_nxt) {
+		bzero(&tqs, sizeof(struct tseg_qent));
+		te = &tqs;
+		if ((s = tcp_log_addrs(&tp->t_inpcb->inp_inc, th, NULL, NULL))) {
+			log(LOG_DEBUG, "%s; %s: global zone limit reached, "
+			    "using stack for missing segment\n", s, __func__);
+			free(s, M_TCPLOG);
+		}
 	}
 	tp->t_segqlen++;
 
@@ -304,6 +327,8 @@ tcp_reass(struct tcpcb *tp, struct tcphd
 	if (p == NULL) {
 		LIST_INSERT_HEAD(&tp->t_segq, te, tqe_q);
 	} else {
+		KASSERT(te != &tqs, ("%s: temporary stack based entry not "
+		    "first element in queue", __func__));
 		LIST_INSERT_AFTER(p, te, tqe_q);
 	}
 
@@ -327,7 +352,8 @@ present:
 			m_freem(q->tqe_m);
 		else
 			sbappendstream_locked(&so->so_rcv, q->tqe_m);
-		uma_zfree(V_tcp_reass_zone, q);
+		if (q != &tqs)
+			uma_zfree(V_tcp_reass_zone, q);
 		tp->t_segqlen--;
 		q = nq;
 	} while (q && q->tqe_th->th_seq == tp->rcv_nxt);
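
The on-stack fallback described in the log can be sketched outside the kernel as follows. This is a minimal userland illustration, not the committed code: `struct seg_entry`, `struct zone`, `zone_alloc()`, `zone_free()`, and `accept_segment()` are hypothetical stand-ins for `struct tseg_qent`, the UMA zone, and `tcp_reass()`. One deliberate difference, stated as an assumption about intent: in the diff as posted, the `else if (th->th_seq == tp->rcv_nxt)` branch also appears to be taken when `uma_zalloc()` succeeded, so an allocated entry would be replaced by `&tqs`. The sketch nests the stack fallback under the allocation failure instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct tseg_qent; not the kernel layout. */
struct seg_entry {
	uint32_t seq;
	uint32_t len;
};

/* Hypothetical bounded allocator standing in for the UMA zone limit. */
struct zone {
	int in_use;
	int limit;
};

static struct seg_entry *
zone_alloc(struct zone *z)
{
	struct seg_entry *e;

	if (z->in_use >= z->limit)
		return (NULL);		/* zone limit reached */
	e = calloc(1, sizeof(*e));
	if (e != NULL)
		z->in_use++;
	return (e);
}

static void
zone_free(struct zone *z, struct seg_entry *e)
{
	free(e);
	z->in_use--;
}

enum seg_result { SEG_DROPPED, SEG_QUEUED, SEG_DELIVERED };

/*
 * Handle one segment. An out-of-order segment needs a heap entry and is
 * dropped when the zone is exhausted. The in-order ("missing") segment
 * falls back to an on-stack entry, which is safe only because it is
 * consumed before the function returns.
 */
static enum seg_result
accept_segment(struct zone *z, uint32_t *rcv_nxt, uint32_t seq,
    uint32_t len, struct seg_entry **queued)
{
	struct seg_entry stack_entry;
	struct seg_entry *te = zone_alloc(z);

	if (te == NULL) {
		if (seq != *rcv_nxt)
			return (SEG_DROPPED);
		/* Fallback is nested here so a heap entry is never replaced. */
		memset(&stack_entry, 0, sizeof(stack_entry));
		te = &stack_entry;
	}
	te->seq = seq;
	te->len = len;

	if (te->seq != *rcv_nxt) {
		/* A stack entry must never outlive this call. */
		assert(te != &stack_entry);
		*queued = te;		/* caller owns the queued entry */
		return (SEG_QUEUED);
	}

	/* Deliver immediately; free only entries that came from the zone. */
	*rcv_nxt += te->len;
	if (te != &stack_entry)
		zone_free(z, te);
	return (SEG_DELIVERED);
}
```

With a zone limit of one entry, a first out-of-order segment is queued, a second one is dropped, and the in-order segment is still delivered through the stack entry, so the receive sequence keeps advancing even though the zone stays full.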