From owner-freebsd-net@FreeBSD.ORG  Tue Apr  9 18:44:48 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id CD965D99
 for <freebsd-net@freebsd.org>; Tue,  9 Apr 2013 18:44:48 +0000 (UTC)
 (envelope-from pho@holm.cc)
Received: from relay00.pair.com (relay00.pair.com [209.68.5.9])
 by mx1.freebsd.org (Postfix) with SMTP id 7D40B1EA
 for <freebsd-net@freebsd.org>; Tue,  9 Apr 2013 18:44:48 +0000 (UTC)
Received: (qmail 73191 invoked from network); 9 Apr 2013 18:44:40 -0000
Received: from 87.58.146.155 (HELO x2.osted.lan) (87.58.146.155)
 by relay00.pair.com with SMTP; 9 Apr 2013 18:44:40 -0000
X-pair-Authenticated: 87.58.146.155
Received: from x2.osted.lan (localhost [127.0.0.1])
 by x2.osted.lan (8.14.5/8.14.5) with ESMTP id r39Iie5b062980;
 Tue, 9 Apr 2013 20:44:40 +0200 (CEST)
 (envelope-from pho@x2.osted.lan)
Received: (from pho@localhost)
 by x2.osted.lan (8.14.5/8.14.5/Submit) id r39Iid2W062979;
 Tue, 9 Apr 2013 20:44:39 +0200 (CEST) (envelope-from pho)
Date: Tue, 9 Apr 2013 20:44:39 +0200
From: Peter Holm <peter@holm.cc>
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: panic in tcp_do_segment()
Message-ID: <20130409184439.GA62925@x2.osted.lan>
References: <1043692819.529554.1365114790772.JavaMail.root@erie.cs.uoguelph.ca>
 <CAFc6gu9bmJQ-Uxv7PzspSLDyFz6zE3fV=u7HEumCh6f=YQLU2Q@mail.gmail.com>
 <5162B474.6060808@freebsd.org>
 <20130409081652.GA50498@x2.osted.lan>
 <5163D2D2.2090407@freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5163D2D2.2090407@freebsd.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: FreeBSD Net <freebsd-net@freebsd.org>, Juan Mojica <jmojica@gmail.com>,
 Matt Miller <matt@matthewjmiller.net>, Rick Macklem <rmacklem@uoguelph.ca>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Apr 2013 18:44:48 -0000

On Tue, Apr 09, 2013 at 10:35:30AM +0200, Andre Oppermann wrote:
> On 09.04.2013 10:16, Peter Holm wrote:
> > On Mon, Apr 08, 2013 at 02:13:40PM +0200, Andre Oppermann wrote:
> >> On 05.04.2013 13:09, Matt Miller wrote:
> >>> Hey Rick,
> >>>
> >>> I believe Juan and I have root caused this crash recently.  The t_state =
> >>> 0x1, TCPS_LISTEN, in the link provided at the time of the assertion.
> >>>
> >>> In tcp_input(), if we're in TCPS_LISTEN, SO_ACCEPTCONN should be set on the
> >>> socket and we should never enter tcp_do_segment() for this state.  I think
> >>> if you look in your corefile, you'll see the socket *doesn't* have this
> >>> flag set in your case.
> >>>
> >>> 1043         /*
> >>> 1044          * When the socket is accepting connections (the INPCB is in
> >>> LISTEN
> >>> 1045          * state) we look into the SYN cache if this is a new
> >>> connection
> >>> 1046          * attempt or the completion of a previous one.  Because listen
> >>> 1047          * sockets are never in TCPS_ESTABLISHED, the V_tcbinfo lock
> >>> will be
> >>> 1048          * held in this case.
> >>> 1049          */
> >>> 1050         if (so->so_options & SO_ACCEPTCONN) {
> >>> 1051                 struct in_conninfo inc;
> >>> 1052
> >>> 1053                 KASSERT(tp->t_state == TCPS_LISTEN, ("%s: so accepting
> >>> but "
> >>> 1054                     "tp not listening", __func__));
> >>> ...
> >>> 1356                 syncache_add(&inc, &to, th, inp, &so, m, NULL, NULL);
> >>> 1357                 /*
> >>> 1358                  * Entry added to syncache and mbuf consumed.
> >>> 1359                  * Everything already unlocked by syncache_add().
> >>> 1360                  */
> >>> 1361                 INP_INFO_UNLOCK_ASSERT(&V_tcbinfo);
> >>> 1362                 return;
> >>> 1363         }
> >>> ...
> >>> 1384         /*
> >>> 1385          * Segment belongs to a connection in SYN_SENT, ESTABLISHED or
> >>> later
> >>> 1386          * state.  tcp_do_segment() always consumes the mbuf chain,
> >>> unlocks
> >>> 1387          * the inpcb, and unlocks pcbinfo.
> >>> 1388          */
> >>> 1389         tcp_do_segment(m, th, so, tp, drop_hdrlen, tlen, iptos,
> >>> ti_locked);
> >>>
> >>> I think this has to do with this patch in soclose() where SO_ACCEPTCONN is
> >>> being turned off in soclose().  I suspect if you look at the other threads
> >>> in your corefile, you'll see one at this point in soclose() operating on
> >>> the same socket as the one in the tcp_do_segment() thread.
> >>>
> >>> http://svnweb.freebsd.org/base?view=revision&revision=243627
> >>>
> >>>    817                 /*
> >>>    818                  * Prevent new additions to the accept queues due
> >>>    819                  * to ACCEPT_LOCK races while we are draining them.
> >>>    820                  */
> >>>    821                 so->so_options &= ~SO_ACCEPTCONN;
> >>>    822                 while ((sp = TAILQ_FIRST(&so->so_incomp)) != NULL) {
> >>>    823                         TAILQ_REMOVE(&so->so_incomp, sp, so_list);
> >>>    824                         so->so_incqlen--;
> >>>    825                         sp->so_qstate &= ~SQ_INCOMP;
> >>>    826                         sp->so_head = NULL;
> >>>    827                         ACCEPT_UNLOCK();
> >>>    828                         soabort(sp);
> >>>    829                         ACCEPT_LOCK();
> >>>    830                 }
> >>>
> >>> Juan had evaluated this code path and it seemed safe to just drop the
> >>> packet in this case:
> >>>
> >>> +     /*
> >>> +      * In closing down the socket, the SO_ACCEPTCONN flag is removed to
> >>> +      * prevent new connections from being established.  This means that
> >>> +      * any frames in that were in the midst of being processed could
> >>> +      * make it here.  Need to just drop the frame.
> >>> +      */
> >>> +     if (TCPS_LISTEN == tp->t_state) {
> >>> +         TCPSTAT_INC(tcps_rcvwhileclosing);
> >>> +         goto drop;
> >>> +     }
> >>>         KASSERT(tp->t_state > TCPS_LISTEN, ("%s: TCPS_LISTEN",
> >>>             __func__));
> >>>
> >>> Or, if there's someone more familiar with the locking in these paths, they
> >>> may be able to come up with a way to restructure the locks and logic to
> >>> close this window.
> >>
> >> Matt, Juan,
> >>
> >> excellent analysis.  I don't see a better approach to handle this
> >> under the current ACCEPT_LOCK model.
> >>
> >> Compared to your patch I'd like to handle this race earlier before
> >> we hit tcp_do_segment().
> >>
> >> Could you please review the attached patch which handles it right
> >> after the SO_ACCEPTCONN / syncache check?
> >>
> >> --
> >> Andre
> >>
> >> Index: netinet/tcp_input.c
> >> ===================================================================
> >> --- netinet/tcp_input.c	(revision 249253)
> >> +++ netinet/tcp_input.c	(working copy)
> >> @@ -1351,6 +1351,16 @@
> >>    		 */
> >>    		INP_INFO_UNLOCK_ASSERT(&V_tcbinfo);
> >>    		return;
> >> +	} else if (tp->t_state == TCPS_LISTEN) {
> >> +		/*
> >> +		 * When a listen socket is torn down the SO_ACCEPTCONN
> >> +		 * flag is removed first while connections are drained
> >> +		 * from the accept queue in a unlock/lock cycle of the
> >> +		 * ACCEPT_LOCK, opening a race condition allowing a SYN
> >> +		 * attempt go through unhandled.
> >> +		 */
> >> +		TCPSTAT_INC(tcps_rcvdwhileclosing);
> >> +		goto drop;
> >>    	}
> >>
> >>    #ifdef TCP_SIGNATURE
> >
> > I was able to reproduce the original "panic: tcp_do_segment:
> > TCPS_LISTEN" with ease; see
> > http://people.freebsd.org/~pho/stress/log/tcp.txt.
> >
> > With your patch (minus the TCPSTAT_INC) I got this "panic: Lock (rw)
> > tcp locked @ netinet/tcp_input.c:1432."
> >
> > http://people.freebsd.org/~pho/stress/log/tcp2.txt
> 
> Please replace the 'goto drop' with 'goto dropunlock' to fix the panic.
> 

Yes, this seems to take care of the two problems reported. Tested for
5 hours without any repeats.

- Peter