From owner-freebsd-net@FreeBSD.ORG Mon Apr 8 18:57:59 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 3B4EF60D; Mon, 8 Apr 2013 18:57:59 +0000 (UTC) (envelope-from mattmiller1@gmail.com) Received: from mail-lb0-f174.google.com (mail-lb0-f174.google.com [209.85.217.174]) by mx1.freebsd.org (Postfix) with ESMTP id 718151AD; Mon, 8 Apr 2013 18:57:57 +0000 (UTC) Received: by mail-lb0-f174.google.com with SMTP id s10so6086718lbi.19 for ; Mon, 08 Apr 2013 11:57:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=q4JFtqxXZRPftybh+h3id/kWWTI87V+s4XAdenAJfZI=; b=Hx9aj9kwkhPBKp2MMo6vJNZD5cr9GYaRWGP/FSHj1ZuDeaPWTxizibSIPIrWY2yTKk V+ZFMqFCsopqmzI8M+yX/2+fF8nWA9sjlm4tpjQj4N0RXKrTgQNOgRKF2C0nJkGe9qpC DjBlnwlVwtWnuEYFnpapHZMaAjNg06oe+MOGoXaQNT6gZTCan/OZUer2v8BFpipQ0+C9 vyex1R2uscnA1nYrWG5j0D5l6gZc5Zm2Rd/h6wzjjlbAiwTs8J+DoV3pGUQhcpCwuyG3 dy3kuK4QBBMw5M3xCIOpHTNADMOFWZ741iNjN9sNRGNWdODAxTmdiYBNWu1Vb59hJ2we w7Zg== X-Received: by 10.152.110.74 with SMTP id hy10mr12252235lab.37.1365447476203; Mon, 08 Apr 2013 11:57:56 -0700 (PDT) MIME-Version: 1.0 Sender: mattmiller1@gmail.com Received: by 10.112.101.166 with HTTP; Mon, 8 Apr 2013 11:57:16 -0700 (PDT) In-Reply-To: <5162B474.6060808@freebsd.org> References: <1043692819.529554.1365114790772.JavaMail.root@erie.cs.uoguelph.ca> <5162B474.6060808@freebsd.org> From: Matt Miller Date: Mon, 8 Apr 2013 14:57:16 -0400 X-Google-Sender-Auth: 12sfr5BB-3C4cDVQ3tcMWCZgkmw Message-ID: Subject: Re: panic in tcp_do_segment() To: Andre Oppermann Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Net , Juan Mojica , Rick Macklem X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Apr 2013 18:57:59 -0000 Hey Andre, Looks OK to me. Note that tcps_rcvdwhileclosing is a new stat Juan added (I had omitted the rest of that diff b/c adding a new stat wasn't that interesting to the discussion). If you want that stat added, we can prepare a patch with the rest of the changes (or, I'm sure you know already know what else needs to be done for that). Just let us know if you want us to prepare the rest of the patch for that new stat or not. Thanks, Matt On Mon, Apr 8, 2013 at 8:13 AM, Andre Oppermann wrote: > On 05.04.2013 13:09, Matt Miller wrote: > >> Hey Rick, >> >> I believe Juan and I have root caused this crash recently. The t_state = >> 0x1, TCPS_LISTEN, in the link provided at the time of the assertion. >> >> In tcp_input(), if we're in TCPS_LISTEN, SO_ACCEPTCONN should be set on >> the >> socket and we should never enter tcp_do_segment() for this state. I think >> if you look in your corefile, you'll see the socket *doesn't* have this >> flag set in your case. >> >> 1043 /* >> 1044 * When the socket is accepting connections (the INPCB is in >> LISTEN >> 1045 * state) we look into the SYN cache if this is a new >> connection >> 1046 * attempt or the completion of a previous one. Because >> listen >> 1047 * sockets are never in TCPS_ESTABLISHED, the V_tcbinfo lock >> will be >> 1048 * held in this case. >> 1049 */ >> 1050 if (so->so_options & SO_ACCEPTCONN) { >> 1051 struct in_conninfo inc; >> 1052 >> 1053 KASSERT(tp->t_state == TCPS_LISTEN, ("%s: so >> accepting >> but " >> 1054 "tp not listening", __func__)); >> ... >> 1356 syncache_add(&inc, &to, th, inp, &so, m, NULL, NULL); >> 1357 /* >> 1358 * Entry added to syncache and mbuf consumed. >> 1359 * Everything already unlocked by syncache_add(). >> 1360 */ >> 1361 INP_INFO_UNLOCK_ASSERT(&V_**tcbinfo); >> 1362 return; >> 1363 } >> ... >> 1384 /* >> 1385 * Segment belongs to a connection in SYN_SENT, ESTABLISHED >> or >> later >> 1386 * state. tcp_do_segment() always consumes the mbuf chain, >> unlocks >> 1387 * the inpcb, and unlocks pcbinfo. >> 1388 */ >> 1389 tcp_do_segment(m, th, so, tp, drop_hdrlen, tlen, iptos, >> ti_locked); >> >> I think this has to do with this patch in soclose() where SO_ACCEPTCONN is >> being turned off in soclose(). I suspect if you look at the other threads >> in your corefile, you'll see one at this point in soclose() operating on >> the same socket as the one in the tcp_do_segment() thread. >> >> http://svnweb.freebsd.org/**base?view=revision&revision=**243627 >> >> 817 /* >> 818 * Prevent new additions to the accept queues due >> 819 * to ACCEPT_LOCK races while we are draining them. >> 820 */ >> 821 so->so_options &= ~SO_ACCEPTCONN; >> 822 while ((sp = TAILQ_FIRST(&so->so_incomp)) != NULL) { >> 823 TAILQ_REMOVE(&so->so_incomp, sp, so_list); >> 824 so->so_incqlen--; >> 825 sp->so_qstate &= ~SQ_INCOMP; >> 826 sp->so_head = NULL; >> 827 ACCEPT_UNLOCK(); >> 828 soabort(sp); >> 829 ACCEPT_LOCK(); >> 830 } >> >> Juan had evaluated this code path and it seemed safe to just drop the >> packet in this case: >> >> + /* >> + * In closing down the socket, the SO_ACCEPTCONN flag is removed to >> + * prevent new connections from being established. This means that >> + * any frames in that were in the midst of being processed could >> + * make it here. Need to just drop the frame. >> + */ >> + if (TCPS_LISTEN == tp->t_state) { >> + TCPSTAT_INC(tcps_**rcvwhileclosing); >> + goto drop; >> + } >> KASSERT(tp->t_state > TCPS_LISTEN, ("%s: TCPS_LISTEN", >> __func__)); >> >> Or, if there's someone more familiar with the locking in these paths, they >> may be able to come up with a way to restructure the locks and logic to >> close this window. >> > > Matt, Juan, > > excellent analysis. I don't see a better approach to handle this > under the current ACCEPT_LOCK model. > > Compared to your patch I'd like to handle this race earlier before > we hit tcp_do_segment(). > > Could you please review the attached patch which handles it right > after the SO_ACCEPTCONN / syncache check? > > -- > Andre > > Index: netinet/tcp_input.c > ==============================**==============================**======= > --- netinet/tcp_input.c (revision 249253) > +++ netinet/tcp_input.c (working copy) > @@ -1351,6 +1351,16 @@ > */ > INP_INFO_UNLOCK_ASSERT(&V_**tcbinfo); > return; > + } else if (tp->t_state == TCPS_LISTEN) { > + /* > + * When a listen socket is torn down the SO_ACCEPTCONN > + * flag is removed first while connections are drained > + * from the accept queue in a unlock/lock cycle of the > + * ACCEPT_LOCK, opening a race condition allowing a SYN > + * attempt go through unhandled. > + */ > + TCPSTAT_INC(tcps_**rcvdwhileclosing); > + goto drop; > } > > #ifdef TCP_SIGNATURE >