From owner-freebsd-net@FreeBSD.ORG Sun Apr 7 21:28:56 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 418D9445 for ; Sun, 7 Apr 2013 21:28:56 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id EA34BD7 for ; Sun, 7 Apr 2013 21:28:55 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqMEAJbjYVGDaFvO/2dsb2JhbABRFoMmgyq9eoEYdIIfAQEBAwEBAQEgBCcgCxkCDgoCAg0ZAiMGAQkmBggHBAEcBIdhAwkGDJBKmwSHaQ2JXYEjiy2BERB+ATMHgi6BEwOKfIk5XoFhgSGIVYF9hRuDJyAygQU1 X-IronPort-AV: E=Sophos;i="4.87,427,1363147200"; d="scan'208";a="24822315" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 07 Apr 2013 17:28:49 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 1796FB3F4F; Sun, 7 Apr 2013 17:28:49 -0400 (EDT) Date: Sun, 7 Apr 2013 17:28:49 -0400 (EDT) From: Rick Macklem To: Juan Mojica Message-ID: <889005941.599737.1365370129081.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: Subject: Re: panic in tcp_do_segment() MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - IE8 (Win)/6.0.10_GA_2692) Cc: FreeBSD Net , Matt Miller X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 07 Apr 2013 21:28:56 -0000 Juan Mojica wrote: > Agree with Matt. >=20 > Whenever there is an UNLOCK/LOCK like is present in soclose(), there > is a window to allow something through.=C2=A0 Unsetting SO_ACCEPTCONN was > put in place because the LOCK/UNLOCK in soclose let a new socket to be > added to the so_incomp list causing a different ASSERT to be hit - and > memory to be leaked. >=20 > As far as I can tell, because of the upcall performed in soabort(), we > can't just hold the ACCEPT lock all the way through the close. >=20 >=20 > The simplest thing to do in this case seemed to be to just drop the > TCP segment - the connection is being closed anyway.=C2=A0 Or like Matt > said, having someone look at the lock logic and see if there is > something there that can be exploited to prevent this would also help. >=20 >=20 > -Juan >=20 >=20 >=20 >=20 > On Fri, Apr 5, 2013 at 7:09 AM, Matt Miller < matt@matthewjmiller.net > > wrote: >=20 >=20 >=20 >=20 > Hey Rick, >=20 > I believe Juan and I have root caused this crash recently.=C2=A0 The > t_state =3D 0x1, TCPS_LISTEN, in the link provided at the time of the > assertion. >=20 > In tcp_input(), if we're in TCPS_LISTEN, SO_ACCEPTCONN should be set > on the socket and we should never enter tcp_do_segment() for this > state.=C2=A0 I think if you look in your corefile, you'll see the socket > *doesn't* have this flag set in your case. >=20 Thanks guys. I had missed the *tp near the end and mistakenly was thinking it was the client socket. This sounds like a good explanation to me. Hopefully, one of the tcp stack guys will pick this up and commit a patch, rick. >=20 > 1043=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* > 1044=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * When the soc= ket is accepting connections (the INPCB is > in LISTEN > 1045=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * state) we lo= ok into the SYN cache if this is a new > connection > 1046=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * attempt or t= he completion of a previous one.=C2=A0 Because > listen > 1047=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * sockets are = never in TCPS_ESTABLISHED, the V_tcbinfo > lock will be > 1048=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * held in this= case. > 1049=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ > 1050=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (so->so_options &= SO_ACCEPTCONN) { > 1051=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 struct in_conninfo inc; > 1052 > 1053=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 KASSERT(tp->t_state =3D=3D TCPS_LISTEN, ("%s: s= o > accepting but " > 1054=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 "tp not listening", __f= unc__)); > ... > 1356=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 syncache_add(&inc, &to, th, inp, &so, m, NULL, > NULL); > 1357=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 /* > 1358=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * Entry added to syncache and mbuf consum= ed. > 1359=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * Everything already unlocked by syncache= _add(). > 1360=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ > 1361=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 INP_INFO_UNLOCK_ASSERT(&V_tcbinfo); > 1362=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 return; > 1363=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } > ... > 1384=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* > 1385=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * Segment belo= ngs to a connection in SYN_SENT, > ESTABLISHED or later > 1386=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * state.=C2=A0= tcp_do_segment() always consumes the mbuf > chain, unlocks > 1387=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * the inpcb, a= nd unlocks pcbinfo. > 1388=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ > 1389=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 tcp_do_segment(m, th= , so, tp, drop_hdrlen, tlen, iptos, > ti_locked); >=20 >=20 > I think this has to do with this patch in soclose() where > SO_ACCEPTCONN is being turned off in soclose().=C2=A0 I suspect if you lo= ok > at the other threads in your corefile, you'll see one at this point in > soclose() operating on the same socket as the one in the > tcp_do_segment() thread. >=20 >=20 > http://svnweb.freebsd.org/base?view=3Drevision&revision=3D243627 >=20 > =C2=A0817=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* > =C2=A0818=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * Prevent new additions to the acce= pt queues due > =C2=A0819=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * to ACCEPT_LOCK races while we are= draining > them. > =C2=A0820=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ > =C2=A0821=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 so->so_options &=3D ~SO_ACCEPTCONN; > =C2=A0822=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 while ((sp =3D TAILQ_FIRST(&so->so_incomp= )) !=3D > NULL) { > =C2=A0823=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 TAILQ_REMOVE(&so->so_incomp, sp, > so_list); > =C2=A0824=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 so->so_incqlen--; > =C2=A0825=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 sp->so_qstate &=3D ~SQ_INCOMP; > =C2=A0826=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 sp->so_head =3D NULL; > =C2=A0827=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 ACCEPT_UNLOCK(); > =C2=A0828=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 soabort(sp); > =C2=A0829=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 ACCEPT_LOCK(); > =C2=A0830=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } >=20 >=20 > Juan had evaluated this code path and it seemed safe to just drop the > packet in this case: >=20 >=20 > + =C2=A0=C2=A0=C2=A0 /* > + =C2=A0=C2=A0=C2=A0 =C2=A0* In closing down the socket, the SO_ACCEPTCON= N flag is removed > to > + =C2=A0=C2=A0=C2=A0 =C2=A0* prevent new connections from being establish= ed.=C2=A0 This means > that > + =C2=A0=C2=A0=C2=A0 =C2=A0* any frames in that were in the midst of bein= g processed could > + =C2=A0=C2=A0=C2=A0 =C2=A0* make it here.=C2=A0 Need to just drop the fr= ame. > + =C2=A0=C2=A0=C2=A0 =C2=A0*/ > + =C2=A0=C2=A0=C2=A0 if (TCPS_LISTEN =3D=3D tp->t_state) { > + =C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 TCPSTAT_INC(tcps_rcvwhileclosing)= ; > + =C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 goto drop; > + =C2=A0=C2=A0=C2=A0 } > =C2=A0 =C2=A0=C2=A0=C2=A0 KASSERT(tp->t_state > TCPS_LISTEN, ("%s: TCPS_L= ISTEN", > =C2=A0 =C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 __func__)); >=20 >=20 > Or, if there's someone more familiar with the locking in these paths, > they may be able to come up with a way to restructure the locks and > logic to close this window. >=20 >=20 >=20 > -Matt >=20 >=20 >=20 >=20 >=20 >=20 >=20 >=20 >=20 > On Thu, Apr 4, 2013 at 6:33 PM, Rick Macklem < rmacklem@uoguelph.ca > > wrote: >=20 >=20 > Hi, >=20 > When pho@ was doing some NFS testing, he got the > following crash, which I can't figure out. (As far > as I can see, INP_WLOCK() is always held when > tp->t_state =3D TCPS_CLOSED and it is held from before > the test for TCPS_CLOSED in tcp_input() up until > the tcp_do_segment() call. As such, I don't see how > tp->t_state can be TCPS_CLOSED, but that seems to > be what causes the panic?) >=20 > The "umount -f" will result in: > =C2=A0 soshutdown(so, SHUT_WR); > =C2=A0 soclose(so); > being done by the krpc on the socket. >=20 > Anyone have any ideas on this? > pho@ wrote: > > I continued running the NFS tests and got this "panic: > > tcp_do_segment: > > TCPS_LISTEN". It's the second time I get this panic with the same > > test > > scenario, so it seems to be reproducible. The scenario is "umount > > -f" > > of a mount point that is very active. > > > > http://people.freebsd.org/~pho/stress/log/kostik555.txt >=20 > Thanks in advance for any help, rick > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to " freebsd-net-unsubscribe@freebsd.org > " >=20 >=20 >=20 >=20 > -- > Juan Mojica > Email: jmojica@gmail.com