From owner-freebsd-net@FreeBSD.ORG Sun Jul 22 17:55:15 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 63B72106564A for ; Sun, 22 Jul 2012 17:55:15 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 1C1878FC0A for ; Sun, 22 Jul 2012 17:55:15 +0000 (UTC) Received: from fledge.watson.org (fledge.watson.org [65.122.17.41]) by cyrus.watson.org (Postfix) with ESMTPS id 71B0E46B0D; Sun, 22 Jul 2012 13:55:14 -0400 (EDT) Date: Sun, 22 Jul 2012 18:55:14 +0100 (BST) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Vijay Singh In-Reply-To: Message-ID: References: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-net@freebsd.org Subject: Re: Use of V_tcbinfo lock in tcp_do_segment() X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Jul 2012 17:55:15 -0000 On Thu, 24 May 2012, Vijay Singh wrote: > Folks, I am trying to understand the need to hold the V_tcbinfo lock in part > of this function [code included below for reference]. > > At present this causes the socket upcall to be called with the V_tcbinfo > lock held, which I'd like to avoid. We release this lock right after this > section. > > Looking at the code, it seems the lock is needed if we were in FIN_WAIT_2 > and tcp_close() the connection. The lock also seems to be protecting > V_twq_2msl. > > Would it be an acceptable solution if we deferred calling socantrecvmore() > till after the lock can be dropped (after the swtich statement). > tcp_twstart() can be changed to return tp if the connections survives, or > NULL if it doesn't, much like what tcp_close() does. Also a new lock could > be added to protect the V_rwq_2msl queue. > > If this sounds acceptable, I can generate a patch against -CURRENT. I would > appreciate feedback. Hi Vijay: Sorry for responding to your query with such a delay. A quick glance at tcp_twstart() appears to confirm your description -- I read it as requiring the incpbinfo lock for several reasons: 1) Potential call to tcp_close(). 2) Potential call to tcp_tw_2msl_scan() which may in turn call tcp_twclose(), which requires the pcbinfo lock (see 5 below). 3) Direct use of V_twq_2msl. 4) Potential call to tcp_tw_2msl_scan(), which uses V_twq_2msl, and may also call tcp_tw_close (see 5). 5) In the tcp_twclose() callgraph subtree, the pcbinfo lock is used for several things, including tcp_tw_2msl_stop() and in_pcbfree(). The call to sofree() may recurse, under some circumstances, back into pru_detach(), and then into tcp_detach() which requires the lock. Especially with the pcbgroup reworking so that the pcbhash rather than pcbinfo lock is used in a number of places, there are increasing numbers of places where this sort of optimisation could be made. One caution to ponder: socantrecvmore() can trigger an upcall (per your comment) in the receive path. It used to be that upcalls did worrying things, such as call back down the stack -- I think we've largely eliminated those (NFS UDP was one such example, now fixed?) and therefore they are probably OK, but worth checking the so_upcall consumers to see what the implications might be, if any. Robert > > -vijay > > > relevant code from tcp_do_segment(): > > if (thflags & TH_FIN) { > if (TCPS_HAVERCVDFIN(tp->t_state) == 0) { > socantrcvmore(so); > /* > * If connection is half-synchronized > * (ie NEEDSYN flag on) then delay ACK, > * so it may be piggybacked when SYN is sent. > * Otherwise, since we received a FIN then no > * more input can be expected, send ACK now. > */ > if (tp->t_flags & TF_NEEDSYN) > tp->t_flags |= TF_DELACK; > else > tp->t_flags |= TF_ACKNOW; > tp->rcv_nxt++; > } > switch (tp->t_state) { > > /* > * In SYN_RECEIVED and ESTABLISHED STATES > * enter the CLOSE_WAIT state. > */ > case TCPS_SYN_RECEIVED: > tp->t_starttime = ticks; > /* FALLTHROUGH */ > case TCPS_ESTABLISHED: > tp->t_state = TCPS_CLOSE_WAIT; > break; > > /* > * If still in FIN_WAIT_1 STATE FIN has not been acked so > * enter the CLOSING state. > */ > case TCPS_FIN_WAIT_1: > tp->t_state = TCPS_CLOSING; > break; > > /* > * In FIN_WAIT_2 state enter the TIME_WAIT state, > * starting the time-wait timer, turning off the other > * standard timers. > */ > case TCPS_FIN_WAIT_2: > INP_INFO_WLOCK_ASSERT(&V_tcbinfo); > KASSERT(ti_locked == TI_WLOCKED, ("%s: dodata " > "TCP_FIN_WAIT_2 ti_locked: %d", __func__, > ti_locked)); > > tcp_twstart(tp); > INP_INFO_WUNLOCK(&V_tcbinfo); > return; > } > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >