Date: Thu, 07 Nov 2013 14:55:22 +0100
From: "Julien Charbon" <jcharbon@verisign.com>
To: freebsd-net@freebsd.org
Subject: Re: TCP stack lock contention with short-lived connections
Message-ID: <op.w56mamc0ak5tgc@dul1rjacobso-l3.vcorp.ad.vrsn.com>
References: <op.w51mxed6ak5tgc@fri2jcharbon-m1.local>

  Hi list,

On Mon, 04 Nov 2013 22:21:04 +0100, Julien Charbon <jcharbon@verisign.com> wrote:
>   just a follow-up of vBSDCon discussions about FreeBSD TCP performances
> with short-lived connections.  In summary:
<snip>
>
>   I have put technical and how-to-repeat details in below PR:
>
> kern/183659: TCP stack lock contention with short-lived connections
> http://www.freebsd.org/cgi/query-pr.cgi?pr=183659
>
>   We are currently working on this performance improvement effort;  it
> will impact only the TCP locking strategy not the TCP stack logic
> itself.  We will share on freebsd-net the patches we made for reviewing
> and improvement propositions;  anyway this change might also require
> enough eyeballs to avoid tricky race conditions introduction in TCP
> stack.

  Just a follow-up: we are currently removing the TCP INP_INFO lock from
places where it is not actually required, in order to mitigate the lock
contention.  It seems to be a good first step in this effort: small
changes, easy to review, low risk (and small gain... right).

  Below is a first patch that removes the INP_INFO lock from
tcp_usr_accept().  This change simply follows the advice given in the
corresponding code comment:

  "A better fix would prevent the socket from being placed in the listen
  queue until all fields are fully initialized."

  For more technical details, check the comment in the related change
below:

http://svnweb.freebsd.org/base?view=revision&revision=175612

  With this patch applied we see no regressions and a performance
improvement of ~5%, i.e., with a vanilla 9.2 kernel: 52k TCP Queries Per
Second; with 9.2 + the patch below: 55k TCP QPS.  Not huge indeed, but
still an improvement.

  P.S.: Funny enough, it seems that the same change has already been
proposed in the past:

http://lists.freebsd.org/pipermail/freebsd-net/2013-January/034261.html

--
Julien

From: Julien Charbon <jcharbon@verisign.com>
Subject: [PATCH] Add new socket in listen queue only when fully initialized

---
 sys/netinet/tcp_syncache.c | 4 +++-
 sys/netinet/tcp_usrreq.c   | 9 ---------
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/sys/netinet/tcp_syncache.c b/sys/netinet/tcp_syncache.c
index af1651a..eb73356 100644
--- a/sys/netinet/tcp_syncache.c
+++ b/sys/netinet/tcp_syncache.c
@@ -660,7 +660,7 @@ syncache_socket(struct syncache *sc, struct socket *lso, struct mbuf *m)
 	 * connection when the SYN arrived. If we can't create
 	 * the connection, abort it.
 	 */
-	so = sonewconn(lso, SS_ISCONNECTED);
+	so = sonewconn(lso, 0);
 	if (so == NULL) {
 		/*
 		 * Drop the connection; we will either send a RST or
@@ -890,6 +890,8 @@ syncache_socket(struct syncache *sc, struct socket *lso, struct mbuf *m)
 	INP_WUNLOCK(inp);
+	soisconnected(so);
+
 	TCPSTAT_INC(tcps_accepts);
 	return (so);
diff --git a/sys/netinet/tcp_usrreq.c b/sys/netinet/tcp_usrreq.c
index b83f34a..566cc34 100644
--- a/sys/netinet/tcp_usrreq.c
+++ b/sys/netinet/tcp_usrreq.c
@@ -609,13 +609,6 @@ out:
 /*
  * Accept a connection. Essentially all the work is done at higher levels;
  * just return the address of the peer, storing through addr.
- *
- * The rationale for acquiring the tcbinfo lock here is somewhat complicated,
- * and is described in detail in the commit log entry for r175612. Acquiring
- * it delays an accept(2) racing with sonewconn(), which inserts the socket
- * before the inpcb address/port fields are initialized. A better fix would
- * prevent the socket from being placed in the listen queue until all fields
- * are fully initialized.
  */
 static int
 tcp_usr_accept(struct socket *so, struct sockaddr **nam)
@@ -632,7 +625,6 @@ tcp_usr_accept(struct socket *so, struct sockaddr **nam)
 	inp = sotoinpcb(so);
 	KASSERT(inp != NULL, ("tcp_usr_accept: inp == NULL"));
-	INP_INFO_RLOCK(&V_tcbinfo);
 	INP_WLOCK(inp);
 	if (inp->inp_flags & (INP_TIMEWAIT | INP_DROPPED)) {
 		error = ECONNABORTED;
@@ -652,7 +644,6 @@ tcp_usr_accept(struct socket *so, struct sockaddr **nam)
 out:
 	TCPDEBUG2(PRU_ACCEPT);
 	INP_WUNLOCK(inp);
-	INP_INFO_RUNLOCK(&V_tcbinfo);
 	if (error == 0)
 		*nam = in_sockaddr(port, &addr);
 	return error;
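
  For readers who want the idea of the patch without the kernel context,
here is a minimal user-space sketch of the same pattern: fill in every
field of a new connection first, and only then make it visible on the
listen queue, so that the accepting side never needs a second, global
lock to wait for initialization.  It is only an analogy under stated
assumptions: struct conn, struct listen_queue, lq_init(), conn_publish()
and conn_accept() are hypothetical names made up for this illustration
(they are not kernel APIs), and a plain pthread mutex stands in for the
real socket and inpcb locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; none of these names exist in the kernel. */
struct conn {
	uint32_t	 peer_addr;	/* plays the role of the inpcb address field */
	uint16_t	 peer_port;	/* plays the role of the inpcb port field */
	struct conn	*next;
};

struct listen_queue {
	pthread_mutex_t	 lock;		/* protects head and tail only */
	struct conn	*head;
	struct conn	*tail;
};

static void
lq_init(struct listen_queue *lq)
{
	pthread_mutex_init(&lq->lock, NULL);
	lq->head = lq->tail = NULL;
}

/*
 * Initialize every field first, then publish.  Because the enqueue is the
 * last step, a concurrent conn_accept() can never observe a half-built
 * connection.
 */
static struct conn *
conn_publish(struct listen_queue *lq, uint32_t addr, uint16_t port)
{
	struct conn *c = calloc(1, sizeof(*c));

	if (c == NULL)
		return (NULL);
	c->peer_addr = addr;
	c->peer_port = port;

	pthread_mutex_lock(&lq->lock);
	if (lq->tail != NULL)
		lq->tail->next = c;
	else
		lq->head = c;
	lq->tail = c;
	pthread_mutex_unlock(&lq->lock);
	return (c);
}

/*
 * Dequeue the oldest connection, or return NULL if the queue is empty.
 * Only the queue lock is taken; no extra global lock is needed to wait
 * for the address/port fields, since they were set before publication.
 */
static struct conn *
conn_accept(struct listen_queue *lq)
{
	struct conn *c;

	pthread_mutex_lock(&lq->lock);
	c = lq->head;
	if (c != NULL) {
		lq->head = c->next;
		if (lq->head == NULL)
			lq->tail = NULL;
		c->next = NULL;
	}
	pthread_mutex_unlock(&lq->lock);
	return (c);
}
```

  In this sketch, a caller would run lq_init() once, let a producer
thread call conn_publish() for each completed handshake, and let consumer
threads call conn_accept() and read peer_addr and peer_port right away.
The ordering is the whole point: if the enqueue happened before the field
assignments, the consumer would need some additional lock to delay it,
which is roughly the role INP_INFO played in tcp_usr_accept() before the
patch.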