Date:      Thu, 6 Mar 2014 13:57:43 -0800
From:      John-Mark Gurney <jmg@funkthat.com>
To:        Julien Charbon <jcharbon@verisign.com>
Cc:        freebsd-net@freebsd.org
Subject:   Re: TCP stack lock contention with short-lived connections
Message-ID:  <20140306215743.GB47921@funkthat.com>
In-Reply-To: <len481$sfv$2@ger.gmane.org>
References:  <op.w51mxed6ak5tgc@fri2jcharbon-m1.local> <op.w56mamc0ak5tgc@dul1rjacobso-l3.vcorp.ad.vrsn.com> <len481$sfv$2@ger.gmane.org>

Julien Charbon wrote this message on Thu, Feb 27, 2014 at 11:32 +0100:
>  Obviously, to mitigate this lock contention there are various solutions:
> 
>  - Introduce a new time-wait lock, as proposed in the attached patch
>  - Call tcp_tw_2msl_scan() more often under high workload
>  - Use INP_INFO_TRY_WLOCK() in tcp_tw_2msl_scan() to clean up time-wait 
> objects only when nobody else holds the INP_INFO lock
>  - Etc.
> 
>  The strategy is to prioritize packet reception over time-wait object 
> clean-up, because:
> 
>  - we hate dropping packets on reception when bandwidth is far from 
> full
>  - the maximum number of time-wait objects in use is configurable 
> (net.inet.tcp.maxtcptw)
>  - when time-wait object memory is exhausted, the current behavior 
> is already optimal:  the oldest time-wait object is recycled and 
> reused directly.
> 
>  We picked the time-wait lock approach because it fits our long-term 
> strategy of completely mitigating INP_INFO lock contention everywhere 
> in the TCP stack.
> 
>  Any thoughts on this particular behavior?
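
Just so I'm sure I follow the INP_INFO_TRY_WLOCK() option you list:
presumably something roughly like this?  (Untested sketch; I'm assuming
the stock tcp_tw_2msl_scan() loop and lock names.)

	static void
	tcp_tw_2msl_scan(void)
	{
		struct tcptw *tw;

		/*
		 * Only reap when nobody else wants the INP_INFO lock,
		 * so the receive path keeps priority...
		 */
		if (!INP_INFO_TRY_WLOCK(&V_tcbinfo))
			return;
		for (;;) {
			tw = TAILQ_FIRST(&V_twq_2msl);
			if (tw == NULL || (tw->tw_time - ticks) > 0)
				break;
			INP_WLOCK(tw->tw_inpcb);
			tcp_twclose(tw, 0);
		}
		INP_INFO_WUNLOCK(&V_tcbinfo);
	}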

One thing that I noticed is that you now lock/unlock the tw and inp locks a
lot...  Have you thought about grabbing the TW lock once, taking a reference
on some/all of the entries that need processing, and then processing them in
a second step?  If the bulk processing is still an issue, you could process
them in blocks of 50 or so; that way the number of lock/unlock cycles is
reduced...
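
Something like this is what I had in mind (untested sketch; the block
size, the tw_pcbref()/tw_pcbrele() helpers from your patch, and the
list/field names are just placeholders from memory):

	#define	TW_SCAN_BLOCK	50	/* entries reaped per lock hold */

	static void
	tcp_tw_2msl_scan_block(void)
	{
		struct tcptw *batch[TW_SCAN_BLOCK], *tw;
		int i, n;

		do {
			/* Phase 1: take the tw lock once, pull a block. */
			TW_WLOCK(V_tw_lock);
			for (n = 0; n < TW_SCAN_BLOCK; n++) {
				tw = TAILQ_FIRST(&V_twq_2msl);
				if (tw == NULL || (tw->tw_time - ticks) > 0)
					break;
				TAILQ_REMOVE(&V_twq_2msl, tw, tw_2msl);
				tw_pcbref(tw);	/* keep it alive unlocked */
				batch[n] = tw;
			}
			TW_WUNLOCK(V_tw_lock);

			/* Phase 2: tear the block down, one entry at a time. */
			for (i = 0; i < n; i++) {
				/* ... the usual close/detach work here ... */
				TW_WLOCK(V_tw_lock);
				tw_pcbrele(batch[i]);	/* drops the tw lock */
			}
		} while (n == TW_SCAN_BLOCK);
	}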

> +/*
> + * Drop a refcount on an tw elevated using tw_pcbref().  If it is
> + * valid, we return with the tw lock held.
> + */

I assume you mean that you return with the tw lock unlocked?  At least
that's how the code reads to me...
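
Both branches below release the lock before returning, so maybe just
flip the wording in the comment, e.g.:

	/*
	 * Drop a refcount on a tw elevated using tw_pcbref().  The tw
	 * lock is released before returning in all cases.
	 */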

[...]

> +static int
> +tw_pcbrele(struct tcptw *tw)
> +{
> +	TW_WLOCK_ASSERT(V_tw_lock);
> +	KASSERT(tw->tw_refcount > 0, ("%s: refcount 0", __func__));
> +
> +	if (!refcount_release(&tw->tw_refcount)) {
> +		TW_WUNLOCK(V_tw_lock);
> +		return (0);
> +	}
> +
> +	uma_zfree(V_tcptw_zone, tw);
> +	TW_WUNLOCK(V_tw_lock);
> +	return (1);
> +}

[...]

Otherwise looks like a good patch...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."


