Date: Thu, 20 Feb 2014 03:07:19 +0400
From: Slawa Olhovchenkov <slw@zxy.spb.ru>
To: Alexander Motin
Subject: Re: [rfc] bind per-cpu timeout threads to each CPU
Message-ID: <20140219230719.GM83358@zxy.spb.ru>
In-Reply-To: <53052B80.3010505@FreeBSD.org>
Cc: Adrian Chadd, freebsd-current, Jeffrey Faden, "freebsd-arch@freebsd.org"

On Thu, Feb 20, 2014 at 12:09:04AM +0200, Alexander Motin wrote:
> On 19.02.2014 23:44, Slawa Olhovchenkov wrote:
> > On Wed, Feb 19, 2014 at 11:04:49PM +0200, Alexander Motin wrote:
> >
> >> On 19.02.2014 22:04, Adrian Chadd wrote:
> >>> On 19 February 2014 11:59, Alexander Motin wrote:
> >>>
> >>>>> So if we're moving towards supporting (among others) a pcbgroup /
> >>>>> RSS hash style workload distribution across CPUs to minimise
> >>>>> per-connection lock contention, we really don't want the scheduler
> >>>>> to decide it can schedule things on other CPUs under enough
> >>>>> pressure. That'll just make things worse.
> >>>
> >>>> True, though it is also not obvious that putting a second thread on
> >>>> the CPU run queue is better than executing it right now on another
> >>>> core.
> >>>
> >>> Well, it depends on whether you're trying to optimise for "run all
> >>> runnable tasks as quickly as possible" or "run all runnable tasks in
> >>> contexts that minimise lock contention."
> >>>
> >>> The former sounds great as long as there's no real lock contention
> >>> going on. But as you add more chances for contention (something like
> >>> "100,000 concurrent TCP flows") you may end up having your TCP timer
> >>> firing interfere with further TXing or RXing on the same connection.
> >>
> >> 100K TCP flows probably means 100K locks. That means the chance of
> >> lock collision on each of them is effectively zero. More realistically
> >> it could
> >
> > What about (100K/N_cpu)*PPS accesses to the timer queue locks to
> > remove/insert TCP timeout callbacks?
>
> I am not sure what this formula means, but yes, per-CPU callout locks
> are much more likely to be congested. They are only per-CPU, not
> per-flow.

The 100K TCP flows are distributed across the CPUs, so each CPU serves
100K/N_cpu of them, and every TCP flow touches its callout several times
per second: that is the *PPS factor.
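
To make the arithmetic concrete: with 100K flows spread over 8 CPUs and
each flow rescheduling its retransmit timer 10 times per second, one
per-CPU callout wheel lock sees (100000/8)*10 = 125,000 acquisitions per
second. A rough sketch of where those acquisitions come from is below.
It is illustrative only, not the real FreeBSD TCP timer path: struct
flow, flow_ack_input() and the constants are invented for this example;
only callout(9) interfaces such as callout_reset_on() are real.

/*
 * Hypothetical illustration (not the real tcp_timer code): every ACK
 * reschedules the flow's retransmit callout, and callout_reset_on()
 * must take the lock of the per-CPU callout wheel the callout lives
 * on.  With F flows per CPU and P reschedules per flow per second,
 * that one lock is taken F * P times a second.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/callout.h>

struct flow {				/* hypothetical per-connection state */
	struct callout	f_rexmt;	/* retransmit timer */
	int		f_rto;		/* current RTO, in ticks */
};

static void
flow_rexmt_timeout(void *arg)
{
	/* retransmission logic would live here */
}

static void
flow_init(struct flow *f)
{
	callout_init(&f->f_rexmt, 1);	/* MPSAFE: handler runs without Giant */
	f->f_rto = hz;			/* start with a one second RTO */
}

/*
 * Called for every ACK that advances the window: restart the
 * retransmit timer on the flow's CPU.  Each call locks that CPU's
 * callout wheel, so contention is per-CPU, not per-flow.
 */
static void
flow_ack_input(struct flow *f, int cpu)
{
	callout_reset_on(&f->f_rexmt, f->f_rto, flow_rexmt_timeout, f, cpu);
}

If flows are hashed to CPUs pcbgroup/RSS style and the timeout threads
are bound to the same CPUs, as the RFC proposes, each wheel lock at
least stays on one core instead of bouncing between caches.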