Date: Thu, 9 Jan 2014 23:43:01 +0400
From: Slawa Olhovchenkov <slw@zxy.spb.ru>
To: Adrian Chadd
Cc: freebsd-arch@freebsd.org
Subject: Re: Acquiring a lock on the same CPU that holds it - what can be done?
Message-ID: <20140109194301.GA79282@zxy.spb.ru>
References: <9508909.MMfryVDtI5@ralph.baldwin.cx>

On Thu, Jan 09, 2014 at 10:44:51AM -0800, Adrian Chadd wrote:

> On 9 January 2014 10:31, John Baldwin wrote:
> > On Friday, January 03, 2014 04:55:48 PM Adrian Chadd wrote:
> >> Hi,
> >>
> >> So here's a fun one.
> >>
> >> When doing TCP traffic + socket affinity + thread pinning experiments,
> >> I seem to hit this very annoying scenario that caps my performance and
> >> scalability.
> >>
> >> Assume I've lined up everything relating to a socket to run on the
> >> same CPU (ie, TX, RX, TCP timers, userland thread):
> >
> > Are you sure this is really the best setup?  Especially if you have free CPUs
> > in the system, the time you lose in context switches fighting over the one
> > assigned CPU for a flow when you have idle CPUs is quite wasteful.  I know
> > that tying all of the work for a given flow to a single CPU is all the rage
> > right now, but I wonder if you had considered assigning a pair of CPUs to a
> > flow: one CPU to do the top half (TX and the userland thread) and one CPU to
> > do the bottom half (RX and timers).  This would remove the context switches
> > you see and replace them with spinning in the times when the two cores
> > actually contend.  It may also be fairly well suited to SMT (which I suspect
> > you might have turned off currently).  If you do have SMT turned off, then
> > you can get a pair of CPUs for each queue without having to reduce the number
> > of queues you are using.  I'm not sure this would work better than creating
> > one queue for every CPU, but I think it is probably something worth trying
> > for your use case at least.
> >
> > BTW, the problem with just slapping critical_enter() into mutexes is you will
> > run afoul of assertions the first time you contend on a mutex and have to
> > block.  It may be that only the assertions would break and nothing else, but
> > I'm not certain there aren't other assumptions about critical sections not
> > ever context switching for any reason, voluntary or otherwise.
>
> It's the rage because it turns out it bounds the system behaviour rather nicely.
>
> The idea is to scale upwards of 60,000 active TCP sockets.  Some people
> are looking at upwards of 100,000 active concurrent sockets.  The
> amount of contention is non-trivial if it's not lined up.
>
> And yeah, I'm aware of the problem of just slapping critical sections
> around mutexes.  I've faced this stuff in Linux.  It's why doing this
> stuff is much more fragile on Linux.. :-P

For this setup, I would first look at the TCP timers (and the locking
around them), IMHO.
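
To make that concrete: the TCP timers are callouts, and a callout can be told
which CPU it fires on.  The fragment below is only a rough sketch of that idea,
not the real tcp_timer.c code -- struct flow_softc, flow_cpu and the function
names are all made up for illustration:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/callout.h>

/* Illustrative per-flow state; not an existing kernel structure. */
struct flow_softc {
	struct mtx	lock;		/* protects the flow state */
	struct callout	rexmt;		/* a retransmit-style timer */
	int		flow_cpu;	/* CPU this flow is pinned to */
};

static void
flow_timer(void *arg)
{
	struct flow_softc *sc = arg;

	/* callout_init_mtx() in flow_init() means sc->lock is held here. */
	/* ... timer work runs on sc->flow_cpu ... */
	(void)sc;
}

static void
flow_init(struct flow_softc *sc, int cpu)
{
	mtx_init(&sc->lock, "flow lock", NULL, MTX_DEF);
	callout_init_mtx(&sc->rexmt, &sc->lock, 0);
	sc->flow_cpu = cpu;
}

static void
flow_timer_arm(struct flow_softc *sc)
{
	mtx_assert(&sc->lock, MA_OWNED);
	/*
	 * Fire the timer on the flow's own CPU, so the timer, RX and TX
	 * paths contend on one core instead of bouncing the lock around.
	 */
	callout_reset_on(&sc->rexmt, hz, flow_timer, sc, sc->flow_cpu);
}

If the timer callouts land on a different CPU than the one owning the pcb,
you pay for the lock migration on every tick, which is exactly the kind of
contention that shows up at 60k+ sockets.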
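And on the pair-of-CPUs idea: the top half is easy to place from userland with
cpuset_setaffinity(2).  Something like the (untested) sketch below, with the
CPU number made up, pins the TX/userland thread to one CPU of the pair; the
matching NIC queue interrupt would be bound to the partner CPU separately,
e.g. with cpuset -l <cpu> -x <irq>:

#include <sys/param.h>
#include <sys/cpuset.h>

#include <err.h>
#include <stdio.h>

#define TX_CPU	2	/* top half: userland thread + TX (made-up number) */
/* The bottom half (RX + timers) would live on the partner CPU, e.g. TX_CPU + 1. */

int
main(void)
{
	cpuset_t mask;

	CPU_ZERO(&mask);
	CPU_SET(TX_CPU, &mask);

	/* Bind the calling thread (id -1 == current thread) to TX_CPU only. */
	if (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
	    sizeof(mask), &mask) != 0)
		err(1, "cpuset_setaffinity");

	printf("pinned to CPU %d\n", TX_CPU);
	return (0);
}

That keeps the userland/TX work and the RX/timer work on adjacent cores, so the
contention John describes turns into short spins instead of context switches.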