Sender: Alexander Motin
Message-ID: <4F7F0085.7090001@FreeBSD.org>
Date: Fri, 06 Apr 2012 17:41:09 +0300
From: Alexander Motin
To: Attilio Rao
Cc: Florian Smeets, freebsd-hackers@freebsd.org, Andriy Gapon,
 FreeBSD current, Jeff Roberson, Arnaud Lacombe
Subject: Re: [RFT][patch] Scheduling for HTT and not only
References: <4F2F7B7F.40508@FreeBSD.org> <4F366E8F.9060207@FreeBSD.org>
 <4F367965.6000602@FreeBSD.org> <4F396B24.5090602@FreeBSD.org>
 <4F3978BC.6090608@FreeBSD.org> <4F3990EA.1080002@FreeBSD.org>
 <4F3C0BB9.6050101@FreeBSD.org> <4F3E807A.60103@FreeBSD.org>
 <4F3E8858.4000001@FreeBSD.org> <4F7EFD42.9010507@FreeBSD.org>
List-Id: Technical Discussions relating to FreeBSD

On 04/06/12 17:30, Attilio Rao wrote:
> On 6 April 2012 15:27, Alexander Motin wrote:
>> On 04/06/12 17:13, Attilio Rao wrote:
>>>
>>> On 5 April 2012 19:12, Arnaud Lacombe wrote:
>>>>
>>>> Hi,
>>>>
>>>> [Sorry for the delay, I got a bit sidetracked...]
>>>>
>>>> 2012/2/17 Alexander Motin:
>>>>>
>>>>> On 17.02.2012 18:53, Arnaud Lacombe wrote:
>>>>>>
>>>>>> On Fri, Feb 17, 2012 at 11:29 AM, Alexander Motin wrote:
>>>>>>>
>>>>>>> On 02/15/12 21:54, Jeff Roberson wrote:
>>>>>>>>
>>>>>>>> On Wed, 15 Feb 2012, Alexander Motin wrote:
>>>>>>>>>
>>>>>>>>> I've decided to stop those cache black magic practices and
>>>>>>>>> focus on things that really exist in this world -- SMT and
>>>>>>>>> CPU load. I've dropped most of the cache-related things from
>>>>>>>>> the patch and made the rest more strict and predictable:
>>>>>>>>> http://people.freebsd.org/~mav/sched.htt34.patch
>>>>>>>>
>>>>>>>> This looks great. I think there is value in considering the
>>>>>>>> other approach further but I would like to do this part first.
>>>>>>>> It would also be nice to add priority as a greater influence
>>>>>>>> in the load balancing.
>>>>>>>
>>>>>>> I haven't got a good idea yet about balancing priorities, but
>>>>>>> I've rewritten the balancer itself. Since sched_lowest() /
>>>>>>> sched_highest() are more intelligent now, they allowed removing
>>>>>>> topology traversal from the balancer itself. That should fix
>>>>>>> the double-swapping problem, allow keeping some affinity while
>>>>>>> moving threads, and make balancing more fair. I did a number of
>>>>>>> tests running 4, 8, 9 and 16 CPU-bound threads on 8 CPUs. With
>>>>>>> 4, 8 and 16 threads everything is stationary, as it should be.
>>>>>>> With 9 threads I see regular and random load movement between
>>>>>>> all 8 CPUs. Measurements on 5-minute runs show a deviation of
>>>>>>> only about 5 seconds. It is the same deviation as I see caused
>>>>>>> by just scheduling 16 threads on 8 cores without any balancing
>>>>>>> needed at all. So I believe this code works as it should.
>>>>>>>
>>>>>>> Here is the patch:
>>>>>>> http://people.freebsd.org/~mav/sched.htt40.patch
>>>>>>>
>>>>>>> I plan this to be the final patch of this series (more to
>>>>>>> come :)) and if there are no problems or objections, I am going
>>>>>>> to commit it (except some debugging KTRs) in about ten days. So
>>>>>>> now is a good time for reviews and testing. :)
>>>>>>>
>>>>>> Is there a place where all the patches are available?
>>>>>
>>>>> All my scheduler patches are cumulative, so all you need is the
>>>>> last one mentioned here, sched.htt40.patch.
>>>>>
>>>> You may want to have a look at the results I collected in the
>>>> `runs/freebsd-experiments' branch of:
>>>>
>>>> https://github.com/lacombar/hackbench/
>>>>
>>>> and compare them with the vanilla FreeBSD 9.0 and -CURRENT results
>>>> available in `runs/freebsd'. On the dual-package platform, your
>>>> patch is not a definite win.
>>>>
>>>>> But in some cases, especially for multi-socket systems, to let it
>>>>> show its best, you may want to apply an additional patch from
>>>>> avg@ to better detect CPU topology:
>>>>>
>>>>> https://gitorious.org/~avg/freebsd/avgbsd/commit/6bca4a2e4854ea3fc275946a023db65c483cb9dd
>>>>>
>>>> The tests I conducted specifically for this patch did not show
>>>> much improvement...
>>>
>>> Can you please clarify this point?
>>> Did the test you ran compare cases where the topology was detected
>>> badly against cases where the topology was detected correctly by a
>>> patched kernel (and you still didn't see a performance
>>> improvement), in terms of cache line sharing?
>>
>> At this moment SCHED_ULE does almost nothing in terms of cache line
>> sharing affinity (though it is probably worth some further
>> experiments). What this patch may improve is the opposite case --
>> reducing cache sharing pressure for cache-hungry applications. For
>> example, proper cache topology detection (such as the lack of a
>> global L3 cache, but a shared L2 per pair of cores on
>> Core2Quad-class CPUs) increases pbzip2 performance when the number
>> of threads is less than the number of CPUs (i.e. when there is room
>> for optimization).
>
> My question does not really refer to your patch.
> I just wanted to know if he correctly benchmarked a case where the
> topology was screwed up and then correctly recognized by avg's patch
> in terms of cache level aggregation (it wasn't about your patch,
> btw).

I understand. I've just described a test case where a properly detected
topology could give a benefit. What the test really does is indeed a
good question.

-- 
Alexander Motin