Subject: Re: ULE steal_idle questions
From: Andriy Gapon
To: Don Lewis, freebsd-arch@FreeBSD.org
Date: Wed, 23 Aug 2017 22:26:36 +0300

On 23/08/2017 18:04, Don Lewis wrote:
> I've been looking at the steal_idle code in tdq_idled() and found some
> things that puzzle me.
>
> Consider a machine with three CPUs:
>   A, which is idle
>   B, which is busy running a thread
>   C, which is busy running a thread and has another thread in queue
> It would seem to make sense that the tdq_load values for these three
> CPUs would be 0, 1, and 2 respectively in order to select the best CPU
> to run a new thread.
>
> If so, then why do we pass thresh=1 to sched_highest() in the code that
> implements steal_idle?  That value is used to set cs_limit which is used
> in this comparison in cpu_search:
>         if (match & CPU_SEARCH_HIGHEST)
>                 if (tdq->tdq_load >= hgroup.cs_limit &&
> That would seem to make CPU B a candidate for stealing a thread from.
> Ignoring CPU C for the moment, that shouldn't happen if the thread is
> running, but even if it was possible, it would just make CPU B go idle,
> which isn't terribly helpful in terms of load balancing and would just
> thrash the caches.  The same comparison is repeated in tdq_idled() after
> a candidate CPU has been chosen:
>         if (steal->tdq_load < thresh || steal->tdq_transferable == 0) {
>                 tdq_unlock_pair(tdq, steal);
>                 continue;
>         }
>
> It looks to me like there is an off-by-one error here, and there is a
> similar problem in the code that implements kern.sched.balance.

I agree with your analysis.  I had the same questions as well.
I think that the tdq_transferable check is what saves the code from running
into any problems.  But it would indeed make sense for the code to account for
the fact that tdq_load also includes the currently running thread, which is
never transferable.

> The reason I ask is that I've been debugging random segfaults and other
> strange errors on my Ryzen machine and the problems mostly go away if I
> either disable kern.sched.steal_idle and kern.sched.balance, or if I
> leave kern.sched.steal_idle enabled and hack the code to change the
> value of thresh from 1 to 2.  See
> for the gory
> details.  I don't know if my CPU has what AMD calls the "performance
> marginality issue".

I have been following your experiments, and it's interesting that "massaging"
the CPU in certain ways makes it a bit happier.  But the fault certainly lies
with the CPU, as the code is trouble-free on many different architectures,
including x86, and on various processors from both Intel and AMD [with earlier
CPU families].

-- 
Andriy Gapon
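
A minimal sketch of the thresh discussion above, using a toy model rather than
the real sched_ule.c structures.  The names toy_tdq, toy_passes_load_limit()
and toy_can_steal() are hypothetical; only the two comparisons are taken from
the quoted code.  The model assumes that the load count includes the running
thread and that the transferable count never exceeds the load.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a per-CPU run queue; NOT the kernel's struct tdq. */
struct toy_tdq {
	int load;		/* running thread (if any) plus queued threads */
	int transferable;	/* queued threads that are allowed to migrate */
};

/* Load-only test, shaped like the quoted cpu_search() comparison. */
bool
toy_passes_load_limit(const struct toy_tdq *q, int limit)
{
	return (q->load >= limit);
}

/* Full test, shaped like the quoted re-check in tdq_idled(). */
bool
toy_can_steal(const struct toy_tdq *q, int thresh)
{
	/* Assumption of this model: transferable threads are part of load. */
	assert(q->transferable >= 0 && q->transferable <= q->load);
	if (q->load < thresh || q->transferable == 0)
		return (false);
	return (true);
}
```

With A = {0, 0}, B = {1, 0} and C = {2, 1}, a limit of 1 lets B through the
load-only test even though it has nothing to give away; the transferable
check then rejects it.  With a limit of 2 only C passes either test, so B is
never considered in the first place.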