From owner-freebsd-arch@freebsd.org Wed Aug 23 15:04:57 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3D72DDE9400 for ; Wed, 23 Aug 2017 15:04:57 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 1C86D73E01 for ; Wed, 23 Aug 2017 15:04:57 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id v7NF4nYe035934 for ; Wed, 23 Aug 2017 08:04:53 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201708231504.v7NF4nYe035934@gw.catspoiler.org> Date: Wed, 23 Aug 2017 08:04:49 -0700 (PDT) From: Don Lewis Subject: ULE steal_idle questions To: freebsd-arch@FreeBSD.org MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Aug 2017 15:04:57 -0000
I've been looking at the steal_idle code in tdq_idled() and found some things that puzzle me.
Consider a machine with three CPUs:
A, which is idle
B, which is busy running a thread
C, which is busy running a thread and has another thread in queue
It would seem to make sense that the tdq_load values for these three CPUs would be 0, 1, and 2 respectively in order to select the best CPU to run a new thread.
If so, then why do we pass thresh=1 to sched_highest() in the code that implements steal_idle? That value is used to set cs_limit which is used in this comparison in cpu_search:
	if (match & CPU_SEARCH_HIGHEST)
		if (tdq->tdq_load >= hgroup.cs_limit &&
That would seem to make CPU B a candidate for stealing a thread from.
Ignoring CPU C for the moment, that shouldn't happen if the thread is running, but even if it were possible, it would just make CPU B go idle, which isn't terribly helpful in terms of load balancing and would just thrash the caches. The same comparison is repeated in tdq_idled() after a candidate CPU has been chosen:
	if (steal->tdq_load < thresh || steal->tdq_transferable == 0) {
		tdq_unlock_pair(tdq, steal);
		continue;
	}
It looks to me like there is an off-by-one error here, and there is a similar problem in the code that implements kern.sched.balance.
The reason I ask is that I've been debugging random segfaults and other strange errors on my Ryzen machine and the problems mostly go away if I either disable kern.sched.steal_idle and kern.sched.balance, or if I leave kern.sched.steal_idle enabled and hack the code to change the value of thresh from 1 to 2. See for the gory details. I don't know if my CPU has what AMD calls the "performance marginality issue".
From owner-freebsd-arch@freebsd.org Wed Aug 23 19:28:10 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3BA91DED97B for ; Wed, 23 Aug 2017 19:28:10 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citapm.icyb.net.ua (citapm.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 3F83E7F5DF; Wed, 23 Aug 2017 19:28:08 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citapm.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id WAA18615; Wed, 23 Aug 2017 22:27:57 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1dkbJk-00004P-Vi; Wed, 23 Aug 2017 22:27:57 +0300 Subject: Re: ULE steal_idle questions To: Don Lewis , freebsd-arch@FreeBSD.org References: <201708231504.v7NF4nYe035934@gw.catspoiler.org> From: Andriy Gapon Message-ID: Date: Wed, 23 Aug 2017 22:26:36 +0300 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: <201708231504.v7NF4nYe035934@gw.catspoiler.org> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Aug 2017 19:28:10 -0000
On 23/08/2017 18:04, Don Lewis wrote: > I've been looking at the steal_idle code in tdq_idled() and found some > things that puzzle me. > > Consider a machine with three CPUs: > A, which is idle > B, which is busy running a thread > C, which is busy running a thread and has another thread in queue > It would seem to make sense that the tdq_load values for these three > CPUs would be 0, 1, and 2 respectively in order to select the best CPU > to run a new thread. > > If so, then why do we pass thresh=1 to sched_highest() in the code that > implements steal_idle? That value is used to set cs_limit which is used > in this comparison in cpu_search: > if (match & CPU_SEARCH_HIGHEST) > if (tdq->tdq_load >= hgroup.cs_limit && > That would seem to make CPU B a candidate for stealing a thread from.
> Ignoring CPU C for the moment, that shouldn't happen if the thread is > running, but even if it was possible, it would just make CPU B go idle, > which isn't terribly helpful in terms of load balancing and would just > thrash the caches. The same comparison is repeated in tdq_idled() after > a candidate CPU has been chosen: > if (steal->tdq_load < thresh || steal->tdq_transferable == 0) { > tdq_unlock_pair(tdq, steal); > continue; > } > > It looks to me like there is an off-by-one error here, and there is a > similar problem in the code that implements kern.sched.balance. I agree with your analysis. I had the same questions as well. I think that the tdq_transferable check is what saves the code from running into any problems. But it indeed would make sense for the code to understand that tdq_load includes a currently running, never transferable thread as well. > The reason I ask is that I've been debugging random segfaults and other > strange errors on my Ryzen machine and the problems mostly go away if I > either disable kern.sched.steal_idle and kern_sched.balance, or if I > leave kern_sched.steal_idle enabled and hack the code to change the > value of thresh from 1 to 2. See > for the gory > details. I don't know if my CPU has what AMD calls the "performance > marginality issue". I have been following your experiments and it's interesting that "massaging" the CPU in certain ways makes it a bit happier. But certainly the fault is with the CPU as the code is trouble-free on many different architectures including x86, and various processors from both Intel and AMD [with earlier CPU families]. -- Andriy Gapon From owner-freebsd-arch@freebsd.org Wed Aug 23 20:58:51 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8C662DEED20 for ; Wed, 23 Aug 2017 20:58:51 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 4FA3E818CB; Wed, 23 Aug 2017 20:58:51 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id v7NKwg2a037103; Wed, 23 Aug 2017 13:58:46 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201708232058.v7NKwg2a037103@gw.catspoiler.org> Date: Wed, 23 Aug 2017 13:58:42 -0700 (PDT) From: Don Lewis Subject: Re: ULE steal_idle questions To: avg@FreeBSD.org cc: freebsd-arch@FreeBSD.org In-Reply-To: MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Aug 2017 20:58:51 -0000 On 23 Aug, Andriy Gapon wrote: > On 23/08/2017 18:04, Don Lewis wrote: >> I've been looking at the steal_idle code in tdq_idled() and found some >> things that puzzle me. 
>> >> Consider a machine with three CPUs: >> A, which is idle >> B, which is busy running a thread >> C, which is busy running a thread and has another thread in queue >> It would seem to make sense that the tdq_load values for these three >> CPUs would be 0, 1, and 2 respectively in order to select the best CPU >> to run a new thread. >> >> If so, then why do we pass thresh=1 to sched_highest() in the code that >> implements steal_idle? That value is used to set cs_limit which is used >> in this comparison in cpu_search: >> if (match & CPU_SEARCH_HIGHEST) >> if (tdq->tdq_load >= hgroup.cs_limit && >> That would seem to make CPU B a candidate for stealing a thread from. >> Ignoring CPU C for the moment, that shouldn't happen if the thread is >> running, but even if it was possible, it would just make CPU B go idle, >> which isn't terribly helpful in terms of load balancing and would just >> thrash the caches. The same comparison is repeated in tdq_idled() after >> a candidate CPU has been chosen: >> if (steal->tdq_load < thresh || steal->tdq_transferable == 0) { >> tdq_unlock_pair(tdq, steal); >> continue; >> } >> >> It looks to me like there is an off-by-one error here, and there is a >> similar problem in the code that implements kern.sched.balance. > > > I agree with your analysis. I had the same questions as well. > I think that the tdq_transferable check is what saves the code from > running into any problems. But it indeed would make sense for the code > to understand that tdq_load includes a currently running, never > transferable thread as well. Yes, I think that the tdq_transferable check will fail, but at the cost of an unnecessary tdq_lock_pair()/tdq_unlock_pair() and another loop iteration, including another expensive sched_highest() call. Consider the case of a close to fully loaded system where all of the other CPUs each have one running thread. The current code will try each of the other CPUs, calling tdq_lock_pair(), failing the tdq_transferable check, calling tdq_unlock_pair(), then restarting the loop with that CPU removed from mask, and all of this done with interrupts disabled. The proper thing to do in this case would be to just go into the idle state. >> The reason I ask is that I've been debugging random segfaults and other >> strange errors on my Ryzen machine and the problems mostly go away if I >> either disable kern.sched.steal_idle and kern_sched.balance, or if I >> leave kern_sched.steal_idle enabled and hack the code to change the >> value of thresh from 1 to 2. See >> for the gory >> details. I don't know if my CPU has what AMD calls the "performance >> marginality issue". > > I have been following your experiments and it's interesting that > "massaging" the CPU in certain ways makes it a bit happier. But > certainly the fault is with the CPU as the code is trouble-free on many > different architectures including x86, and various processors from both > Intel and AMD [with earlier CPU families]. The results of my experiments so far are pointing to the time spent looping in tdq_idled() as at least one cause of the problems. If I restrict tdq_idled() to look at just a single core, things are happy. If I then set steal_thresh to 1 so that it has to loop, then things get unhappy. If I allow tdq_idled() to look at both the current core and current CCX with thresh at the CCX level set to 1 so that the code loops, things are unhappy. If I set thresh to 2 so that the code does not loop unnecessarily, things get happy again. 
If I allow tdq_idled() to look at the entire topology (so that there are now three calls to sched_highest()), things get unhappy again. There is a known issue with executing IRET on one SMT thread when the other SMT thread on that core is "busy", but I don't know how that would affect things. Perhaps that can be fixed in microcode.
sched_highest() looks like it is really expensive in terms of CPU cycles. On Ryzen, if we can't find a suitable thread on the current CCX to transfer and step up to the chip level, sched_highest() will recalculate the load on the current CCX even though we have already rejected it. Things get worse when using cpuset because if tdq_move() fails due to cpuset (or other) restrictions, then we call sched_highest() all over again to redo all the calculations, but with the previously chosen CPU removed from the potential choices. This will get even worse with Threadripper and beyond. Even if it doesn't cause obvious breakage, it is bad for interrupt latency.
I'm not convinced that the ghc and go problems are Ryzen bugs and not bugs in the code for those two ports. I've never seen build failures for those on my FX-8320E, but the increased number of threads on Ryzen might expose some latent problems.
From owner-freebsd-arch@freebsd.org Thu Aug 24 13:00:46 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6B31DDDF01E for ; Thu, 24 Aug 2017 13:00:46 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E72D376CFC; Thu, 24 Aug 2017 13:00:45 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v7OD0a6l060145 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Thu, 24 Aug 2017 16:00:37 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v7OD0a6l060145 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id v7OD0anc060144; Thu, 24 Aug 2017 16:00:36 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 24 Aug 2017 16:00:36 +0300 From: Konstantin Belousov To: Don Lewis Cc: avg@FreeBSD.org, freebsd-arch@FreeBSD.org Subject: Re: ULE steal_idle questions Message-ID: <20170824130036.GC1700@kib.kiev.ua> References: <201708232058.v7NKwg2a037103@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201708232058.v7NKwg2a037103@gw.catspoiler.org> User-Agent: Mutt/1.8.3 (2017-05-23) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Aug 2017 13:00:46 -0000
On Wed, Aug 23, 2017 at 01:58:42PM -0700, Don Lewis wrote: > I'm not convinced that the ghc and go problems are Ryzen bugs and
not > bugs in the code for those two ports. I've never seen build failures > for those on my FX-8320E, but the increased number of threads on Ryzen > might expose some latent problems. > Could you post the verbose dmesg from Ryzen's boot somewhere ? From owner-freebsd-arch@freebsd.org Thu Aug 24 15:52:28 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A1862DE295C for ; Thu, 24 Aug 2017 15:52:28 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 6CD7D81819; Thu, 24 Aug 2017 15:52:28 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id v7OFqISP042657; Thu, 24 Aug 2017 08:52:22 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201708241552.v7OFqISP042657@gw.catspoiler.org> Date: Thu, 24 Aug 2017 08:52:17 -0700 (PDT) From: Don Lewis Subject: Re: ULE steal_idle questions To: kostikbel@gmail.com cc: avg@FreeBSD.org, freebsd-arch@FreeBSD.org In-Reply-To: <20170824130036.GC1700@kib.kiev.ua> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Aug 2017 15:52:28 -0000 On 24 Aug, Konstantin Belousov wrote: > On Wed, Aug 23, 2017 at 01:58:42PM -0700, Don Lewis wrote: >> I'm not convinced that the ghc and go problems are Ryzen bugs and not >> bugs in the code for those two ports. I've never seen build failures >> for those on my FX-8320E, but the increased number of threads on Ryzen >> might expose some latent problems. >> > > Could you post the verbose dmesg from Ryzen's boot somewhere ? 
https://people.freebsd.org/~truckman/ryzen-dmesg.boot
From owner-freebsd-arch@freebsd.org Thu Aug 24 16:41:11 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 374DFDE3CED for ; Thu, 24 Aug 2017 16:41:11 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 17CED83C69; Thu, 24 Aug 2017 16:41:11 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id v7OGf3pA042851; Thu, 24 Aug 2017 09:41:07 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201708241641.v7OGf3pA042851@gw.catspoiler.org> Date: Thu, 24 Aug 2017 09:41:03 -0700 (PDT) From: Don Lewis Subject: Re: ULE steal_idle questions To: avg@FreeBSD.org cc: freebsd-arch@FreeBSD.org In-Reply-To: MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Aug 2017 16:41:11 -0000
Aside from the Ryzen problem, I think the steal_idle code should be re-written so that it doesn't block interrupts for so long. In its current state, interrupt latency increases with the number of cores and the complexity of the topology.
What I'm thinking is that we should set a flag at the start of the search for a thread to steal. If we are preempted by another, higher priority thread, that thread will clear the flag. Next we start the loop to search up the hierarchy. Once we find a candidate CPU:

	steal = TDQ_CPU(cpu);
	CPU_CLR(cpu, &mask);
	tdq_lock_pair(tdq, steal);
	if (tdq->tdq_load != 0) {
		goto out; /* to exit loop and switch to the new thread */
	}
	if (flag was cleared) {
		tdq_unlock_pair(tdq, steal);
		goto restart; /* restart the search */
	}
	if (steal->tdq_load < thresh || steal->tdq_transferable == 0 ||
	    tdq_move(steal, tdq) == 0) {
		tdq_unlock_pair(tdq, steal);
		continue;
	}
out:
	TDQ_UNLOCK(steal);
	clear flag;
	mi_switch(SW_VOL | SWT_IDLE, NULL);
	thread_unlock(curthread);
	return (0);

And we also have to clear the flag if we did not find a thread to steal.
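A minimal sketch of how the whole loop could look with that change is below. This is illustrative only, not committed code: tdq_idled_sketch() is a hypothetical name, the per-topology-level thresh selection is elided in favor of the plain steal_thresh variable, and instead of a dedicated flag it samples the tdq_switchcnt + tdq_oldswitchcnt sum that sched_idletd() already consults, so any context switch on this CPU during the search forces a restart. Note that it omits the spinlock_enter()/spinlock_exit() section that today keeps interrupts off for the whole scan.

	static int
	tdq_idled_sketch(struct tdq *tdq)
	{
		struct cpu_group *cg;
		struct tdq *steal;
		cpuset_t mask;
		int cpu, switchcnt;

		CPU_FILL(&mask);
		CPU_CLR(PCPU_GET(cpuid), &mask);
	restart:
		/* Sample the switch counters instead of setting a flag. */
		switchcnt = tdq->tdq_switchcnt + tdq->tdq_oldswitchcnt;
		for (cg = tdq->tdq_cg; cg != NULL; ) {
			cpu = sched_highest(cg, mask, steal_thresh);
			if (cpu == -1) {
				cg = cg->cg_parent;	/* widen the search */
				continue;
			}
			steal = TDQ_CPU(cpu);
			CPU_CLR(cpu, &mask);
			tdq_lock_pair(tdq, steal);
			if (tdq->tdq_load != 0)
				goto out;	/* work arrived locally; run it */
			if (switchcnt !=
			    tdq->tdq_switchcnt + tdq->tdq_oldswitchcnt) {
				/* We were preempted; the load survey is stale. */
				tdq_unlock_pair(tdq, steal);
				goto restart;
			}
			if (steal->tdq_load < steal_thresh ||
			    steal->tdq_transferable == 0 ||
			    tdq_move(steal, tdq) == 0) {
				/* cpu is already out of mask; try the next one. */
				tdq_unlock_pair(tdq, steal);
				continue;
			}
	out:
			TDQ_UNLOCK(steal);
			mi_switch(SW_VOL | SWT_IDLE, NULL);
			thread_unlock(curthread);
			return (0);
		}
		return (1);	/* nothing to steal; caller enters the idle loop */
	}

The key property of this shape is that staleness is detected after taking the lock pair, so a preemption during the scan costs one extra pass rather than a migration based on stale load data.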
From owner-freebsd-arch@freebsd.org Thu Aug 24 19:25:22 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 05A25DE6869 for ; Thu, 24 Aug 2017 19:25:22 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id BD4B33FAD; Thu, 24 Aug 2017 19:25:21 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id v7OJPDTT043392; Thu, 24 Aug 2017 12:25:17 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201708241925.v7OJPDTT043392@gw.catspoiler.org> Date: Thu, 24 Aug 2017 12:25:13 -0700 (PDT) From: Don Lewis Subject: Re: ULE steal_idle questions To: avg@FreeBSD.org cc: freebsd-arch@FreeBSD.org In-Reply-To: MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Aug 2017 19:25:22 -0000 On 23 Aug, Andriy Gapon wrote: > On 23/08/2017 18:04, Don Lewis wrote: >> I've been looking at the steal_idle code in tdq_idled() and found some >> things that puzzle me. >> >> Consider a machine with three CPUs: >> A, which is idle >> B, which is busy running a thread >> C, which is busy running a thread and has another thread in queue >> It would seem to make sense that the tdq_load values for these three >> CPUs would be 0, 1, and 2 respectively in order to select the best CPU >> to run a new thread. >> >> If so, then why do we pass thresh=1 to sched_highest() in the code that >> implements steal_idle? That value is used to set cs_limit which is used >> in this comparison in cpu_search: >> if (match & CPU_SEARCH_HIGHEST) >> if (tdq->tdq_load >= hgroup.cs_limit && >> That would seem to make CPU B a candidate for stealing a thread from. >> Ignoring CPU C for the moment, that shouldn't happen if the thread is >> running, but even if it was possible, it would just make CPU B go idle, >> which isn't terribly helpful in terms of load balancing and would just >> thrash the caches. The same comparison is repeated in tdq_idled() after >> a candidate CPU has been chosen: >> if (steal->tdq_load < thresh || steal->tdq_transferable == 0) { >> tdq_unlock_pair(tdq, steal); >> continue; >> } >> >> It looks to me like there is an off-by-one error here, and there is a >> similar problem in the code that implements kern.sched.balance. > > > I agree with your analysis. I had the same questions as well. > I think that the tdq_transferable check is what saves the code from > running into any problems. But it indeed would make sense for the code > to understand that tdq_load includes a currently running, never > transferable thread as well. Things aren't quite as bad as I initially thought. cpu_search() does look at tdq_transferable so sched_highest() should not return a cpu that does not have a transferable thread at the time it was examined, so in most cases the unnecessary lock/unlock shouldn't happen. 
The extra check after the lock will catch the case where tdq_transferable went to zero between when it was examined by cpu_search() and when we actually grabbed the lock. Using a larger thresh value for SMT threads is still a no-op, though.
From owner-freebsd-arch@freebsd.org Fri Aug 25 18:01:25 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 78001DDDA3E for ; Fri, 25 Aug 2017 18:01:25 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 658D069AD0 for ; Fri, 25 Aug 2017 18:01:25 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v7PI1PFl015102 for ; Fri, 25 Aug 2017 18:01:25 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-arch@FreeBSD.org Subject: [Bug 221813] Mechanism is needed to utilize discontinuous memory segments supplied by the EFI and other boot envs Date: Fri, 25 Aug 2017 18:01:25 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: new X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: uefi X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: sobomax@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-bugs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform bug_file_loc op_sys bug_status keywords bug_severity priority component assigned_to reporter cc Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Aug 2017 18:01:25 -0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221813
Bug ID: 221813
Summary: Mechanism is needed to utilize discontinuous memory segments supplied by the EFI and other boot envs
Product: Base System
Version: CURRENT
Hardware: Any
URL: http://bit.ly/2vdC30w
OS: Any
Status: New
Keywords: uefi
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: freebsd-bugs@FreeBSD.org
Reporter: sobomax@FreeBSD.org
CC: freebsd-arch@FreeBSD.org, marcel@FreeBSD.org
This is a follow-up ticket created after the good discussion along the lines of bug #211746; a particularly good comment that sums up the essence of it is quoted below. In our particular case this causes inability to use more than 64MB of memory to load the kernel and images. The solution could also be some kind of scatter-gather mechanism for blobs of memory, so that if a blob is too big to fit into a single chunk the loader splits it over multiple regions and lets the kernel do its VM magic to stitch the pages back together in the KVA. This would still leave us with 64MB of data for the kernel, but at least we would be able to pre-load much larger images.
--------------------------------
Marcel Moolenaar freebsd_committer 2017-02-17 18:34:37 UTC
I think the complexity of having the kernel at any other physical address is what has us do the staging/copying. It was a quick-n-dirty mechanism that avoided a lot of work and complexity -- which is ok if you don't know it's worth/needed to go through all that hassle. And I guess it looks like we now hit a case that warrants us to start looking at a real solution.
As an example (for inspiration): For Itanium I had the kernel link against a fixed virtual address. The loader built the VA-to-PA mapping based on where EFI allocated blobs of memory. The mapping was loaded/activated prior to booting the kernel and the loader gave the kernel all the information it needed to work with the mapping. This makes it possible to allocate memory before the VM system is up and running. Ultimately the mapping needs to be incorporated into the VM system and this is where different CPU architectures have different challenges and solutions.
Note that there are advantages to having the kernel link against a virtual address. In general it makes it easier to load or relocate the kernel anywhere and this enables a few capabilities that other OSes already have and then some. There are also downsides. You may need to support a large VA range if you want to support pre-loading CD-ROM images or run entirely from a memory disk that's preloaded. A few GB of address space would be good to have.
Anyway: It's probably time that you restate this bug into an architectural (x86-specific for now) problem and have a discussion on the arch@ mailing list. We need more people involved to bring this to a closure. Good luck
--------------------------------
--
You are receiving this mail because: You are on the CC list for the bug.
From owner-freebsd-arch@freebsd.org Fri Aug 25 18:04:36 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7D8F5DDDD6C for ; Fri, 25 Aug 2017 18:04:36 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1C6E769FF6 for ; Fri, 25 Aug 2017 18:04:36 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v7PI4ZYm040509 for ; Fri, 25 Aug 2017 18:04:35 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-arch@FreeBSD.org Subject: [Bug 221813] Mechanism is needed to utilize discontinuous memory segments supplied by the EFI and other boot envs Date: Fri, 25 Aug 2017 18:04:36 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: uefi X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: sobomax@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-bugs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL:
https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Aug 2017 18:04:36 -0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221813
--- Comment #1 from Maxim Sobolev ---
Our memory map (Windows 10, built-in hyper-v) is here: http://bit.ly/2vdC30w
--
You are receiving this mail because: You are on the CC list for the bug.
From owner-freebsd-arch@freebsd.org Fri Aug 25 18:24:19 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F14B9DDE844 for ; Fri, 25 Aug 2017 18:24:19 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id B4C3B6B765; Fri, 25 Aug 2017 18:24:19 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id v7PIOA6q048321; Fri, 25 Aug 2017 11:24:14 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201708251824.v7PIOA6q048321@gw.catspoiler.org> Date: Fri, 25 Aug 2017 11:24:10 -0700 (PDT) From: Don Lewis Subject: Re: ULE steal_idle questions To: avg@FreeBSD.org cc: freebsd-arch@FreeBSD.org In-Reply-To: <201708241641.v7OGf3pA042851@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Aug 2017 18:24:20 -0000
On 24 Aug, To: avg@FreeBSD.org wrote:
> Aside from the Ryzen problem, I think the steal_idle code should be re-written so that it doesn't block interrupts for so long. In its current state, interrupt latency increases with the number of cores and the complexity of the topology.
> What I'm thinking is that we should set a flag at the start of the search for a thread to steal. If we are preempted by another, higher priority thread, that thread will clear the flag. Next we start the loop to search up the hierarchy. Once we find a candidate CPU:
>
> 	steal = TDQ_CPU(cpu);
> 	CPU_CLR(cpu, &mask);
> 	tdq_lock_pair(tdq, steal);
> 	if (tdq->tdq_load != 0) {
> 		goto out; /* to exit loop and switch to the new thread */
> 	}
> 	if (flag was cleared) {
> 		tdq_unlock_pair(tdq, steal);
> 		goto restart; /* restart the search */
> 	}
> 	if (steal->tdq_load < thresh || steal->tdq_transferable == 0 ||
> 	    tdq_move(steal, tdq) == 0) {
> 		tdq_unlock_pair(tdq, steal);
> 		continue;
> 	}
> out:
> 	TDQ_UNLOCK(steal);
> 	clear flag;
> 	mi_switch(SW_VOL | SWT_IDLE, NULL);
> 	thread_unlock(curthread);
> 	return (0);
>
> And we also have to clear the flag if we did not find a thread to steal.
I've implemented something like this and added a bunch of counters to it to get a better understanding of its behavior. Instead of adding a flag to detect preemption, I used the same switchcnt test as is used by sched_idletd().
These are the results of a ~9 hour poudriere run:

kern.sched.steal.none: 9971668     # no threads were stolen
kern.sched.steal.fail: 23709       # unable to steal from cpu=sched_highest()
kern.sched.steal.level2: 191839    # somewhere on this chip
kern.sched.steal.level1: 557659    # a core on this CCX
kern.sched.steal.level0: 4555426   # the other SMT thread on this core
kern.sched.steal.restart: 404      # preemption detected so restart the search
kern.sched.steal.call: 15276638    # of times tdq_idled() called

There are a few surprises here.
One is the number of failed moves. I don't know if the load on the source CPU fell below thresh, tdq_transferable went to zero, or if tdq_move() failed. I also wonder if the failures are evenly distributed across CPUs. It is possible that these failures are concentrated on CPU 0, which handles most interrupts. If interrupts don't affect switchcnt, then the data collected by sched_highest() could be a bit stale and we would not know it.
Something else that I did not expect is how frequently threads are stolen from the other SMT thread on the same core, even though I increased steal_thresh from 2 to 3 to account for the off-by-one problem. This is true even right after the system has booted and no significant load has been applied. My best guess is that because of affinity, both the parent and child processes run on the same CPU after fork(), and if a number of processes are forked() in quick succession, the run queue of that CPU can get really long. Forcing a thread migration in exec() might be a good solution.
From owner-freebsd-arch@freebsd.org Fri Aug 25 23:03:58 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 96AD4DE3623 for ; Fri, 25 Aug 2017 23:03:58 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 844717436F for ; Fri, 25 Aug 2017 23:03:58 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v7PN3wv0052472 for ; Fri, 25 Aug 2017 23:03:58 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-arch@FreeBSD.org Subject: [Bug 221813] Mechanism is needed to utilize discontinuous memory segments supplied by the EFI and other boot envs Date: Fri, 25 Aug 2017 23:03:58 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: uefi X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: kib@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-bugs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Fri, 25 Aug 2017 23:03:58 -0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221813
Konstantin Belousov changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kib@FreeBSD.org
--- Comment #2 from Konstantin Belousov ---
The EFI model of handing control from the loader to the OS is dictated by the Windows loader. There, the loader constructs the kernel virtual address space and establishes the initial mappings. The most ample demonstration of the approach is the runtime services abomination: the requirement that the loader provide the future mapping of runtime segments to the firmware while the kernel has not even started.
Changing the amd64 loader/kernel interaction to adopt this model is possible, but I am not sure that it is worth the effort. At least, I do not consider the use case of large preloaded md images as enough justification for all the work required, and for causing a flag day where a new kernel will absolutely require a new loader.
--
You are receiving this mail because: You are on the CC list for the bug.
From owner-freebsd-arch@freebsd.org Sat Aug 26 00:28:55 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5DF00DE4E46 for ; Sat, 26 Aug 2017 00:28:55 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail110.syd.optusnet.com.au (mail110.syd.optusnet.com.au [211.29.132.97]) by mx1.freebsd.org (Postfix) with ESMTP id 262F176A40; Sat, 26 Aug 2017 00:28:54 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail110.syd.optusnet.com.au (Postfix) with ESMTPS id 84DA51020B7; Sat, 26 Aug 2017 10:28:51 +1000 (AEST) Date: Sat, 26 Aug 2017 10:28:50 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Don Lewis cc: avg@freebsd.org, freebsd-arch@freebsd.org Subject: Re: ULE steal_idle questions In-Reply-To: <201708251824.v7PIOA6q048321@gw.catspoiler.org> Message-ID: <20170826094725.G1648@besplex.bde.org> References: <201708251824.v7PIOA6q048321@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=LI0WeNe9 c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=uoTqs28qTOXnMGlVIiMA:9 a=CjuIK1q_8ugA:10 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Aug 2017 00:28:55 -0000
On Fri, 25 Aug 2017, Don Lewis wrote:
> ...
> Something else that I did not expect is how frequently threads are stolen from the other SMT thread on the same core, even though I increased steal_thresh from 2 to 3 to account for the off-by-one problem. This is true even right after the system has booted and no significant load has been applied. My best guess is that because of affinity, both the parent and child processes run on the same CPU after fork(), and if a number of processes are forked() in quick succession, the run queue of that CPU can get really long. Forcing a thread migration in exec() might be a good solution.
Since you are trying a lot of combinations, maybe you can tell us which ones work best. SCHED_4BSD works better for me on an old 2-core system.
SCHED_ULE works better on a not-so-old 4x2 core (Haswell) system, but I don't like it due to its complexity. It makes differences of at most +-2%, except when mistuned it can give -5% for real time (but better for CPU and presumably power).
For SCHED_4BSD, I wrote fancy tuning for fork/exec and sometimes get everything to line up for a 3% improvement (803 seconds instead of 823 on the old system, with -current much slower at 840+ and old versions of ULE before steal_idle taking 890+). This is very resource (mainly cache associativity?) dependent and my tuning makes little difference on the newer system. SCHED_ULE still has bugfeatures which tend to help large builds by reducing context switching, e.g., by bogusly clamping all CPU-bound threads to nearly maximal priority.
Bruce
From owner-freebsd-arch@freebsd.org Sat Aug 26 17:50:25 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 02D06DD6A9B for ; Sat, 26 Aug 2017 17:50:25 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id B683072746; Sat, 26 Aug 2017 17:50:24 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id v7QHoG2c053745; Sat, 26 Aug 2017 10:50:20 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201708261750.v7QHoG2c053745@gw.catspoiler.org> Date: Sat, 26 Aug 2017 10:50:16 -0700 (PDT) From: Don Lewis Subject: Re: ULE steal_idle questions To: avg@FreeBSD.org cc: freebsd-arch@FreeBSD.org In-Reply-To: <201708251824.v7PIOA6q048321@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Aug 2017 17:50:25 -0000
On 25 Aug, To: avg@FreeBSD.org wrote:
> On 24 Aug, To: avg@FreeBSD.org wrote:
>> Aside from the Ryzen problem, I think the steal_idle code should be re-written so that it doesn't block interrupts for so long. In its current state, interrupt latency increases with the number of cores and the complexity of the topology.
>> What I'm thinking is that we should set a flag at the start of the search for a thread to steal. If we are preempted by another, higher priority thread, that thread will clear the flag. Next we start the loop to search up the hierarchy. Once we find a candidate CPU:
>>
>> 	steal = TDQ_CPU(cpu);
>> 	CPU_CLR(cpu, &mask);
>> 	tdq_lock_pair(tdq, steal);
>> 	if (tdq->tdq_load != 0) {
>> 		goto out; /* to exit loop and switch to the new thread */
>> 	}
>> 	if (flag was cleared) {
>> 		tdq_unlock_pair(tdq, steal);
>> 		goto restart; /* restart the search */
>> 	}
>> 	if (steal->tdq_load < thresh || steal->tdq_transferable == 0 ||
>> 	    tdq_move(steal, tdq) == 0) {
>> 		tdq_unlock_pair(tdq, steal);
>> 		continue;
>> 	}
>> out:
>> 	TDQ_UNLOCK(steal);
>> 	clear flag;
>> 	mi_switch(SW_VOL | SWT_IDLE, NULL);
>> 	thread_unlock(curthread);
>> 	return (0);
>>
>> And we also have to clear the flag if we did not find a thread to steal.
> I've implemented something like this and added a bunch of counters to it to get a better understanding of its behavior. Instead of adding a flag to detect preemption, I used the same switchcnt test as is used by sched_idletd(). These are the results of a ~9 hour poudriere run:
>
> kern.sched.steal.none: 9971668     # no threads were stolen
> kern.sched.steal.fail: 23709       # unable to steal from cpu=sched_highest()
> kern.sched.steal.level2: 191839    # somewhere on this chip
> kern.sched.steal.level1: 557659    # a core on this CCX
> kern.sched.steal.level0: 4555426   # the other SMT thread on this core
> kern.sched.steal.restart: 404      # preemption detected so restart the search
> kern.sched.steal.call: 15276638    # of times tdq_idled() called
>
> There are a few surprises here.
>
> One is the number of failed moves. I don't know if the load on the source CPU fell below thresh, tdq_transferable went to zero, or if tdq_move() failed. I also wonder if the failures are evenly distributed across CPUs. It is possible that these failures are concentrated on CPU 0, which handles most interrupts. If interrupts don't affect switchcnt, then the data collected by sched_highest() could be a bit stale and we would not know it.
Most of the above failed moves were due to either tdq_load dropping below the threshold or tdq_transferable going to zero. These are evenly distributed across CPUs that we want to steal from. I did not bin the results by which CPU this code was running on. Actual failures of tdq_move() are bursty and not evenly distributed across CPUs.
I've created this review for my changes: https://reviews.freebsd.org/D12130
From owner-freebsd-arch@freebsd.org Sat Aug 26 18:12:11 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B491BDD761C for ; Sat, 26 Aug 2017 18:12:11 +0000 (UTC) (envelope-from freebsd-rwg@pdx.rh.CN85.dnsmgr.net) Received: from pdx.rh.CN85.dnsmgr.net (br1.CN84in.dnsmgr.net [69.59.192.140]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 70C56739B5; Sat, 26 Aug 2017 18:12:11 +0000 (UTC) (envelope-from freebsd-rwg@pdx.rh.CN85.dnsmgr.net) Received: from pdx.rh.CN85.dnsmgr.net (localhost [127.0.0.1]) by pdx.rh.CN85.dnsmgr.net (8.13.3/8.13.3) with ESMTP id v7QIC2ZB074444; Sat, 26 Aug 2017 11:12:02 -0700 (PDT) (envelope-from freebsd-rwg@pdx.rh.CN85.dnsmgr.net) Received: (from freebsd-rwg@localhost) by pdx.rh.CN85.dnsmgr.net (8.13.3/8.13.3/Submit) id v7QIC2eJ074443; Sat, 26 Aug 2017 11:12:02 -0700 (PDT) (envelope-from freebsd-rwg) From: "Rodney W. Grimes" Message-Id: <201708261812.v7QIC2eJ074443@pdx.rh.CN85.dnsmgr.net> Subject: Re: ULE steal_idle questions In-Reply-To: <20170826094725.G1648@besplex.bde.org> To: Bruce Evans Date: Sat, 26 Aug 2017 11:12:02 -0700 (PDT) CC: Don Lewis , avg@freebsd.org, freebsd-arch@freebsd.org X-Mailer: ELM [version 2.4ME+ PL121h (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Aug 2017 18:12:11 -0000
> On Fri, 25 Aug 2017, Don Lewis wrote:
> > ...
> > Something else that I did not expect is how frequently threads are stolen from the other SMT thread on the same core, even though I increased steal_thresh from 2 to 3 to account for the off-by-one problem. This is true even right after the system has booted and no significant load has been applied. My best guess is that because of affinity, both the parent and child processes run on the same CPU after fork(), and if a number of processes are forked() in quick succession, the run queue of that CPU can get really long. Forcing a thread migration in exec() might be a good solution.
>
> Since you are trying a lot of combinations, maybe you can tell us which ones work best. SCHED_4BSD works better for me on an old 2-core system. SCHED_ULE works better on a not-so old 4x2 core (Haswell) system, but I don't like it due to its complexity. It makes differences of at most +-2%, except when mistuned it can give -5% for real time (but better for CPU and presumably power).
>
> For SCHED_4BSD, I wrote fancy tuning for fork/exec and sometimes get everything to line up for a 3% improvement (803 seconds instead of 823 on the old system, with -current much slower at 840+ and old versions of ULE before steal_idle taking 890+). This is very resource (mainly cache associativity?) dependent and my tuning makes little difference on the newer system. SCHED_ULE still has bugfeatures which tend to help large builds by reducing context switching, e.g., by bogusly clamping all CPU-bound threads to nearly maximal priority.

That last bugfeature is probably what makes current systems' interactive performance tank rather badly when under heavy loads. Would it be hard to fix?

--
Rod Grimes rgrimes@freebsd.org

From owner-freebsd-arch@freebsd.org Sat Aug 26 18:18:18 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AEB0BDD786F for ; Sat, 26 Aug 2017 18:18:18 +0000 (UTC) (envelope-from ian@freebsd.org) Received: from outbound1b.ore.mailhop.org (outbound1b.ore.mailhop.org [54.200.247.200]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8D21A73EA3 for ; Sat, 26 Aug 2017 18:18:18 +0000 (UTC) (envelope-from ian@freebsd.org) X-MHO-User: f1f4ae47-8a8a-11e7-950d-03a3531dacf2 X-Report-Abuse-To: https://support.duocircle.com/support/solutions/articles/5000540958-duocircle-standard-smtp-abuse-information X-Originating-IP: 73.78.92.27 X-Mail-Handler: DuoCircle Outbound SMTP Received: from ilsoft.org (unknown [73.78.92.27]) by outbound1.ore.mailhop.org (Halon) with ESMTPSA id f1f4ae47-8a8a-11e7-950d-03a3531dacf2; Sat, 26 Aug 2017 18:18:23 +0000 (UTC) Received: from rev (rev [172.22.42.240]) by ilsoft.org (8.15.2/8.15.2) with ESMTP id v7QIIEmO006472; Sat, 26 Aug 2017 12:18:14 -0600 (MDT) (envelope-from ian@freebsd.org) Message-ID: <1503771494.56799.49.camel@freebsd.org> Subject: Re: ULE steal_idle questions From: Ian Lepore To: "Rodney W.
Grimes" , Bruce Evans Cc: Don Lewis , avg@freebsd.org, freebsd-arch@freebsd.org Date: Sat, 26 Aug 2017 12:18:14 -0600 In-Reply-To: <201708261812.v7QIC2eJ074443@pdx.rh.CN85.dnsmgr.net> References: <201708261812.v7QIC2eJ074443@pdx.rh.CN85.dnsmgr.net> Content-Type: text/plain; charset="ISO-8859-1" X-Mailer: Evolution 3.18.5.1 FreeBSD GNOME Team Port Mime-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Aug 2017 18:18:18 -0000 On Sat, 2017-08-26 at 11:12 -0700, Rodney W. Grimes wrote: > > > > On Fri, 25 Aug 2017, Don Lewis wrote: > > > > > > > > ... > > > Something else that I did not expect is the how frequently > > > threads are > > > stolen from the other SMT thread on the same core, even though I > > > increased steal_thresh from 2 to 3 to account for the off-by-one > > > problem.  This is true even right after the system has booted and > > > no > > > significant load has been applied.  My best guess is that because > > > of > > > affinity, both the parent and child processes run on the same CPU > > > after > > > fork(), and if a number of processes are forked() in quick > > > succession, > > > the run queue of that CPU can get really long.  Forcing a thread > > > migration in exec() might be a good solution. > > Since you are trying a lot of combinations, maybe you can tell us > > which > > ones work best.  SCHED_4BSD works better for me on an old 2-core > > system. > > SCHED_ULE works better on a not-so old 4x2 core (Haswell) system, > > but I  > > don't like it due to its complexity.  It makes differences of at > > most > > +-2% except when mistuned it can give -5% for real time (but better > > for > > CPU and presumably power). > > > > For SCHED_4BSD, I wrote fancy tuning for fork/exec and sometimes > > get > > everything to like up for a 3% improvement (803 seconds instead of > > 823 > > on the old system, with -current much slower at 840+ and old > > versions > > of ULE before steal_idle taking 890+).  This is very resource > > (mainly > > cache associativity?) dependent and my tuning makes little > > difference > > on the newer system.  SCHED_ULE still has bugfeatures which tend to > > help large builds by reducing context switching, e.g., by bogusly > > clamping all CPU-bound threads to nearly maximal priority. > That last bugfeature is probably what makes current systems > interactive performance tank rather badly when under heavy > loads.  Would it be hard to fix? > I would second that sentiment... as time goes on, heavily loaded systems seem to become less and less interactive-friendly.  Also, running the heavy-load jobs such as builds with nice, even -n 20, doesn't seem to make any noticible difference in terms of making un- nice'd processes more responsive (not sure there's any relationship in the underlying causes of that, though). 
--
Ian

From owner-freebsd-arch@freebsd.org Sat Aug 26 18:29:39 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6C644DD7D7E for ; Sat, 26 Aug 2017 18:29:39 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 2CA7D74633; Sat, 26 Aug 2017 18:29:39 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id v7QITTYw053896; Sat, 26 Aug 2017 11:29:33 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201708261829.v7QITTYw053896@gw.catspoiler.org> Date: Sat, 26 Aug 2017 11:29:29 -0700 (PDT) From: Don Lewis Subject: Re: ULE steal_idle questions To: freebsd-rwg@pdx.rh.CN85.dnsmgr.net cc: brde@optusnet.com.au, avg@freebsd.org, freebsd-arch@freebsd.org In-Reply-To: <201708261812.v7QIC2eJ074443@pdx.rh.CN85.dnsmgr.net> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Aug 2017 18:29:39 -0000

On 26 Aug, Rodney W. Grimes wrote:
>> On Fri, 25 Aug 2017, Don Lewis wrote:
>> > ...
>> > Something else that I did not expect is how frequently threads are stolen from the other SMT thread on the same core, even though I increased steal_thresh from 2 to 3 to account for the off-by-one problem. This is true even right after the system has booted and no significant load has been applied. My best guess is that because of affinity, both the parent and child processes run on the same CPU after fork(), and if a number of processes are forked() in quick succession, the run queue of that CPU can get really long. Forcing a thread migration in exec() might be a good solution.
>>
>> Since you are trying a lot of combinations, maybe you can tell us which ones work best. SCHED_4BSD works better for me on an old 2-core system. SCHED_ULE works better on a not-so old 4x2 core (Haswell) system, but I don't like it due to its complexity. It makes differences of at most +-2%, except when mistuned it can give -5% for real time (but better for CPU and presumably power).
>>
>> For SCHED_4BSD, I wrote fancy tuning for fork/exec and sometimes get everything to line up for a 3% improvement (803 seconds instead of 823 on the old system, with -current much slower at 840+ and old versions of ULE before steal_idle taking 890+). This is very resource (mainly cache associativity?) dependent and my tuning makes little difference on the newer system. SCHED_ULE still has bugfeatures which tend to help large builds by reducing context switching, e.g., by bogusly clamping all CPU-bound threads to nearly maximal priority.
>
> That last bugfeature is probably what makes current systems' interactive performance tank rather badly when under heavy loads. Would it be hard to fix?

I actually haven't noticed that problem on my package build boxes.
I've experienced decent interactive performance even when the load average is in the 60 to 80 range. I also have poudriere configured to use tmpfs and the only issue I run into is when it starts getting heavily into swap (like 20G) and I leave my session idle for a while, which lets my shell and sshd get swapped out. Then it takes them a while to wake up again. Once they are paged in, then things feel snappy again. This is remote access, so I can't comment on what X11 feels like.

From owner-freebsd-arch@freebsd.org Sat Aug 26 18:46:55 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 82EA4DD87D5 for ; Sat, 26 Aug 2017 18:46:55 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1A4F175AD5; Sat, 26 Aug 2017 18:46:54 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v7QIkoiL073907 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Sat, 26 Aug 2017 21:46:50 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v7QIkoiL073907 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id v7QIkogv073906; Sat, 26 Aug 2017 21:46:50 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 26 Aug 2017 21:46:50 +0300 From: Konstantin Belousov To: Don Lewis Cc: freebsd-rwg@pdx.rh.CN85.dnsmgr.net, avg@freebsd.org, freebsd-arch@freebsd.org Subject: Re: ULE steal_idle questions Message-ID: <20170826184650.GS1700@kib.kiev.ua> References: <201708261812.v7QIC2eJ074443@pdx.rh.CN85.dnsmgr.net> <201708261829.v7QITTYw053896@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201708261829.v7QITTYw053896@gw.catspoiler.org> User-Agent: Mutt/1.8.3 (2017-05-23) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Aug 2017 18:46:55 -0000

On Sat, Aug 26, 2017 at 11:29:29AM -0700, Don Lewis wrote:
> I actually haven't noticed that problem on my package build boxes. I've experienced decent interactive performance even when the load average is in the 60 to 80 range. I also have poudriere configured to use tmpfs and the only issue I run into is when it starts getting heavily into swap (like 20G) and I leave my session idle for a while, which lets my shell and sshd get swapped out. Then it takes them a while to wake up again. Once they are paged in, then things feel snappy again. This is remote access, so I can't comment on what X11 feels like.

I believe what people complain about is the following scenario: they have some interactive long-lived process, say firefox or mplayer.
The process' threads consume CPU cycles, so the ULE interactivity detection logic actually classifies the threads as non-interactive.

This is not much of a problem until a parallel build starts, where toolchain processes are typically short-lived. This makes them classified as interactive, and their dynamic priorities are lower than the priorities of the long-lived threads which are interactive by user perception.

I have not analyzed the KTR dumps, but this explanation more or less coincides with the system sluggishness seen when attempting to use mplayer while a heavily oversubscribed build (e.g. make -j 10 on a 4-core x 2-SMT machine) is started.

From owner-freebsd-arch@freebsd.org Sat Aug 26 19:47:52 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2308DDD9A2A for ; Sat, 26 Aug 2017 19:47:52 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id E965C77371; Sat, 26 Aug 2017 19:47:51 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id v7QJle5Q054291; Sat, 26 Aug 2017 12:47:44 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201708261947.v7QJle5Q054291@gw.catspoiler.org> Date: Sat, 26 Aug 2017 12:47:40 -0700 (PDT) From: Don Lewis Subject: Re: ULE steal_idle questions To: kostikbel@gmail.com cc: freebsd-rwg@pdx.rh.CN85.dnsmgr.net, avg@freebsd.org, freebsd-arch@freebsd.org In-Reply-To: <20170826184650.GS1700@kib.kiev.ua> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Aug 2017 19:47:52 -0000

On 26 Aug, Konstantin Belousov wrote:
> On Sat, Aug 26, 2017 at 11:29:29AM -0700, Don Lewis wrote:
>> I actually haven't noticed that problem on my package build boxes. I've experienced decent interactive performance even when the load average is in the 60 to 80 range. I also have poudriere configured to use tmpfs and the only issue I run into is when it starts getting heavily into swap (like 20G) and I leave my session idle for a while, which lets my shell and sshd get swapped out. Then it takes them a while to wake up again. Once they are paged in, then things feel snappy again. This is remote access, so I can't comment on what X11 feels like.
>
> I believe what people complain about is the following scenario: they have some interactive long-lived process, say firefox or mplayer. The process' threads consume CPU cycles, so the ULE interactivity detection logic actually classifies the threads as non-interactive.
>
> This is not much of a problem until a parallel build starts, where toolchain processes are typically short-lived. This makes them classified as interactive, and their dynamic priorities are lower than the priorities of the long-lived threads which are interactive by user perception.
>
> I have not analyzed the KTR dumps, but this explanation more or less coincides with the system sluggishness seen when attempting to use mplayer while a heavily oversubscribed build (e.g. make -j 10 on a 4-core x 2-SMT machine) is started.

I can believe that. I keep an excessive number of tabs open in firefox and it would frequently get into a state where it would consume 100% of a CPU core. Very recent versions of firefox are a lot better.

Xorg is another possible victim. I've just noticed that when certain windows have mouse focus (firefox being one, wish-based apps are another), the Xorg %CPU goes to 80%-90%. I think this crept in with the latest MATE upgrade. If Xorg is treated as non-interactive, then the desktop experience is going to be less than optimal if there is competing load.

From owner-freebsd-arch@freebsd.org Sat Aug 26 19:58:48 2017 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 301B6DD9C8E for ; Sat, 26 Aug 2017 19:58:48 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id E5733776DE; Sat, 26 Aug 2017 19:58:47 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id v7QJwbGK054320; Sat, 26 Aug 2017 12:58:41 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201708261958.v7QJwbGK054320@gw.catspoiler.org> Date: Sat, 26 Aug 2017 12:58:37 -0700 (PDT) From: Don Lewis Subject: Re: ULE steal_idle questions To: kostikbel@gmail.com cc: freebsd-rwg@pdx.rh.CN85.dnsmgr.net, avg@freebsd.org, freebsd-arch@freebsd.org In-Reply-To: <201708261947.v7QJle5Q054291@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Aug 2017 19:58:48 -0000

On 26 Aug, To: kostikbel@gmail.com wrote:
> On 26 Aug, Konstantin Belousov wrote:
>> On Sat, Aug 26, 2017 at 11:29:29AM -0700, Don Lewis wrote:
>>> I actually haven't noticed that problem on my package build boxes. I've experienced decent interactive performance even when the load average is in the 60 to 80 range. I also have poudriere configured to use tmpfs and the only issue I run into is when it starts getting heavily into swap (like 20G) and I leave my session idle for a while, which lets my shell and sshd get swapped out. Then it takes them a while to wake up again. Once they are paged in, then things feel snappy again. This is remote access, so I can't comment on what X11 feels like.
>>
>> I believe what people complain about is the following scenario: they have some interactive long-lived process, say firefox or mplayer. The process' threads consume CPU cycles, so the ULE interactivity detection logic actually classifies the threads as non-interactive.
>>
>> This is not much of a problem until a parallel build starts, where toolchain processes are typically short-lived.
>> This makes them classified as interactive, and their dynamic priorities are lower than the priorities of the long-lived threads which are interactive by user perception.
>>
>> I have not analyzed the KTR dumps, but this explanation more or less coincides with the system sluggishness seen when attempting to use mplayer while a heavily oversubscribed build (e.g. make -j 10 on a 4-core x 2-SMT machine) is started.
>
> I can believe that. I keep an excessive number of tabs open in firefox and it would frequently get into a state where it would consume 100% of a CPU core. Very recent versions of firefox are a lot better.
>
> Xorg is another possible victim. I've just noticed that when certain windows have mouse focus (firefox being one, wish-based apps are another), the Xorg %CPU goes to 80%-90%. I think this crept in with the latest MATE upgrade. If Xorg is treated as non-interactive, then the desktop experience is going to be less than optimal if there is competing load.

I've got poudriere running right now on my primary package build box. The priorities of the compiler processes are currently in the range of 74-96. On my desktop, firefox is running at priority 24. Xorg, when it is not being a CPU hog, gets all the way down to priority 20. When the mouse is pointing to one of the windows that makes it go nuts, then it gets all the way up to priority 98.
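The classification Konstantin describes falls out of how ULE scores interactivity from voluntary sleep time versus run time. Below is a toy model assuming the usual formulation of sched_interact_score(): the score runs 0..100, staying under 50 when sleep time exceeds run time and rising above 50 when run time dominates, and only scores under kern.sched.interact (default 30) count as interactive. It is a sketch of the idea, not the kernel code.

    #include <stdio.h>

    #define INTERACT_HALF 50        /* midpoint of the 0..100 score range */

    static unsigned long
    at_least_one(unsigned long v)
    {
            return (v > 1 ? v : 1);
    }

    /*
     * Assumed shape of the score: mostly-sleeping threads land near 0,
     * CPU-bound threads near 100, balanced threads at 50.
     */
    static int
    interact_score(unsigned long runtime, unsigned long slptime)
    {
            if (runtime > slptime)
                    return (INTERACT_HALF + (INTERACT_HALF -
                        slptime / at_least_one(runtime / INTERACT_HALF)));
            if (slptime > runtime)
                    return (runtime / at_least_one(slptime / INTERACT_HALF));
            return (runtime != 0 ? INTERACT_HALF : 0);
    }

    int
    main(void)
    {
            /* Long-lived player that keeps a core busy: scores ~95, batch. */
            printf("mplayer-ish: %d\n", interact_score(900, 100));
            /* Compiler that slept on fork/exec and I/O: scores ~5, interactive. */
            printf("cc1-ish:     %d\n", interact_score(100, 900));
            return (0);
    }

On that model the numbers above fit the explanation: processes that keep burning CPU (the compilers on the build box, or Xorg when it misbehaves) accumulate run time and drift to high timeshare priorities, while the same Xorg and firefox sit at low priorities as long as they mostly sleep.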