Message-Id: <201708261829.v7QITTYw053896@gw.catspoiler.org>
Date: Sat, 26 Aug 2017 11:29:29 -0700 (PDT)
From: Don Lewis
Subject: Re: ULE steal_idle questions
To: freebsd-rwg@pdx.rh.CN85.dnsmgr.net
cc: brde@optusnet.com.au, avg@freebsd.org, freebsd-arch@freebsd.org
In-Reply-To: <201708261812.v7QIC2eJ074443@pdx.rh.CN85.dnsmgr.net>

On 26 Aug, Rodney W. Grimes wrote:
>> On Fri, 25 Aug 2017, Don Lewis wrote:
>>
>> > ...
>> > Something else that I did not expect is how frequently threads are
>> > stolen from the other SMT thread on the same core, even though I
>> > increased steal_thresh from 2 to 3 to account for the off-by-one
>> > problem.  This is true even right after the system has booted and no
>> > significant load has been applied.  My best guess is that because of
>> > affinity, both the parent and child processes run on the same CPU after
>> > fork(), and if a number of processes are forked() in quick succession,
>> > the run queue of that CPU can get really long.  Forcing a thread
>> > migration in exec() might be a good solution.
>>
>> Since you are trying a lot of combinations, maybe you can tell us which
>> ones work best.  SCHED_4BSD works better for me on an old 2-core system.
>> SCHED_ULE works better on a not-so-old 4x2 core (Haswell) system, but I
>> don't like it due to its complexity.  It makes differences of at most
>> +-2% except when mistuned it can give -5% for real time (but better for
>> CPU and presumably power).
>>
>> For SCHED_4BSD, I wrote fancy tuning for fork/exec and sometimes get
>> everything to line up for a 3% improvement (803 seconds instead of 823
>> on the old system, with -current much slower at 840+ and old versions
>> of ULE before steal_idle taking 890+).  This is very resource (mainly
>> cache associativity?) dependent and my tuning makes little difference
>> on the newer system.  SCHED_ULE still has bugfeatures which tend to
>> help large builds by reducing context switching, e.g., by bogusly
>> clamping all CPU-bound threads to nearly maximal priority.
>
> That last bugfeature is probably what makes current systems'
> interactive performance tank rather badly when under heavy
> loads.  Would it be hard to fix?

I actually haven't noticed that problem on my package build boxes.  I've
experienced decent interactive performance even when the load average is
in the 60 to 80 range.  I also have poudriere configured to use tmpfs,
and the only issue I run into is when it starts getting heavily into
swap (like 20G) and I leave my session idle for a while, which lets my
shell and sshd get swapped out.  Then it takes them a while to wake up
again.  Once they are paged in, things feel snappy again.  This is
remote access, so I can't comment on what X11 feels like.