Date: Fri, 31 Mar 2023 14:33:16 -0700 (PDT)
From: Jeff Roberson <jroberson@jroberson.net>
To: freebsd-hackers@freebsd.org
Subject: Re: ULE process to resolution
Message-ID: <11380305-6261-6c08-fd15-299e695fa342@jroberson.net>
In-Reply-To: <a6066590-0b4d-b332-102a-9c2432cdfec6@jroberson.net>
References: <a6066590-0b4d-b332-102a-9c2432cdfec6@jroberson.net>

I found an old patch of mine that addresses some of the issues with rapid
sleeping/waking batch processes here:

https://reviews.freebsd.org/D15985

Seems there are some bits relevant to behavior described earlier on
hackers@. I was not subscribed to this list so I can't reply to the
specific message.

Jeff

On Fri, 31 Mar 2023, Jeff Roberson wrote:

> Hi Folks,
>
> For those who don't know, I am the original author of ULE. I have not had
> much time for FreeBSD in recent years but this thread was forwarded to me and
> I am disheartened at the state of things. I will give my perspective and
> propose a path to resolve this systematically.
>
> The fundamental benefit of ULE is also the fundamental challenge. That is: N
> CPU-local decisions need to add up to a reasonable approximation of a correct
> global decision. This is necessary to scale to large core counts, large
> thread counts, and preserve some affinity. You could permute 4BSD further
> towards these goals but I posit that you would simply have to work through
> the same bugs.
>
> As I read these threads I can state with a high degree of confidence that
> many of these tests worked with superior results with ULE at one time. It may
> be that tradeoffs have changed or exposed weaknesses; it may also be that
> it's simply been broken over time. I see a large number of commits intended
> to address point issues and wonder whether we adequately explored the
> consequences. Indeed I see solutions involving tunables proposed here that
> will definitively break other cases.
>
> I know that CPU tradeoffs have changed. ULE was written in a way that the
> topology could be annotated and cost of migration can be specified. It is
> adaptable to this but someone has to put in the effort. The cost function
> was written in ticks, which does not scale down properly, and accurate CPU
> tick counters could now be used for more precise time-keeping for more
> specific affinity.
> Over time people have also added additional searches to pickcpu
> which don't scale well to very high core count systems. NUMA and
> heterogeneous CPUs are also possible in the graph framework but need further
> investment.
>
> The other thing that has changed over time is the ability of the
> interactivity score to correctly detect truly interactive applications. When
> I wrote it you could do a buildworld on a single core or small multi-core
> system and play mp3s and browse the web without a hiccup. However, web
> browsers have evolved to be significantly more resource intensive. I'm not
> sure a heuristic can or should catch this case. We're probably long overdue
> to add X window focus hints as most other operating systems do. I don't
> think tossing the interactivity score is really going to produce the desired
> results. Linux CFS disagrees with me but I have always been able to achieve
> superior responsiveness with ULE. My intuition is that with an X window focus
> hint we could dial back the interactive threshold and have better tradeoffs
> with the soft real-time score.
>
> schedgraph is also no longer adequate for modern systems. In my professional
> life I have taken the same types of data sources and built text-based
> processes on top because graphical representations just can't scale to the
> number of events and cores for full system scheduling. For complex
> scheduling issues you need detailed introspection. You're not going to tweak
> variables and run buildworlds to arrive at success by supposition with any
> kind of reasonable velocity.
>
> The first step to resolving this is to come up with a list of regression
> tests and catalog how they behave compared to 4BSD. When I wrote the
> scheduler I also wrote a simple fixed duty cycle program that could be run
> with different scheduling parameters and report on its CPU usage and latency.
> Combining many copies of this program you can simulate various kinds of
> interactions.
> It is available at people.freebsd.org/~jeff/late.tgz. I know
> there is also a Linux scheduler benchmark that may be worth porting.
>
> If someone would start making regression tests I am happy to fix bugs or
> review bug fixes. Personally I would start from fairness given different
> nice values on a single CPU, and then multi-CPU. Evaluate allocation with
> variation on load-to-core-count ratios. It should not take more than a few
> hours to iterate through the interesting cases here before going on to more
> complex questions about buildworld or Firefox etc. This would need to be
> something we carried forward in the source tree and ask people to re-run as
> part of scheduler CRs or we're just going to find ourselves back in this
> spot again.
>
> I also have a backlog of improvements for large multi-core systems from work
> I did years ago that have not made it into the tree. And I have an old
> review for patches to improve the reliability of priority in causing
> scheduling events that may be germane. If we can collaborate on a testing
> framework I could trickle these in.
>
> Thanks,
> Jeff
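
As a rough illustration of the fixed duty cycle idea described in the quoted
message, the Python sketch below alternates a busy phase and a sleep phase and
reports CPU usage and wakeup latency. It is not the program in late.tgz, whose
contents are not shown in this message. The function name `run_duty_cycle` and
its parameters are hypothetical, and Python timing is much coarser than what a
C test program could measure.

```python
import time


def run_duty_cycle(busy_s, period_s, cycles):
    """Spin for busy_s out of every period_s seconds, for `cycles` periods.

    Returns (cpu_seconds, wall_seconds, latencies). Each latency is how late
    the thread resumed relative to the start of its next period.
    Hypothetical helper, for illustration only.
    """
    if not 0 < busy_s <= period_s:
        raise ValueError("need 0 < busy_s <= period_s")
    latencies = []
    cpu_start = time.process_time()
    wall_start = time.monotonic()
    deadline = wall_start
    for _ in range(cycles):
        # Busy phase: spin for busy_s of wall-clock time. Under contention the
        # CPU time actually received may be lower, which cpu_seconds reveals.
        busy_until = time.monotonic() + busy_s
        while time.monotonic() < busy_until:
            pass
        # Idle phase: sleep until the next period boundary.
        deadline += period_s
        remaining = deadline - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)
        # Wakeup latency: how far past the boundary we actually resumed.
        latencies.append(max(0.0, time.monotonic() - deadline))
    cpu_s = time.process_time() - cpu_start
    wall_s = time.monotonic() - wall_start
    return cpu_s, wall_s, latencies


if __name__ == "__main__":
    cpu_s, wall_s, lat = run_duty_cycle(busy_s=0.002, period_s=0.010, cycles=100)
    print(f"cpu share: {cpu_s / wall_s:.1%} (target 20.0%)")
    print(f"max wakeup latency: {max(lat) * 1e3:.3f} ms")
```

Running several copies of such a script under different nice(1) values on a
single CPU and comparing the reported CPU shares would be one way to
approximate the fairness checks proposed above. The duty cycle values shown
are arbitrary.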