Date:      Thu, 30 Mar 2023 12:05:01 -0700
From:      Kevin Bowling <kevin.bowling@kev009.com>
To:        Mateusz Guzik <mjguzik@gmail.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Periodic rant about SCHED_ULE
Message-ID:  <CAK7dMtAG%2BeTTMijH71yFoJ%2BmeFwc32fnWvLp2R56x%2B1azjVfPg@mail.gmail.com>
In-Reply-To: <CAGudoHGNRQudBUj-XCNb=AirOgWn%2Bs0mi96v3axK4ZZSSdybZA@mail.gmail.com>
References:  <a401e51a-250a-64a0-15cb-ff79bcefbf94@m5p.com> <8173cc7e-e934-dd5c-312a-1dfa886941aa@FreeBSD.org> <8cfdb951-9b1f-ecd3-2291-7a528e1b042c@m5p.com> <c3f5f667-ba0b-c40c-b8a6-19d1c9c63c5f@FreeBSD.org> <ZBtRJhNHluj5Nzyk@troutmask.apl.washington.edu> <CAGudoHEj%2BkoaYhkjzDE5KX9OsCno=X5M_E3z9uwg6Pg7dtqTsA@mail.gmail.com> <CAGudoHHxTT-Cn11zcFB3ZwF76UcRUv=QS28RLgzd=hVehTy0Kg@mail.gmail.com> <CAGudoHGoh30O-3O0jjwevDvP43-ykUt6JUDiwRNW918VZfybhA@mail.gmail.com> <CAGudoHEWfy61XSMhXdYOrKWVotuC0Kc6NSWiaaZCy6aQhbvXoQ@mail.gmail.com> <CAGudoHFPqz_LtsVNnz4P2gyKXz5Z8hU%2Bv6QYGizm4%2BDtZRn8Yg@mail.gmail.com> <CAGudoHGzBjXjXZFs%2BqZJUS-M6VeX5=LB2ifRLP7hFBZXPvqP7g@mail.gmail.com> <CAK7dMtAsBehP2cy6cn31Z%2BSo6T2Q_mpN5ibEmYMNPOkWQHk8FA@mail.gmail.com> <CAK7dMtAHOpq579Hb_Ar3e-VaYZGfawAxBMDoM0Zt0T_=UD5Jgw@mail.gmail.com> <CAGudoHGNRQudBUj-XCNb=AirOgWn%2Bs0mi96v3axK4ZZSSdybZA@mail.gmail.com>

On Thu, Mar 30, 2023 at 11:50 AM Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> On 3/30/23, Kevin Bowling <kevin.bowling@kev009.com> wrote:
> > On Thu, Mar 30, 2023 at 11:29 AM Kevin Bowling <kevin.bowling@kev009.com>
> > wrote:
> >>
> >> On Thu, Mar 30, 2023 at 8:37 AM Mateusz Guzik <mjguzik@gmail.com> wrote:
> >> >
> >> > I looked into it a little more, below you can find summary and steps
> >> > forward.
> >> >
> >> > First a general statement: while ULE does have performance bugs, it
> >> > has a better basis than 4BSD to make scheduling decisions. Most notably
> >> > it understands CPU topology, at least for cases which don't involve
> >> > big.LITTLE. For any non-freak case where 4BSD performs better, it is a
> >> > bug in ULE if this is for any reason other than a tradeoff which can
> >> > be tweaked to line them up. Or more to the point, there should not be
> >> > any legitimate reason to use 4BSD these days and modulo the bugs
> >> > below, you are probably losing on performance for doing so.
> >>
> >> An elided simple algorithm for big.LITTLE, from Larry McVoy.. if you
> >> run for an entire quantum, flag preference for big core.  If you run
> >> for less or get punted off, flag for little core preference.
> >>
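To spell out the heuristic above, here is a minimal sketch in C. Everything
in it is hypothetical (the `core_pref` enum, the `task_hint` struct, the tick
counts); none of it is FreeBSD kernel API, it only illustrates the "full
quantum means big core, anything less means little core" rule.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-task placement hint; not a real kernel structure. */
enum core_pref { PREF_LITTLE, PREF_BIG };

struct task_hint {
	enum core_pref pref;
};

/*
 * Called when a task leaves the CPU.
 * ran_ticks:     how long the task actually ran in this slice.
 * quantum_ticks: length of a full quantum.
 * A task that consumed the whole quantum is flagged for a big core;
 * one that slept early or was punted off is flagged for a little core.
 */
static void
update_core_pref(struct task_hint *t, unsigned ran_ticks,
    unsigned quantum_ticks)
{
	assert(t != NULL);
	if (ran_ticks >= quantum_ticks)
		t->pref = PREF_BIG;
	else
		t->pref = PREF_LITTLE;
}
```

The hint is recomputed on every switch-out, so a task that changes behavior
flips its preference after a single quantum. A real scheduler would probably
want some hysteresis on top of this.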
> >> > Bugs reported in this thread by others and confirmed by me:
> >> > 1. failure to load-balance when having n CPUs and n + 1 workers -- the
> >> > excess one stays on the same CPU thread, continuously penalizing
> >> > the same victim. As a result, the total real time to execute a finite
> >> > computation is longer than in the case of 4BSD
> >> > 2. unfairness of nice -n 20 threads vs threads going frequently off
> >> > CPU (e.g., due to I/O) -- after using only a fraction of the slice the
> >> > victim has to wait for the cpu hog to use up its entire slice, rinse
> >> > and repeat. This extends a 7+ minute buildkernel to over 67 minutes,
> >> > not an issue on 4BSD
> >> >
> >> > I put almost no effort into investigating no. 1. There is code
> >> > which is supposed to rebalance load across CPUs, someone(tm) will have
> >> > to sit through it -- for all I know the fix is trivial.
> >> >
> >> > Fixing number 2 makes *another* bug more acute and it complicates the
> >> > whole ordeal.
> >> >
> >> > Thus, bug reported by me:
> >> > 3. interactivity scoring is bogus -- originally introduced to detect
> >> > "interactive" behavior by equating being off CPU with waiting for user
> >> > input. One part of the problem is that it puts *all* non-preempted off
> >> > CPU time into one bag: a voluntary sleep. This includes suffering from
> >> > lock contention in the kernel, lock contention in the program itself,
> >> > file I/O and so on, none of which has bearing on how interactive or
> >> > not the program might happen to be. A bigger part of the problem is
> >> > that at least today, the graphical programs don't even act this way to
> >> > begin with -- they stay on CPU *a lot*.
> >> >
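A rough model of the scoring problem described above, as a hedged sketch in C.
It is not the code from sched_ule.c: the 0..100 scale, the cutoff of 30 and
all names are assumptions for illustration. The point is only that the score
sees a run/sleep ratio and has no idea why the thread was off CPU.

```c
#include <assert.h>

#define INTERACT_MAX	100			/* assumed score range 0..100 */
#define INTERACT_HALF	(INTERACT_MAX / 2)
#define INTERACT_THRESH	30			/* assumed "interactive" cutoff */

/*
 * Score a thread from accumulated run time and voluntary sleep time
 * (same arbitrary unit, both non-negative). Lower means "more interactive".
 * Mostly sleeping lands below 50, mostly running lands above 50.
 */
static long
interact_score(long runtime, long slptime)
{
	long div;

	assert(runtime >= 0 && slptime >= 0);
	if (runtime > slptime) {
		div = runtime / INTERACT_HALF;
		if (div < 1)
			div = 1;
		return (INTERACT_HALF + (INTERACT_HALF - slptime / div));
	}
	if (slptime > runtime) {
		div = slptime / INTERACT_HALF;
		if (div < 1)
			div = 1;
		return (runtime / div);
	}
	return (runtime != 0 ? INTERACT_HALF : 0);
}

/* Nonzero if the thread would be treated as interactive in this model. */
static int
looks_interactive(long runtime, long slptime)
{
	return (interact_score(runtime, slptime) < INTERACT_THRESH);
}
```

In this model a thread that ran for 100 units and spent 900 units blocked on
a contended lock scores exactly the same as one that spent 900 units waiting
for a keypress, while a video player that stays on CPU most of the time
scores as a CPU hog.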
> >> > I asked people to provide me with the output of: dtrace -n
> >> > 'sched:::on-cpu { @[execname] = lquantize(curthread->td_priority, 0,
> >> > 224, 1); }' from their laptops/desktops.
> >> >
> >> > One finding is that most people (at least those who reported) use
> >> > firefox.
> >> >
> >> > Another finding is that the browser is above the threshold which would
> >> > be considered "interactive" for the vast majority of the time in all
> >> > reported cases.
> >> >
> >> > I booted a 2 thread vm with xfce and decided to click around. Spawned
> >> > firefox, opened a file manager (Thunar) and from there I opened a
> >> > movie to play with mpv. As root I spawned make -j 2 buildkernel. It
> >> > was not particularly good :)
> >> >
> >> > I found that mpv spawns a bunch of threads, most notably 2 distinct
> >> > threads for audio and video output. The one for video got a priority
> >> > of 175, while the rest had either 88 or 89 -- the lowest for
> >> > timesharing not considered interactive [note lower is considered
> >> > better].
> >> >
> >> > At the same time the file manager, which was left in the background,
> >> > kept doing evil syscall usage and as a result kept bouncing between a
> >> > regular timesharing priority and one which made it "interactive", even
> >> > though the program was not touched for minutes.
> >> >
> >> > Or to put it differently, the scheduler failed to recognize that mpv
> >> > is the program to prioritize, all while thinking the background time
> >> > waster is the thing to look after (so to speak).
> >> >
> >> > This brings us to fixing problem 2: currently, due to the existence of
> >> > said problem, the interactivity scoring woes are less acute -- the
> >> > venerable make -j example is struggling to get CPU time, as a result
> >> > messing with real interactive programs to a lesser extent. If that
> >> > gets fixed, we are in a different boat altogether.
> >> >
> >> > I don't see a clean solution.
> >
> > One other random anecdote.  Windows 11 uses window focus to highly
> > boost scheduling priority in an obviously effective way.  I have no
> > idea how difficult something like that would be to fit into the unix
> > world.
> >
>
> I thought about doing something like that, but I consider it dodgy.
> Imagine you play some crap from youtube while messing around in a text
> editor -- I'm pretty sure the former is more prone to disturbance from
> scheduling changes.
>
> Anyhow after sending the above e-mail an actual solution hit me: the X
> server can tell the kernel what processes connect to it over the unix
> socket, which again very well may be good enough.
>
> In the reports I got I found pulseaudio, this one may need to be
> patched in a similar manner.

Yeah, that seems like an easier problem. IMO something like a userspace
audio server (or its init script) should be in charge of setting the
server to RT priority.
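
As a minimal sketch of what "setting it to RT" could look like from inside
the daemon, assuming the portable POSIX sched_setscheduler() interface with
SCHED_FIFO is acceptable (FreeBSD also has its own rtprio(2) interface and
the rtprio(1) utility, which an init script could use instead). The helper
names `clamp_prio` and `request_realtime` are made up for this example, and
the call needs sufficient privilege to succeed.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>

/* Clamp prio into [lo, hi]. Hypothetical helper for this example. */
static int
clamp_prio(int prio, int lo, int hi)
{
	if (prio < lo)
		return (lo);
	if (prio > hi)
		return (hi);
	return (prio);
}

/*
 * Ask for SCHED_FIFO scheduling for the calling process.
 * Returns 0 on success, -1 on failure with errno set by the failing call
 * (typically EPERM when running unprivileged).
 */
static int
request_realtime(int prio)
{
	struct sched_param sp;
	int lo, hi;

	lo = sched_get_priority_min(SCHED_FIFO);
	hi = sched_get_priority_max(SCHED_FIFO);
	if (lo == -1 || hi == -1)
		return (-1);

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = clamp_prio(prio, lo, hi);

	/* pid 0 means "the calling process". */
	if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
		return (-1);
	return (0);
}
```

An audio server would call request_realtime() once at startup, before
dropping privileges, and carry on at normal priority if it fails.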

> --
> Mateusz Guzik <mjguzik gmail.com>


