Date:      Sat, 25 Mar 2023 11:14:11 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Peter <pmc@citylink.dinoex.sub.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: Periodic rant about SCHED_ULE
Message-ID:  <5AF26266-5B4C-4A7F-8784-4C6308B6C5CA@yahoo.com>
References:  <5AF26266-5B4C-4A7F-8784-4C6308B6C5CA.ref@yahoo.com>

Peter <pmc_at_citylink.dinoex.sub.org> wrote on
Date: Sat, 25 Mar 2023 15:47:42 UTC :

> Quoting George Mitchell <george+freebsd@m5p.com>:
>
> >> https://forums.freebsd.org/threads/what-is-sysctl-kern-sched-preempt_thresh.85
> >>
> >Thank you! -- George
>
> You're welcome. Can I get a success/failure report?
>
>
> ---------------------------------------------------------------------
> >> On 3/22/23, Steve Kargl <sgk@troutmask.apl.washington.edu> wrote:
> >>>
> >>> I reported the issue with ULE some 15 to 20 years ago.
>
> Can I get the PR number, please?
>
>
> ---------------------------------------------------------------------
> Test usecase:
> =============
>
> Create two compute tasks competing for the same -otherwise unused- core,
> one without, one with syscalls:
>
> # cpuset -l 13 sh -c "while true; do :; done" &
> # tar cvf - / | cpuset -l 13 gzip -9 > /dev/null
>
> Within a few seconds the two tasks are balanced, running at nearly the
> same PRI and each using 50% of the core:
>
> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
> 5166 root 1 88 0 13M 3264K RUN 13 9:23 51.65% sh
> 10675 root 1 87 0 13M 3740K CPU13 13 1:30 48.57% gzip
>
> This changes when the tar reaches /usr/include with its many small
> files. Now smaller blocks are delivered to gzip, so it does more
> syscalls, and things get ugly:
>
> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
> 5166 root 1 94 0 13M 3264K RUN 13 18:07 95.10% sh
> 19028 root 1 81 0 13M 3740K CPU13 13 1:23 4.87% gzip

Why did PID 10675 change to 19028?

> This does not happen because tar would be slow in moving data to=20
> gzip: tar reads from SSD, or more likely from ARC, and this is=20
> always faster than gzip-9. The imbalance is made by the scheduler.


When I try that tar line, I get lots of output to stderr:

# tar cvf - / | cpuset -l 13 gzip -9 > /dev/null
tar: Removing leading '/' from member names
a .
a root
a wrkdirs
a bin
a usr
. . .

Was that an intentional part of the test?

To avoid this I used:

# tar cvf - / 2>/dev/null | cpuset -l 13 gzip -9 2>&1 > /dev/null
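
(A side note on that pipeline, in case anyone copies it: with
"2>&1 > /dev/null" gzip's stderr still ends up on the terminal,
because the duplication happens before stdout is redirected. If the
goal were to silence gzip completely, a variant like

# tar cvf - / 2>/dev/null | cpuset -l 13 gzip -9 > /dev/null 2>&1

would do that. gzip emits essentially nothing on stderr here, so in
practice it makes no difference to the test.)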

At which point I get the likes of:

17129 root          1  68    0  14192Ki    3628Ki RUN     13   0:20    3.95% gzip -9
17128 root          1  20    0  58300Ki   13880Ki pipdwt  18   0:00    0.27% tar cvf - / (bsdtar)
17097 root          1 133    0  13364Ki    3060Ki CPU13   13   8:05   95.93% sh -c while true; do :; done

up front.
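
To keep watching how the priorities drift after that, a simple loop
over ps works (the PIDs are the ones from the snapshot above and will
differ on any other run):

# while true; do ps -o pid,pri,%cpu,time,command -p 17097,17129; sleep 5; done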

For reference, I also see the likes of the following from
"gstat -spod" (it is a root on ZFS context with PCIe Optane media):

dT: 1.063s  w: 1.000s
 L(q)  ops/s    r/s     kB   kBps   ms/r    w/s     kB   kBps   ms/w    d/s     kB   kBps   ms/d    o/s   ms/o   %busy Name
. . .
    0     68     68     14    937    0.0      0      0      0    0.0      0      0      0    0.0      0    0.0    0.1| nvd2
. . .



===
Mark Millard
marklmi at yahoo.com



