Date: Sat, 25 Mar 2023 11:23:04 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Peter <pmc@citylink.dinoex.sub.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: Periodic rant about SCHED_ULE
Message-ID: <76DAACBB-C865-4779-A340-D66C35D610B4@yahoo.com>
In-Reply-To: <5AF26266-5B4C-4A7F-8784-4C6308B6C5CA@yahoo.com>
References: <5AF26266-5B4C-4A7F-8784-4C6308B6C5CA@yahoo.com>
On Mar 25, 2023, at 11:14, Mark Millard <marklmi@yahoo.com> wrote:

> Peter <pmc_at_citylink.dinoex.sub.org> wrote on
> Date: Sat, 25 Mar 2023 15:47:42 UTC :
>
>> Quoting George Mitchell <george+freebsd@m5p.com>:
>>
>>>> https://forums.freebsd.org/threads/what-is-sysctl-kern-sched-preempt_thresh.85
>>>>
>>> Thank you! -- George
>>
>> You're welcome. Can I get a success/failure report?
>>
>>
>> ---------------------------------------------------------------------
>>>> On 3/22/23, Steve Kargl <sgk@troutmask.apl.washington.edu> wrote:
>>>>>
>>>>> I reported the issue with ULE some 15 to 20 years ago.
>>
>> Can I get the PR number, please?
>>
>>
>> ---------------------------------------------------------------------
>> Test usecase:
>> =============
>>
>> Create two compute tasks competing for the same -otherwise unused- core,
>> one without, one with syscalls:
>>
>> # cpuset -l 13 sh -c "while true; do :; done" &
>> # tar cvf - / | cpuset -l 13 gzip -9 > /dev/null
>>
>> Within a few seconds the two task are balanced, running at nearly the
>> same PRI and using each 50% of the core:
>>
>>   PID USERNAME  THR PRI NICE  SIZE    RES STATE    C   TIME    WCPU COMMAND
>>  5166 root        1  88    0   13M  3264K RUN     13   9:23  51.65% sh
>> 10675 root        1  87    0   13M  3740K CPU13   13   1:30  48.57% gzip
>>
>> This changes when the tar reaches /usr/include with it's many small
>> files. Now smaller blocks are delivered to gzip, it does more
>> syscalls, and things get ugly:
>>
>>   PID USERNAME  THR PRI NICE  SIZE    RES STATE    C   TIME    WCPU COMMAND
>>  5166 root        1  94    0   13M  3264K RUN     13  18:07  95.10% sh
>> 19028 root        1  81    0   13M  3740K CPU13   13   1:23   4.87% gzip
>
> Why did PID 10675 change to 19028?
>
>> This does not happen because tar would be slow in moving data to
>> gzip: tar reads from SSD, or more likely from ARC, and this is
>> always faster than gzip-9. The imbalance is made by the scheduler.
>
>
> When I tried that tar line, I get lots of output to stderr:
>
> # tar cvf - / | cpuset -l 13 gzip -9 > /dev/null
> tar: Removing leading '/' from member names
> a .
> a root
> a wrkdirs
> a bin
> a usr
> . . .
>
> Was that an intentional part of the test?
>
> To avoid this I used:
>
> # tar cvf - / 2>/dev/null | cpuset -l 13 gzip -9 2>&1 > /dev/null
>
> At which point I get the likes of:
>
> 17129 root          1  68    0  14192Ki   3628Ki RUN     13   0:20   3.95% gzip -9
> 17128 root          1  20    0  58300Ki  13880Ki pipdwt  18   0:00   0.27% tar cvf - / (bsdtar)
> 17097 root          1 133    0  13364Ki   3060Ki CPU13   13   8:05  95.93% sh -c while true; do :; done
>
> up front.
>
> For reference, I also see the likes of the following from
> "gstat -spod" (it is a root on ZFS context with PCIe Optane media):
>
> dT: 1.063s  w: 1.000s
>  L(q)  ops/s    r/s     kB   kBps   ms/r    w/s     kB   kBps   ms/w    d/s     kB   kBps   ms/d    o/s   ms/o  %busy Name
> . . .
>     0     68     68     14    937    0.0      0      0      0    0.0      0      0      0    0.0      0    0.0    0.1| nvd2
> . . .
>
>

I left it running and I'm now seeing:

17129 root          1 107    0  14192Ki   3628Ki CPU13   13   3:01  48.10% gzip -9
17128 root          1  21    0  58300Ki  15428Ki pipdwt  20   0:04   2.02% tar cvf - / (bsdtar)
17097 root          1 115    0  13364Ki   3060Ki RUN     13  16:30  51.77% sh -c while true; do :; done

Also examples of the likes of:

dT: 1.063s  w: 1.000s
 L(q)  ops/s    r/s     kB   kBps   ms/r    w/s     kB   kBps   ms/w    d/s     kB   kBps   ms/d    o/s   ms/o  %busy Name
. . .
    0   1213   1213      5   6456    0.0      0      0      0    0.0      0      0      0    0.0      0    0.0    1.2| nvd2
. . .

FYI: ThreadRipper 1950X context.

Looks like what I'll see is very dependent on when I look
at what it is doing: the details involved matter.

===
Mark Millard
marklmi at yahoo.com
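
Since the CPU split between sh and gzip depends on when one looks, a
single top snapshot can mislead. The following is a minimal sketch, not
something taken from the thread: a hypothetical helper, wcpu_range, that
reads top-style process lines on stdin and reports the minimum and
maximum WCPU seen per PID. It assumes the PID is the first field and
that WCPU is the first field consisting of a number followed by "%";
headers and gstat lines do not match and are skipped.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not from the thread):
# summarize the WCPU range per PID from top-style lines on stdin.
# Assumptions: field 1 is the PID; WCPU is the first field that
# looks like "<number>%".
wcpu_range() {
    awk '
    $1 ~ /^[0-9]+$/ {
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[0-9.]+%$/) {
                v = $i
                sub(/%$/, "", v)    # strip the trailing percent sign
                v += 0              # force numeric comparison
                if (!($1 in n) || v < min[$1]) min[$1] = v
                if (!($1 in n) || v > max[$1]) max[$1] = v
                n[$1]++
                break
            }
        }
    }
    END {
        for (p in n)
            printf "%s samples=%d min=%.2f%% max=%.2f%%\n", p, n[p], min[p], max[p]
    }' | sort -n
}
```

To use it, collect several batch-mode top snapshots over a few minutes
into a file (the exact top flags depend on the local top(1)) and run
"wcpu_range < that-file". A wide min/max spread for the gzip PID would
match the observation that the balance shifts as tar moves through
different parts of the tree.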
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?76DAACBB-C865-4779-A340-D66C35D610B4>