Date: Mon, 19 Jul 2021 01:37:43 +0100
From: RW via freebsd-hackers <freebsd-hackers@freebsd.org>
To: freebsd-hackers@freebsd.org
Subject: Re: Periodic rant about SCHED_ULE
Message-ID: <20210719013743.0590b1f2@gumby.homeunix.com>
In-Reply-To: <8239e474-fc36-b8aa-93b7-39197534cd30@heuristicsystems.com.au>
References: <13445948-7804-20b4-4ae6-aaac14d11e87@m5p.com> <20210708101907.0be3a3c2@rimwks.local> <20210714164745.0128ea15@gumby.homeunix.com> <b24a2124-fd4a-0ae6-944e-4d39d590794c@heuristicsystems.com.au> <8239e474-fc36-b8aa-93b7-39197534cd30@heuristicsystems.com.au>
On Thu, 15 Jul 2021 11:03:04 +1000
Dewayne Geraghty wrote:

> On 15/07/2021 1:47 am, RW via freebsd-hackers wrote:
> > kern.sched.preempt_thresh=224
> > ...
> > I think the default only allows preemption by real-time and kernel
> > threads.
> >
> Hi RW,  Note the PRI(ority) column when you perform /usr/bin/top.
> Processes with a PRI below the default kern.sched.preempt_thresh=80
> (ie nice -n 8) may pre-empt other processes or send interprocessor
> interrupts to others (CPUs).

I haven't got time to look into this in detail, but from a cursory
examination it looks like there must be some kind of translation
between the PRI values seen in top and the priorities used in the
scheduler.

A threshold of 80 would be sensible in the context of top. I just ran
a test and a CPU-bound process got a PRI of 85. But this seems to be a
pure coincidence.

kern.sched.preempt_thresh is compared with the scheduler's internal
priorities and defaults to PRI_MIN_KERN, the highest priority in the
kernel range, one level below realtime.

From sys/sys/priority.h

#define PRI_MIN_REALTIME        (48)
#define PRI_MAX_REALTIME        (PRI_MIN_KERN - 1)

#define PRI_MIN_KERN            (80)
#define PRI_MAX_KERN            (PRI_MIN_TIMESHARE - 1)

#define PRI_MIN_TIMESHARE       (120)
#define PRI_MAX_TIMESHARE       (PRI_MIN_IDLE - 1)

#define PRI_MIN_IDLE            (224)

> idprio 0 top
> is assigned a starting PRI of 124; so on SCHED_ULE, these processes
> will receive cpu time (even at idprio 31) but won't pre-empt others.
>
> If you really want all processes to pre-empt others, enabling
> FULL_PREEMPTION achieves the same goal as 224.  I don't have a use
> case for no pre-emption. Anyone?
>
> Why kern.sched.preempt_thresh=224 helps desktop users, I can only
> speculate that with a high threshold, more IPIs are sent to other CPU
> cores so they can be busy (?).  Refer to
> /usr/src/sys/kern/sched_ule.c --
> Returning to the topic.  It's a very hard choice between schedulers.  I
> did a lot of testing between them and tuning to see if one excelled on
> my humble Xeon-E3.  I couldn't see a significant difference between
> workloads - though next time (and a hint for others) I'll disable SMT
> and set dev.cpu.0.freq to disable turbo behaviour.  For now,
> sched_4bsd appears to be more efficient in terms of code complexity
> and people with high CPU workloads have preferred sched_4bsd in the
> past, while sched_ule has a lot of things to tweak and is recommended
> by the FreeBSD project. Otherwise it wouldn't be the default.
>
> Looking at
> https://github.com/freebsd/freebsd-src/tree/main/sys/kern/sched_*.c
> their histories are tweaked a couple of times a year, so I wouldn't
> rule sched_4bsd out of contention just yet.
>
> FWIW, my servers modify only:
> kern.sched.affinity=7
> kern.sched.interact=0
> kern.sched.slice=128
> while firewalls:
> kern.sched.balance=0
> kern.sched.interact=0
>
> A loadable scheduler has been discussed here a few times - I vaguely
> recall it being inefficient (complexity) and unnecessary (you'll
> determine one scheduler and unless testing, unlikely to change).
> Further in the past, sched_4bsd was to be removed, but some
> demonstrated it had better performance for their workload.
> Cheerio.
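
To make the threshold comparison described above concrete, here is a
minimal sketch in C. It reuses the values quoted from sys/sys/priority.h,
where a lower number means a higher priority. The function name
sketch_should_preempt and the exact order of the checks are assumptions
made for illustration, loosely based on a reading of
sys/kern/sched_ule.c; it is not the kernel's code, and it leaves out the
handling of remote CPUs and interactive threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Values quoted in this message from sys/sys/priority.h.
 * Lower number = higher priority. */
#define PRI_MIN_REALTIME   (48)
#define PRI_MIN_KERN       (80)
#define PRI_MIN_TIMESHARE  (120)
#define PRI_MIN_IDLE       (224)

/*
 * Hypothetical helper (not a kernel function): decide whether a newly
 * runnable thread at internal priority `pri` should preempt a thread
 * running at `cpri`, given a kern.sched.preempt_thresh value `thresh`.
 * Assumption: all three are scheduler-internal priorities, not the PRI
 * values displayed by top.
 */
bool
sketch_should_preempt(int pri, int cpri, int thresh)
{
	assert(pri >= 0 && cpri >= 0 && thresh >= 0);

	/* The new thread is not better than the running one: nothing to do. */
	if (pri >= cpri)
		return (false);
	/* Assumed: a running idle-class thread is always preempted. */
	if (cpri >= PRI_MIN_IDLE)
		return (true);
	/* Assumed: a threshold of 0 means no preemption of other threads. */
	if (thresh == 0)
		return (false);
	/* Preempt only if the new thread is at or above the threshold. */
	return (pri <= thresh);
}
```

Under these assumptions, with the default threshold of PRI_MIN_KERN (80)
a timeshare thread at 130 would not preempt another timeshare thread at
150, while a realtime thread at 48 would. Raising the threshold to 224
lets the thread at 130 preempt as well, which is consistent with the
behaviour discussed in this thread.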