Date: Fri, 6 May 2022 18:50:08 -0600
From: Alan Somers <asomers@freebsd.org>
To: Kubilay Kocak <koobs@freebsd.org>
Cc: Alan Somers <asomers@freebsd.org>, src-committers <src-committers@freebsd.org>, "<dev-commits-src-all@freebsd.org>" <dev-commits-src-all@freebsd.org>, dev-commits-src-main@freebsd.org
Subject: Re: git: 1d2421ad8b6d - main - Correctly measure system load averages > 1024
Message-ID: <CAOtMX2gr8HY6mK%2BU1QPV21zpthz5WFSgkuv_c3-scgm80iY8CA@mail.gmail.com>
In-Reply-To: <771111e0-5c1b-8eb3-751d-c5f2b8bc36eb@FreeBSD.org>
References: <202205070004.24704iIx031164@gitrepo.freebsd.org> <771111e0-5c1b-8eb3-751d-c5f2b8bc36eb@FreeBSD.org>

Yes, it can be MFCd.  The only risk I'm aware of is that the 4.4BSD
scheduler might start acting weird once the load average gets close to
one million.

On Fri, May 6, 2022, 6:06 PM Kubilay Kocak <koobs@freebsd.org> wrote:

> On 7/05/2022 10:04 am, Alan Somers wrote:
> > The branch main has been updated by asomers:
> >
> > URL: https://cgit.FreeBSD.org/src/commit/?id=1d2421ad8b6d508ef155752bdfc5948f7373bac3
> >
> > commit 1d2421ad8b6d508ef155752bdfc5948f7373bac3
> > Author:     Alan Somers <asomers@FreeBSD.org>
> > AuthorDate: 2022-05-05 21:35:23 +0000
> > Commit:     Alan Somers <asomers@FreeBSD.org>
> > CommitDate: 2022-05-06 23:25:43 +0000
> >
> >     Correctly measure system load averages > 1024
> >
> >     The old fixed-point arithmetic used for calculating load averages had an
> >     overflow at 1024.  So on systems with extremely high load, the observed
> >     load average would actually fall back to 0 and shoot up again, creating
> >     a kind of sawtooth graph.
> >
> >     Fix this by using 64-bit math internally, while still reporting the load
> >     average to userspace as a 32-bit number.
> >
> >     Sponsored by:   Axcient
> >     Reviewed by:    imp
> >     Differential Revision: https://reviews.freebsd.org/D35134
>
> Can MFC?
>
> > ---
> >  sys/kern/kern_synch.c | 9 +++++----
> >  sys/kern/tty_info.c   | 2 +-
> >  sys/sys/param.h       | 8 ++++----
> >  3 files changed, 10 insertions(+), 9 deletions(-)
> >
> > diff --git a/sys/kern/kern_synch.c b/sys/kern/kern_synch.c
> > index e78878987b57..381d6315044c 100644
> > --- a/sys/kern/kern_synch.c
> > +++ b/sys/kern/kern_synch.c
> > @@ -87,7 +87,7 @@ struct loadavg averunnable =
> >   * Constants for averages over 1, 5, and 15 minutes
> >   * when sampling at 5 second intervals.
> >   */
> > -static fixpt_t cexp[3] = {
> > +static uint64_t cexp[3] = {
> >  	0.9200444146293232 * FSCALE,	/* exp(-1/12) */
> >  	0.9834714538216174 * FSCALE,	/* exp(-1/60) */
> >  	0.9944598480048967 * FSCALE,	/* exp(-1/180) */
> > @@ -611,14 +611,15 @@ setrunnable(struct thread *td, int srqflags)
> >  static void
> >  loadav(void *arg)
> >  {
> > -	int i, nrun;
> > +	int i;
> > +	uint64_t nrun;
> >  	struct loadavg *avg;
> >
> > -	nrun = sched_load();
> > +	nrun = (uint64_t)sched_load();
> >  	avg = &averunnable;
> >
> >  	for (i = 0; i < 3; i++)
> > -		avg->ldavg[i] = (cexp[i] * avg->ldavg[i] +
> > +		avg->ldavg[i] = (cexp[i] * (uint64_t)avg->ldavg[i] +
> >  		    nrun * FSCALE * (FSCALE - cexp[i])) >> FSHIFT;
> >
> >  	/*
> > diff --git a/sys/kern/tty_info.c b/sys/kern/tty_info.c
> > index 60675557e4ed..237aa47a18da 100644
> > --- a/sys/kern/tty_info.c
> > +++ b/sys/kern/tty_info.c
> > @@ -302,7 +302,7 @@ tty_info(struct tty *tp)
> >  	sbuf_set_drain(&sb, sbuf_tty_drain, tp);
> >
> >  	/* Print load average. */
> > -	load = (averunnable.ldavg[0] * 100 + FSCALE / 2) >> FSHIFT;
> > +	load = ((int64_t)averunnable.ldavg[0] * 100 + FSCALE / 2) >> FSHIFT;
> >  	sbuf_printf(&sb, "%sload: %d.%02d ", tp->t_column == 0 ? "" : "\n",
> >  	    load / 100, load % 100);
> >
> > diff --git a/sys/sys/param.h b/sys/sys/param.h
> > index 2d463b9ac7a2..b0b53f1a7776 100644
> > --- a/sys/sys/param.h
> > +++ b/sys/sys/param.h
> > @@ -361,12 +361,12 @@ __END_DECLS
> >   * Scale factor for scaled integers used to count %cpu time and load avgs.
> >   *
> >   * The number of CPU `tick's that map to a unique `%age' can be expressed
> > - * by the formula (1 / (2 ^ (FSHIFT - 11))).  The maximum load average that
> > - * can be calculated (assuming 32 bits) can be closely approximated using
> > - * the formula (2 ^ (2 * (16 - FSHIFT))) for (FSHIFT < 15).
> > + * by the formula (1 / (2 ^ (FSHIFT - 11))).  Since the intermediate
> > + * calculation is done with 64-bit precision, the maximum load average that can
> > + * be calculated is approximately 2^32 / FSCALE.
> >   *
> >   * For the scheduler to maintain a 1:1 mapping of CPU `tick' to `%age',
> > - * FSHIFT must be at least 11; this gives us a maximum load avg of ~1024.
> > + * FSHIFT must be at least 11.  This gives a maximum load avg of 2 million.
> >   */
> >  #define	FSHIFT	11		/* bits to right of fixed binary point */
> >  #define FSCALE	(1<<FSHIFT)
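The overflow described in the commit message can be reproduced outside the kernel. The following is a minimal user-space sketch in C, not the kernel code itself. FSHIFT, FSCALE and the exp(-1/12) constant come from the quoted diff; the names cexp_1min, step32, step64 and simulate are hypothetical and exist only for this illustration. It assumes a 32-bit unsigned int and a constant run-queue length, and it models only the 1-minute average.

```c
#include <stdint.h>

#define FSHIFT	11			/* bits to the right of the binary point */
#define FSCALE	(1 << FSHIFT)

/* exp(-1/12) scaled by FSCALE: the 1-minute decay constant from cexp[]. */
static const uint32_t cexp_1min = (uint32_t)(0.9200444146293232 * FSCALE);

/* One 5-second update with 32-bit intermediates (the pre-fix arithmetic). */
static uint32_t
step32(uint32_t ldavg, uint32_t nrun)
{
	/* These products are computed modulo 2^32, so they wrap at high load. */
	return ((cexp_1min * ldavg +
	    nrun * FSCALE * (FSCALE - cexp_1min)) >> FSHIFT);
}

/* The same update with 64-bit intermediates, mirroring the quoted fix. */
static uint32_t
step64(uint32_t ldavg, uint32_t nrun)
{
	uint64_t c = cexp_1min;

	/* The result is narrowed back to 32 bits, as reported to userspace. */
	return ((uint32_t)((c * (uint64_t)ldavg +
	    (uint64_t)nrun * FSCALE * (FSCALE - c)) >> FSHIFT));
}

/* Hypothetical driver: feed a constant run-queue length for `steps` samples. */
static uint32_t
simulate(int use64, uint32_t nrun, int steps)
{
	uint32_t ldavg = 0;

	for (int i = 0; i < steps; i++)
		ldavg = use64 ? step64(ldavg, nrun) : step32(ldavg, nrun);
	return (ldavg);
}
```

With a constant 2000 runnable threads, simulate(0, 2000, 200) can never report a load of 1024 or more, because a 32-bit intermediate shifted right by FSHIFT is always below 1024 * FSCALE. simulate(1, 2000, 200) settles just under 2000 * FSCALE. At low load, where nothing wraps, the two variants produce identical values.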