Date:      Fri, 6 May 2022 18:50:08 -0600
From:      Alan Somers <asomers@freebsd.org>
To:        Kubilay Kocak <koobs@freebsd.org>
Cc:        Alan Somers <asomers@freebsd.org>, src-committers <src-committers@freebsd.org>,  "<dev-commits-src-all@freebsd.org>" <dev-commits-src-all@freebsd.org>, dev-commits-src-main@freebsd.org
Subject:   Re: git: 1d2421ad8b6d - main - Correctly measure system load averages > 1024
Message-ID:  <CAOtMX2gr8HY6mK%2BU1QPV21zpthz5WFSgkuv_c3-scgm80iY8CA@mail.gmail.com>
In-Reply-To: <771111e0-5c1b-8eb3-751d-c5f2b8bc36eb@FreeBSD.org>
References:  <202205070004.24704iIx031164@gitrepo.freebsd.org> <771111e0-5c1b-8eb3-751d-c5f2b8bc36eb@FreeBSD.org>


Yes, it can be MFCd.  The only risk I'm aware of is that the 4.4BSD
scheduler might start acting weird once the load average gets close to
one million.

On Fri, May 6, 2022, 6:06 PM Kubilay Kocak <koobs@freebsd.org> wrote:

> On 7/05/2022 10:04 am, Alan Somers wrote:
> > The branch main has been updated by asomers:
> >
> > URL: https://cgit.FreeBSD.org/src/commit/?id=1d2421ad8b6d508ef155752bdfc5948f7373bac3
> >
> > commit 1d2421ad8b6d508ef155752bdfc5948f7373bac3
> > Author:     Alan Somers <asomers@FreeBSD.org>
> > AuthorDate: 2022-05-05 21:35:23 +0000
> > Commit:     Alan Somers <asomers@FreeBSD.org>
> > CommitDate: 2022-05-06 23:25:43 +0000
> >
> >      Correctly measure system load averages > 1024
> >
> >      The old fixed-point arithmetic used for calculating load averages had an
> >      overflow at 1024.  So on systems with extremely high load, the observed
> >      load average would actually fall back to 0 and shoot up again, creating
> >      a kind of sawtooth graph.
> >
> >      Fix this by using 64-bit math internally, while still reporting the load
> >      average to userspace as a 32-bit number.
> >
> >      Sponsored by:   Axcient
> >      Reviewed by:    imp
> >      Differential Revision: https://reviews.freebsd.org/D35134
>
> Can MFC?
>
> > ---
> >   sys/kern/kern_synch.c | 9 +++++----
> >   sys/kern/tty_info.c   | 2 +-
> >   sys/sys/param.h       | 8 ++++----
> >   3 files changed, 10 insertions(+), 9 deletions(-)
> >
> > diff --git a/sys/kern/kern_synch.c b/sys/kern/kern_synch.c
> > index e78878987b57..381d6315044c 100644
> > --- a/sys/kern/kern_synch.c
> > +++ b/sys/kern/kern_synch.c
> > @@ -87,7 +87,7 @@ struct loadavg averunnable =
> >    * Constants for averages over 1, 5, and 15 minutes
> >    * when sampling at 5 second intervals.
> >    */
> > -static fixpt_t cexp[3] = {
> > +static uint64_t cexp[3] = {
> >       0.9200444146293232 * FSCALE,    /* exp(-1/12) */
> >       0.9834714538216174 * FSCALE,    /* exp(-1/60) */
> >       0.9944598480048967 * FSCALE,    /* exp(-1/180) */
> > @@ -611,14 +611,15 @@ setrunnable(struct thread *td, int srqflags)
> >   static void
> >   loadav(void *arg)
> >   {
> > -     int i, nrun;
> > +     int i;
> > +     uint64_t nrun;
> >       struct loadavg *avg;
> >
> > -     nrun = sched_load();
> > +     nrun = (uint64_t)sched_load();
> >       avg = &averunnable;
> >
> >       for (i = 0; i < 3; i++)
> > -             avg->ldavg[i] = (cexp[i] * avg->ldavg[i] +
> > +             avg->ldavg[i] = (cexp[i] * (uint64_t)avg->ldavg[i] +
> >                   nrun * FSCALE * (FSCALE - cexp[i])) >> FSHIFT;
> >
> >       /*
> > diff --git a/sys/kern/tty_info.c b/sys/kern/tty_info.c
> > index 60675557e4ed..237aa47a18da 100644
> > --- a/sys/kern/tty_info.c
> > +++ b/sys/kern/tty_info.c
> > @@ -302,7 +302,7 @@ tty_info(struct tty *tp)
> >       sbuf_set_drain(&sb, sbuf_tty_drain, tp);
> >
> >       /* Print load average. */
> > -     load = (averunnable.ldavg[0] * 100 + FSCALE / 2) >> FSHIFT;
> > +     load = ((int64_t)averunnable.ldavg[0] * 100 + FSCALE / 2) >> FSHIFT;
> >       sbuf_printf(&sb, "%sload: %d.%02d ", tp->t_column == 0 ? "" : "\n",
> >           load / 100, load % 100);
> >
> > diff --git a/sys/sys/param.h b/sys/sys/param.h
> > index 2d463b9ac7a2..b0b53f1a7776 100644
> > --- a/sys/sys/param.h
> > +++ b/sys/sys/param.h
> > @@ -361,12 +361,12 @@ __END_DECLS
> >    * Scale factor for scaled integers used to count %cpu time and load avgs.
> >    *
> >    * The number of CPU `tick's that map to a unique `%age' can be expressed
> > - * by the formula (1 / (2 ^ (FSHIFT - 11))).  The maximum load average that
> > - * can be calculated (assuming 32 bits) can be closely approximated using
> > - * the formula (2 ^ (2 * (16 - FSHIFT))) for (FSHIFT < 15).
> > + * by the formula (1 / (2 ^ (FSHIFT - 11))).  Since the intermediate
> > + * calculation is done with 64-bit precision, the maximum load average that can
> > + * be calculated is approximately 2^32 / FSCALE.
> >    *
> >    * For the scheduler to maintain a 1:1 mapping of CPU `tick' to `%age',
> > - * FSHIFT must be at least 11; this gives us a maximum load avg of ~1024.
> > + * FSHIFT must be at least 11.  This gives a maximum load avg of 2 million.
> >    */
> >   #define     FSHIFT  11              /* bits to right of fixed binary point */
> >   #define FSCALE      (1<<FSHIFT)
> >
>
>
>
