From: Warner Losh <wlosh@bsdimp.com>
Date: Fri, 6 May 2022 18:57:33 -0600
Subject: Re: git: 1d2421ad8b6d - main - Correctly measure system load averages > 1024
To: Alan Somers
Cc: Kubilay Kocak, src-committers, dev-commits-src-main@freebsd.org
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main

I'd expect it to die of lock contention overload well before a load
average of 1,000,000, so I think we're safe.

Warner

On Fri, May 6, 2022 at 6:50 PM Alan Somers wrote:

> Yes, it can be MFCd.  The only risk I'm aware of is that the 4.4BSD
> scheduler might start acting weird once the load average gets close to
> one million.
>
> On Fri, May 6, 2022, 6:06 PM Kubilay Kocak wrote:
>
>> On 7/05/2022 10:04 am, Alan Somers wrote:
>> > The branch main has been updated by asomers:
>> >
>> > URL: https://cgit.FreeBSD.org/src/commit/?id=1d2421ad8b6d508ef155752bdfc5948f7373bac3
>> >
>> > commit 1d2421ad8b6d508ef155752bdfc5948f7373bac3
>> > Author:     Alan Somers <asomers@FreeBSD.org>
>> > AuthorDate: 2022-05-05 21:35:23 +0000
>> > Commit:     Alan Somers <asomers@FreeBSD.org>
>> > CommitDate: 2022-05-06 23:25:43 +0000
>> >
>> >     Correctly measure system load averages > 1024
>> >
>> >     The old fixed-point arithmetic used for calculating load averages had an
>> >     overflow at 1024.  So on systems with extremely high load, the observed
>> >     load average would actually fall back to 0 and shoot up again, creating
>> >     a kind of sawtooth graph.
>> >
>> >     Fix this by using 64-bit math internally, while still reporting the load
>> >     average to userspace as a 32-bit number.
>> >
>> >     Sponsored by:   Axcient
>> >     Reviewed by:    imp
>> >     Differential Revision: https://reviews.freebsd.org/D35134
>>
>> Can MFC?
>>
>> > ---
>> >  sys/kern/kern_synch.c | 9 +++++----
>> >  sys/kern/tty_info.c   | 2 +-
>> >  sys/sys/param.h       | 8 ++++----
>> >  3 files changed, 10 insertions(+), 9 deletions(-)
>> >
>> > diff --git a/sys/kern/kern_synch.c b/sys/kern/kern_synch.c
>> > index e78878987b57..381d6315044c 100644
>> > --- a/sys/kern/kern_synch.c
>> > +++ b/sys/kern/kern_synch.c
>> > @@ -87,7 +87,7 @@ struct loadavg averunnable =
>> >   * Constants for averages over 1, 5, and 15 minutes
>> >   * when sampling at 5 second intervals.
>> >   */
>> > -static fixpt_t cexp[3] = {
>> > +static uint64_t cexp[3] = {
>> >  	0.9200444146293232 * FSCALE,	/* exp(-1/12) */
>> >  	0.9834714538216174 * FSCALE,	/* exp(-1/60) */
>> >  	0.9944598480048967 * FSCALE,	/* exp(-1/180) */
>> > @@ -611,14 +611,15 @@ setrunnable(struct thread *td, int srqflags)
>> >  static void
>> >  loadav(void *arg)
>> >  {
>> > -	int i, nrun;
>> > +	int i;
>> > +	uint64_t nrun;
>> >  	struct loadavg *avg;
>> >
>> > -	nrun = sched_load();
>> > +	nrun = (uint64_t)sched_load();
>> >  	avg = &averunnable;
>> >
>> >  	for (i = 0; i < 3; i++)
>> > -		avg->ldavg[i] = (cexp[i] * avg->ldavg[i] +
>> > +		avg->ldavg[i] = (cexp[i] * (uint64_t)avg->ldavg[i] +
>> >  		    nrun * FSCALE * (FSCALE - cexp[i])) >> FSHIFT;
>> >
>> >  	/*
>> > diff --git a/sys/kern/tty_info.c b/sys/kern/tty_info.c
>> > index 60675557e4ed..237aa47a18da 100644
>> > --- a/sys/kern/tty_info.c
>> > +++ b/sys/kern/tty_info.c
>> > @@ -302,7 +302,7 @@ tty_info(struct tty *tp)
>> >  	sbuf_set_drain(&sb, sbuf_tty_drain, tp);
>> >
>> >  	/* Print load average. */
>> > -	load = (averunnable.ldavg[0] * 100 + FSCALE / 2) >> FSHIFT;
>> > +	load = ((int64_t)averunnable.ldavg[0] * 100 + FSCALE / 2) >> FSHIFT;
>> >  	sbuf_printf(&sb, "%sload: %d.%02d ", tp->t_column == 0 ? "" : "\n",
>> >  	    load / 100, load % 100);
>> >
>> > diff --git a/sys/sys/param.h b/sys/sys/param.h
>> > index 2d463b9ac7a2..b0b53f1a7776 100644
>> > --- a/sys/sys/param.h
>> > +++ b/sys/sys/param.h
>> > @@ -361,12 +361,12 @@ __END_DECLS
>> >   * Scale factor for scaled integers used to count %cpu time and load avgs.
>> >   *
>> >   * The number of CPU `tick's that map to a unique `%age' can be expressed
>> > - * by the formula (1 / (2 ^ (FSHIFT - 11))).  The maximum load average that
>> > - * can be calculated (assuming 32 bits) can be closely approximated using
>> > - * the formula (2 ^ (2 * (16 - FSHIFT))) for (FSHIFT < 15).
>> > + * by the formula (1 / (2 ^ (FSHIFT - 11))).  Since the intermediate
>> > + * calculation is done with 64-bit precision, the maximum load average that can
>> > + * be calculated is approximately 2^32 / FSCALE.
>> >   *
>> >   * For the scheduler to maintain a 1:1 mapping of CPU `tick' to `%age',
>> > - * FSHIFT must be at least 11; this gives us a maximum load avg of ~1024.
>> > + * FSHIFT must be at least 11.  This gives a maximum load avg of 2 million.
>> >   */
>> >  #define	FSHIFT	11		/* bits to right of fixed binary point */
>> >  #define	FSCALE	(1<<FSHIFT)
>> >