Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 6 Apr 2017 07:11:08 -0700
From:      Bryan Cantrill <bryancantrill@gmail.com>
To:        illumos-dev <developer@lists.illumos.org>
Cc:        freebsd-dtrace@freebsd.org
Subject:   Re: [developer] dtrace: normalization of stddev
Message-ID:  <CAAm8y%2BiFYLUqrdOf7gXL_1JYYSarQofaxRArcA-V=%2BoKZi0JNA@mail.gmail.com>
In-Reply-To: <97006cf8-369d-6649-4595-43178789feba@FreeBSD.org>
References:  <97006cf8-369d-6649-4595-43178789feba@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
Nice!  Let's turn this into a formalized test (see
usr/src/cmd/dtrace/test/tst/common/aggs/tst.stddev.d for the one test
that's there -- we should add this as
usr/src/cmd/dtrace/test/tst/common/aggs/tst.normalizestddev.d or something)
and then get the fix in.  Thanks for debugging this, and sorry for whatever
pain it caused!

         - Bryan


On Thu, Apr 6, 2017 at 4:25 AM, Andriy Gapon <avg@freebsd.org> wrote:

>
> It seems that currently normalization of stddev aggregation is done
> incorrectly.
> We divide both the sum of values and the sum of their squares by the
> normalization factor.  But we should divide the sum of squares by the
> normalization factor squared to scale the original values properly.
>
> --- lib/libdtrace/common/dt_consume.c
> +++ lib/libdtrace/common/dt_consume.c
> @@ -389,8 +389,10 @@ dt_stddev(uint64_t *data, uint64_t normal)
>          * The standard approximation for standard deviation is
>          * sqrt(average(x**2) - average(x)**2), i.e. the square root
>          * of the average of the squares minus the square of the average.
> +        * When normalizing, we should divide the sum of x**2 by normal**2.
>          */
>         dt_divide_128(data + 2, normal, avg_of_squares);
> +       dt_divide_128(avg_of_squares, normal, avg_of_squares);
>         dt_divide_128(avg_of_squares, data[0], avg_of_squares);
>
>         norm_avg = (int64_t)data[1] / (int64_t)normal / (int64_t)data[0];
>
>
> A primitive test script:
>
> BEGIN
> {
>     i = 100;
>     @s = avg(i);
>     @v = stddev(i);
>
>     i = 200;
>     @s = avg(i);
>     @v = stddev(i);
>
>     i = 300;
>     @s = avg(i);
>     @v = stddev(i);
>
>     i = 400;
>     @s = avg(i);
>     @v = stddev(i);
>
>     i = 500;
>     @s = avg(i);
>     @v = stddev(i);
>
>     i = 600;
>     @s = avg(i);
>     @v = stddev(i);
>
>     i = 700;
>     @s = avg(i);
>     @v = stddev(i);
>
>     i = 800;
>     @s = avg(i);
>     @v = stddev(i);
>
>     i = 900;
>     @s = avg(i);
>     @v = stddev(i);
>
>     printa("%@3d %@3d\n", @s, @v);
>
>     normalize(@s, 10);
>     normalize(@v, 10);
>     printa("%@3d %@3d\n", @s, @v);
>
>     exit(0);
>
> }
>
> Without the patch it produces:
> 500 258
>  50 170
>
> With the patch:
> 500 258
>  50  25
>
>
> --
> Andriy Gapon
>
>
> -------------------------------------------
> illumos-developer
> Archives: https://www.listbox.com/member/archive/182179/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/182179/
> 21175001-8b3b9e0a
> Modify Your Subscription: https://www.listbox.com/
> member/?member_id=21175001&id_secret=21175001-7a89b2a7
> Powered by Listbox: http://www.listbox.com
>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAAm8y%2BiFYLUqrdOf7gXL_1JYYSarQofaxRArcA-V=%2BoKZi0JNA>