Date: Thu, 6 Apr 2017 07:11:08 -0700 From: Bryan Cantrill <bryancantrill@gmail.com> To: illumos-dev <developer@lists.illumos.org> Cc: freebsd-dtrace@freebsd.org Subject: Re: [developer] dtrace: normalization of stddev Message-ID: <CAAm8y%2BiFYLUqrdOf7gXL_1JYYSarQofaxRArcA-V=%2BoKZi0JNA@mail.gmail.com> In-Reply-To: <97006cf8-369d-6649-4595-43178789feba@FreeBSD.org> References: <97006cf8-369d-6649-4595-43178789feba@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Nice! Let's turn this into a formalized test (see usr/src/cmd/dtrace/test/tst/common/aggs/tst.stddev.d for the one test that's there -- we should add this as usr/src/cmd/dtrace/test/tst/common/aggs/tst.normalizestddev.d or something) and then get the fix in. Thanks for debugging this, and sorry for whatever pain it caused! - Bryan On Thu, Apr 6, 2017 at 4:25 AM, Andriy Gapon <avg@freebsd.org> wrote: > > It seems that currently normalization of stddev aggregation is done > incorrectly. > We divide both the sum of values and the sum of their squares by the > normalization factor. But we should divide the sum of squares by the > normalization factor squared to scale the original values properly. > > --- lib/libdtrace/common/dt_consume.c > +++ lib/libdtrace/common/dt_consume.c > @@ -389,8 +389,10 @@ dt_stddev(uint64_t *data, uint64_t normal) > * The standard approximation for standard deviation is > * sqrt(average(x**2) - average(x)**2), i.e. the square root > * of the average of the squares minus the square of the average. > + * When normalizing, we should divide the sum of x**2 by normal**2. > */ > dt_divide_128(data + 2, normal, avg_of_squares); > + dt_divide_128(avg_of_squares, normal, avg_of_squares); > dt_divide_128(avg_of_squares, data[0], avg_of_squares); > > norm_avg = (int64_t)data[1] / (int64_t)normal / (int64_t)data[0]; > > > A primitive test script: > > BEGIN > { > i = 100; > @s = avg(i); > @v = stddev(i); > > i = 200; > @s = avg(i); > @v = stddev(i); > > i = 300; > @s = avg(i); > @v = stddev(i); > > i = 400; > @s = avg(i); > @v = stddev(i); > > i = 500; > @s = avg(i); > @v = stddev(i); > > i = 600; > @s = avg(i); > @v = stddev(i); > > i = 700; > @s = avg(i); > @v = stddev(i); > > i = 800; > @s = avg(i); > @v = stddev(i); > > i = 900; > @s = avg(i); > @v = stddev(i); > > printa("%@3d %@3d\n", @s, @v); > > normalize(@s, 10); > normalize(@v, 10); > printa("%@3d %@3d\n", @s, @v); > > exit(0); > > } > > Without the patch it produces: > 500 258 > 50 170 > > With the patch: > 500 258 > 50 25 > > > -- > Andriy Gapon > > > ------------------------------------------- > illumos-developer > Archives: https://www.listbox.com/member/archive/182179/=now > RSS Feed: https://www.listbox.com/member/archive/rss/182179/ > 21175001-8b3b9e0a > Modify Your Subscription: https://www.listbox.com/ > member/?member_id=21175001&id_secret=21175001-7a89b2a7 > Powered by Listbox: http://www.listbox.com >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAAm8y%2BiFYLUqrdOf7gXL_1JYYSarQofaxRArcA-V=%2BoKZi0JNA>