Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 2 Apr 2021 02:46:42 GMT
From:      Lawrence Stewart <lstewart@FreeBSD.org>
To:        src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
Subject:   git: 9141b58b50a8 - stable/13 - stats(3): Improve t-digest merging of samples which result in mu adjustment underflow.
Message-ID:  <202104020246.1322kgH5017795@gitrepo.freebsd.org>

next in thread | raw e-mail | index | archive | help
The branch stable/13 has been updated by lstewart:

URL: https://cgit.FreeBSD.org/src/commit/?id=9141b58b50a86a94667a79f221dcba59240ce3f4

commit 9141b58b50a86a94667a79f221dcba59240ce3f4
Author:     Lawrence Stewart <lstewart@FreeBSD.org>
AuthorDate: 2021-04-02 01:29:29 +0000
Commit:     Lawrence Stewart <lstewart@FreeBSD.org>
CommitDate: 2021-04-02 02:29:55 +0000

    stats(3): Improve t-digest merging of samples which result in mu adjustment underflow.
    
    Allow the calculation of the mu adjustment factor to underflow instead of
    rejecting the VOI sample from the digest and logging an error. This trades off
    some (currently unquantified) additional centroid error in exchange for better
    fidelity of the distribution's density, which is the right trade off at the
    moment until follow up work to better handle and track accumulated error can be
    undertaken.
    
    Obtained from:  Netflix
    MFC after:      immediately
    
    (cherry picked from commit 1eb402e47af35b3980e6bd51ec462de3a3faa2c8)
---
 sys/kern/subr_stats.c | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/sys/kern/subr_stats.c b/sys/kern/subr_stats.c
index 9dd874fcbcf8..946999263898 100644
--- a/sys/kern/subr_stats.c
+++ b/sys/kern/subr_stats.c
@@ -3255,9 +3255,41 @@ stats_v1_vsd_tdgst_add(enum vsd_dtype vs_dtype, struct voistatdata_tdgst *tdgst,
 		if (is32bit) {
 			ctd32 = (struct voistatdata_tdgstctd32 *)closest;
 			error = Q_QSUBQ(&x, ctd32->mu);
+			/*
+			 * The following calculation "x / (cnt + weight)"
+			 * computes the amount by which to adjust the centroid's
+			 * mu value in order to merge in the VOI sample.
+			 *
+			 * It can underflow (Q_QDIVI() returns ERANGE) when the
+			 * user centroids' fractional precision (which is
+			 * inherited by 'x') is too low to represent the result.
+			 *
+			 * A sophisticated approach to dealing with this issue
+			 * would minimise accumulation of error by tracking
+			 * underflow per centroid and making an adjustment when
+			 * a LSB's worth of underflow has accumulated.
+			 *
+			 * A simpler approach is to let the result underflow
+			 * i.e. merge the VOI sample into the centroid without
+			 * adjusting the centroid's mu, and rely on the user to
+			 * specify their t-digest with sufficient centroid
+			 * fractional precision such that the accumulation of
+			 * error from multiple underflows is of no material
+			 * consequence to the centroid's final value of mu.
+			 *
+			 * For the moment, the latter approach is employed by
+			 * simply ignoring ERANGE here.
+			 *
+			 * XXXLAS: Per-centroid underflow tracking is likely too
+			 * onerous, but it probably makes sense to accumulate a
+			 * single underflow error variable across all centroids
+			 * and report it as part of the digest to provide
+			 * additional visibility into the digest's fidelity.
+			 */
 			error = error ? error :
 			    Q_QDIVI(&x, ctd32->cnt + weight);
-			if (error || (error = Q_QADDQ(&ctd32->mu, x))) {
+			if ((error && error != ERANGE)
+			    || (error = Q_QADDQ(&ctd32->mu, x))) {
 #ifdef DIAGNOSTIC
 				KASSERT(!error, ("%s: unexpected error %d",
 				    __func__, error));
@@ -3276,7 +3308,9 @@ stats_v1_vsd_tdgst_add(enum vsd_dtype vs_dtype, struct voistatdata_tdgst *tdgst,
 			error = Q_QSUBQ(&x, ctd64->mu);
 			error = error ? error :
 			    Q_QDIVI(&x, ctd64->cnt + weight);
-			if (error || (error = Q_QADDQ(&ctd64->mu, x))) {
+			/* Refer to is32bit ERANGE discussion above. */
+			if ((error && error != ERANGE)
+			    || (error = Q_QADDQ(&ctd64->mu, x))) {
 				KASSERT(!error, ("%s: unexpected error %d",
 				    __func__, error));
 				return (error);



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?202104020246.1322kgH5017795>