linux-kernel - [RFT][PATCH v1 4/5] cpuidle: menu: Eliminate outliers on both ends of the sample set


Message-ID: <2301940.iZASKD2KPV@rjwysocki.net>
Date: Thu, 06 Feb 2025 15:26:41 +0100
From: "Rafael J. Wysocki" <rjw@...ysocki.net>
To: Linux PM <linux-pm@...r.kernel.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
 Daniel Lezcano <daniel.lezcano@...aro.org>,
 Christian Loehle <christian.loehle@....com>,
 Artem Bityutskiy <artem.bityutskiy@...ux.intel.com>,
 Aboorva Devarajan <aboorvad@...ux.ibm.com>
Subject: [RFT][PATCH v1 4/5] cpuidle: menu: Eliminate outliers on both ends of the sample set

From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>

Currently, get_typical_interval() attempts to eliminate outliers at the
high end of the sample set only (probably in order to bias the prediction
toward lower values), but this is problematic because if the outliers are
present at the low end of the sample set, discarding the highest values
will not help to reduce the variance.

Since the presence of outliers at the low end of the sample set is
generally as likely as their presence at the high end of the sample
set, modify get_typical_interval() to treat samples at the largest
distances from the average (on both ends of the sample set) as outliers.

This should increase the likelihood of making a meaningful prediction
in cases where the outliers are at the low end of the sample set.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
---
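For illustration only (not part of the patch): a minimal userspace sketch
of the threshold selection described in the changelog. The names
typical_interval_sketch(), N_SAMPLES and the both_ends flag are
hypothetical, and the acceptance test (variance <= 400 or avg^2 >
36 * variance) is an assumed stand-in for the code between the two hunks,
which is not shown here. Passing both_ends = 0 mimics the old behavior of
discarding the highest values only.

```c
#include <limits.h>
#include <stdint.h>

#define N_SAMPLES 8	/* assumed sample count, standing in for INTERVALS */

/*
 * Hypothetical sketch: return the average of the retained samples, or
 * UINT_MAX if no prediction can be made.
 */
static unsigned int typical_interval_sketch(const unsigned int *samples,
					    int both_ends)
{
	int64_t value, min_thresh = -1, max_thresh = UINT_MAX;

	for (;;) {
		unsigned int max = 0, min = UINT_MAX, divisor = 0;
		uint64_t avg = 0, variance = 0;
		int i;

		for (i = 0; i < N_SAMPLES; i++) {
			value = samples[i];
			/* Skip samples at or beyond either threshold. */
			if (value <= min_thresh || value >= max_thresh)
				continue;

			divisor++;
			avg += (uint64_t)value;
			variance += (uint64_t)value * (uint64_t)value;

			if (value > max)
				max = (unsigned int)value;
			if (value < min)
				min = (unsigned int)value;
		}

		if (!max)
			return UINT_MAX;

		avg /= divisor;
		variance = variance / divisor - avg * avg;

		/* Assumed acceptance test, not taken from the hunks below. */
		if (variance <= 400 ||
		    (variance <= UINT64_MAX / 36 && avg * avg > variance * 36))
			return (unsigned int)avg;

		/* Stop once only 3/4 of the samples (or fewer) remain. */
		if (divisor * 4 <= N_SAMPLES * 3)
			return UINT_MAX;

		/* Exclude whichever extreme lies farther from the average. */
		if (both_ends && avg - min > max - avg)
			min_thresh = min;
		else
			max_thresh = max;
	}
}
```

With the samples { 2, 500, 510, 490, 505, 495, 500, 508 }, the sketch
returns 501 when both_ends is set (the sample 2 is dropped in the first
round) and UINT_MAX when it is not, because dropping 510 and then 508
leaves the low outlier in place and the variance stays large.
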
 drivers/cpuidle/governors/menu.c |   32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -116,30 +116,37 @@
  */
 static unsigned int get_typical_interval(struct menu_device *data)
 {
-	unsigned int max, divisor, thresh = UINT_MAX;
+	s64 value, min_thresh = -1, max_thresh = UINT_MAX;
+	unsigned int max, min, divisor;
 	u64 avg, variance, avg_sq;
 	int i;

 again:
 	/* Compute the average and variance of past intervals. */
 	max = 0;
+	min = UINT_MAX;
 	avg = 0;
 	variance = 0;
 	divisor = 0;
 	for (i = 0; i < INTERVALS; i++) {
-		unsigned int value = data->intervals[i];
-
-		/* Discard data points above or at the threshold. */
-		if (value >= thresh)
+		value = data->intervals[i];
+		/*
+		 * Discard the samples outside the interval between the min and
+		 * max thresholds.
+		 */
+		if (value <= min_thresh || value >= max_thresh)
 			continue;

 		divisor++;

 		avg += value;
-		variance += (u64)value * value;
+		variance += value * value;

 		if (value > max)
 			max = value;
+
+		if (value < min)
+			min = value;
 	}

 	if (!max)
@@ -175,10 +182,10 @@
 	}

 	/*
-	 * If we have outliers to the upside in our distribution, discard
-	 * those by setting the threshold to exclude these outliers, then
+	 * If there are outliers, discard them by setting thresholds to exclude
+	 * data points at a large enough distance from the average, then
 	 * calculate the average and standard deviation again. Once we get
-	 * down to the bottom 3/4 of our samples, stop excluding samples.
+	 * down to the last 3/4 of our samples, stop excluding samples.
 	 *
 	 * This can deal with workloads that have long pauses interspersed
 	 * with sporadic activity with a bunch of short pauses.
@@ -186,7 +193,12 @@
 	if ((divisor * 4) <= INTERVALS * 3)
 		return UINT_MAX;

-	thresh = max;
+	/* Update the thresholds for the next round. */
+	if (avg - min > max - avg)
+		min_thresh = min;
+	else
+		max_thresh = max;
+
 	goto again;
 }