Message-Id: <20251030-b4-follow-up-v2-1-19a23c83b837@os.amperecomputing.com>
Date: Thu, 30 Oct 2025 12:19:29 -0700
From: Shubhang Kaushik via B4 Relay <devnull+shubhang.os.amperecomputing.com@...nel.org>
To: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, 
 Juri Lelli <juri.lelli@...hat.com>, 
 Vincent Guittot <vincent.guittot@...aro.org>, 
 Dietmar Eggemann <dietmar.eggemann@....com>, 
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, 
 Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, 
 Shubhang Kaushik <sh@...two.org>, 
 Shijie Huang <Shijie.Huang@...erecomputing.com>, 
 Frank Wang <zwang@...erecomputing.com>
Cc: Christopher Lameter <cl@...two.org>, 
 Adam Li <adam.li@...erecomputing.com>, linux-kernel@...r.kernel.org, 
 Shubhang Kaushik <shubhang@...amperecomputing.com>
Subject: [PATCH v2] sched/fair: Prefer cache locality for EAS wakeup
From: Shubhang Kaushik <shubhang@...amperecomputing.com>

When Energy Aware Scheduling (EAS) is enabled, a task waking up on a
sibling CPU might migrate away from its previous CPU even if that CPU
is not overutilized. This sacrifices cache locality and introduces
unnecessary migration overhead.

Refine the wakeup heuristic in select_idle_sibling(): if EAS is active
and the task's previous CPU (prev) is not overutilized, prefer waking
the task on prev. This avoids an unneeded migration and preserves
cache hotness.
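
For reference, the decision order in select_idle_sibling() after this
change is roughly the following. This is a condensed sketch, not the
actual code: cpu_idle_and_fits() and full_idle_scan() are shorthand
stand-ins for the real idle/capacity checks and the idle-sibling scan,
and the real code records a cache-affine prev as a fallback candidate
rather than returning it immediately:

	static int select_idle_sibling(struct task_struct *p, int prev, int target)
	{
		/* 1. Use target if it is idle and fits the task's utilization. */
		if (cpu_idle_and_fits(p, target))
			return target;

		/* 2. Prefer a cache-affine, idle prev (condensed; the real
		 *    code remembers it as a fallback candidate). */
		if (prev != target && cpus_share_cache(prev, target) &&
		    cpu_idle_and_fits(p, prev))
			return prev;

		/* 3. New in this patch: under EAS, stay on prev for cache
		 *    locality as long as prev is not overutilized. */
		if (sched_energy_enabled() && !cpu_overutilized(prev))
			return prev;

		/* 4. Otherwise fall back to scanning for an idle sibling. */
		return full_idle_scan(p, target);
	}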
Signed-off-by: Shubhang Kaushik <shubhang@...amperecomputing.com>
---
v2:
- Addressed reviewer comments: handle this special case within the
  selection logic itself, preferring the previous CPU for EAS when it
  is not overutilized.
- Link to v1: https://lore.kernel.org/all/20251017-b4-sched-cfs-refactor-propagate-v1-1-1eb0dc5b19b3@os.amperecomputing.com/
---
 kernel/sched/fair.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 25970dbbb27959bc130d288d5f80677f75f8db8b..ac94463627778f09522fb5420f67b903a694ad4d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7847,9 +7847,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	    asym_fits_cpu(task_util, util_min, util_max, target))
 		return target;
 
-	/*
-	 * If the previous CPU is cache affine and idle, don't be stupid:
-	 */
+	/* Reschedule on an idle, cache-sharing sibling to preserve affinity: */
 	if (prev != target && cpus_share_cache(prev, target) &&
 	    (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
 	    asym_fits_cpu(task_util, util_min, util_max, prev)) {
@@ -7861,6 +7859,14 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 		prev_aff = prev;
 	}
 
+	/*
+	 * If the previous CPU is not overutilized, prefer it for cache locality.
+	 * This prevents migration away from a cache-hot CPU that can still
+	 * handle the task without causing an overload.
+	 */
+	if (sched_energy_enabled() && !cpu_overutilized(prev))
+		return prev;
+
 	/*
 	 * Allow a per-cpu kthread to stack with the wakee if the
 	 * kworker thread and the tasks previous CPUs are the same.
---
base-commit: e53642b87a4f4b03a8d7e5f8507fc3cd0c595ea6
change-id: 20251030-b4-follow-up-ff03b4533a2d
Best regards,
-- 
Shubhang Kaushik <shubhang@...amperecomputing.com>