[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230207051105.11575-21-ricardo.neri-calderon@linux.intel.com>
Date: Mon, 6 Feb 2023 21:11:01 -0800
From: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
To: "Peter Zijlstra (Intel)" <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>
Cc: Ricardo Neri <ricardo.neri@...el.com>,
"Ravi V. Shankar" <ravi.v.shankar@...el.com>,
Ben Segall <bsegall@...gle.com>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Len Brown <len.brown@...el.com>, Mel Gorman <mgorman@...e.de>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Steven Rostedt <rostedt@...dmis.org>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Valentin Schneider <vschneid@...hat.com>,
Lukasz Luba <lukasz.luba@....com>,
Ionela Voinescu <ionela.voinescu@....com>, x86@...nel.org,
"Joel Fernandes (Google)" <joel@...lfernandes.org>,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
"Tim C . Chen" <tim.c.chen@...el.com>
Subject: [PATCH v3 20/24] sched/fair: Introduce sched_smt_siblings_idle()
X86 needs to know the idle state of the SMT siblings of a CPU to improve
the accuracy of IPCC classification. X86 implements support for IPC classes
in the thermal HFI driver.
Rename is_core_idle() as sched_smt_siblings_idle() and make it available
outside the scheduler code.
Cc: Ben Segall <bsegall@...gle.com>
Cc: Daniel Bristot de Oliveira <bristot@...hat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Len Brown <len.brown@...el.com>
Cc: Mel Gorman <mgorman@...e.de>
Cc: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Tim C. Chen <tim.c.chen@...el.com>
Cc: Valentin Schneider <vschneid@...hat.com>
Cc: x86@...nel.org
Cc: linux-kernel@...r.kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
---
is_core_idle() is no longer an inline function after this patch. To rule
out performance degradation, I compared the execution time of the inline
and non-inline versions on a 4-socket Cascade Lake system using the NUMA
stressor of stress-ng:
$ stress-ng --numa 1500 -t 10m
is_core_idle() was called ~200,000 times. I measured the value of the TSC
counter before and after calling is_core_idle() and computed the delta
value.
I arbitrarily removed outliers (defined as any delta larger than 5000
counts). This required removing ~40 samples.
The table below summarizes the difference in execution time. All quantities
are expressed in TSC counts, except the standard deviation, expressed as a
percentage of the average.
Average Median Std(%) Mode
TSCdelta inline 668.76 626 67.24 42
TSCdelta non-inline 677.64 624 67.67 46
All metrics are similar for the inline and non-inline cases.
---
Changes since v2:
* Brought back this previously dropped patch.
* Profiled inline vs non-inline is_core_idle(). I found not major penalty.
* Merged is_core_idle() and sched_smt_siblings_idle() into a single
function. (Dietmar)
Changes since v1:
* Dropped this patch.
---
include/linux/sched.h | 2 ++
kernel/sched/fair.c | 21 +++++++++++++++------
2 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 45f28a601b3d..7ef9fd84e7ad 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2449,4 +2449,6 @@ static inline void sched_core_fork(struct task_struct *p) { }
extern void sched_set_stop_task(int cpu, struct task_struct *stop);
+extern bool sched_smt_siblings_idle(int cpu);
+
#endif
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d3c22dc145f7..a66d86c5cb5c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1064,7 +1064,14 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
* Scheduling class queueing methods:
*/
-static inline bool is_core_idle(int cpu)
+/**
+ * sched_smt_siblings_idle - Check whether SMT siblings of a CPU are idle
+ * @cpu: The CPU to check
+ *
+ * Returns true if all the SMT siblings of @cpu are idle or @cpu does not have
+ * SMT siblings. The idle state of @cpu is not considered.
+ */
+bool sched_smt_siblings_idle(int cpu)
{
#ifdef CONFIG_SCHED_SMT
int sibling;
@@ -1767,7 +1774,7 @@ static inline int numa_idle_core(int idle_core, int cpu)
* Prefer cores instead of packing HT siblings
* and triggering future load balancing.
*/
- if (is_core_idle(cpu))
+ if (sched_smt_siblings_idle(cpu))
idle_core = cpu;
return idle_core;
@@ -9518,7 +9525,8 @@ sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sg_lb_stats *sgs
* If the destination CPU has SMT siblings, env->idle != CPU_NOT_IDLE
* is not sufficient. We need to make sure the whole core is idle.
*/
- if (sds->local->flags & SD_SHARE_CPUCAPACITY && !is_core_idle(env->dst_cpu))
+ if (sds->local->flags & SD_SHARE_CPUCAPACITY &&
+ !sched_smt_siblings_idle(env->dst_cpu))
return false;
/* Only do SMT checks if either local or candidate have SMT siblings. */
@@ -10687,7 +10695,8 @@ static struct rq *find_busiest_queue(struct lb_env *env,
sched_asym_prefer(i, env->dst_cpu) &&
nr_running == 1) {
if (env->sd->flags & SD_SHARE_CPUCAPACITY ||
- (!(env->sd->flags & SD_SHARE_CPUCAPACITY) && is_core_idle(i)))
+ (!(env->sd->flags & SD_SHARE_CPUCAPACITY) &&
+ sched_smt_siblings_idle(i)))
continue;
}
@@ -10816,7 +10825,7 @@ asym_active_balance(struct lb_env *env)
* busy sibling.
*/
return sched_asym_prefer(env->dst_cpu, env->src_cpu) ||
- !is_core_idle(env->src_cpu);
+ !sched_smt_siblings_idle(env->src_cpu);
}
return false;
@@ -11563,7 +11572,7 @@ static void nohz_balancer_kick(struct rq *rq)
*/
if (sd->flags & SD_SHARE_CPUCAPACITY ||
(!(sd->flags & SD_SHARE_CPUCAPACITY) &&
- is_core_idle(i))) {
+ sched_smt_siblings_idle(i))) {
flags = NOHZ_STATS_KICK | NOHZ_BALANCE_KICK;
goto unlock;
}
--
2.25.1
Powered by blists - more mailing lists