Message-ID: <20240623015057.3383223-3-tj@kernel.org>
Date: Sat, 22 Jun 2024 15:50:21 -1000
From: Tejun Heo <tj@...nel.org>
To: torvalds@...ux-foundation.org
Cc: void@...ifault.com,
mingo@...hat.com,
peterz@...radead.org,
tglx@...utronix.de,
linux-kernel@...r.kernel.org,
kernel-team@...a.com,
Tejun Heo <tj@...nel.org>
Subject: [PATCH 2/3] sched, sched_ext: Open code for_balance_class_range()

For flexibility, sched_ext allows the BPF scheduler to select the CPU to
execute a task on at dispatch time so that e.g. a queue can be shared across
multiple CPUs. To enable this, the dispatch path is executed from balance()
so that a dispatched task can be hot-migrated to its target CPU. This means
that sched_ext needs its balance() method invoked before every
pick_next_task() even when the CPU is waking up from SCHED_IDLE.

for_balance_class_range() defined in kernel/sched/ext.h implements this
selective iteration promotion. However, the indirection obfuscates more than
it helps. Open code the iteration promotion in put_prev_task_balance() and
remove for_balance_class_range().

No functional changes intended.

Signed-off-by: Tejun Heo <tj@...nel.org>
Suggested-by: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: David Vernet <void@...ifault.com>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
---
kernel/sched/core.c | 14 +++++++++++++-
kernel/sched/ext.h | 9 ---------
2 files changed, 13 insertions(+), 10 deletions(-)
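
The iteration promotion described in the commit message can be illustrated
with a small userspace model. This is only a sketch, not kernel code: the
toy_* names are hypothetical, the class order (stop, dl, rt, fair, ext, idle)
is an assumption of the model, and it ignores both the skipping of inactive
classes done by for_active_class_range() and the early exit taken when a
balance() call returns true.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct sched_class; hypothetical, not the kernel type. */
struct toy_class {
	const char *name;
};

/* Indices into toy_classes[], highest priority first (assumed order). */
enum { TOY_STOP, TOY_DL, TOY_RT, TOY_FAIR, TOY_EXT, TOY_IDLE, TOY_NR };

static const struct toy_class toy_classes[TOY_NR] = {
	[TOY_STOP] = { "stop" },
	[TOY_DL]   = { "dl" },
	[TOY_RT]   = { "rt" },
	[TOY_FAIR] = { "fair" },
	[TOY_EXT]  = { "ext" },
	[TOY_IDLE] = { "idle" },
};

/* True if @a has higher priority than @b, i.e. sits earlier in the array. */
static int toy_class_above(const struct toy_class *a, const struct toy_class *b)
{
	return a < b;
}

/*
 * Pick the class the balance pass starts from. If @prev_class is below
 * ext, start from ext instead so that ext always gets a balance() call.
 */
static const struct toy_class *toy_balance_start(const struct toy_class *prev_class)
{
	const struct toy_class *ext = &toy_classes[TOY_EXT];
	const struct toy_class *start_class = prev_class;

	assert(prev_class != NULL);
	if (toy_class_above(ext, start_class))
		start_class = ext;
	return start_class;
}

/*
 * Count the classes visited when walking from the start class down to,
 * but excluding, idle. Assumes no balance() call ends the walk early.
 */
static int toy_classes_visited(const struct toy_class *prev_class)
{
	const struct toy_class *class;
	int visited = 0;

	for (class = toy_balance_start(prev_class);
	     class < &toy_classes[TOY_IDLE]; class++)
		visited++;
	return visited;
}
```

In this model, a previous task in the idle class starts the walk at ext and
visits exactly one class; without the promotion the loop body would not run at
all. A previous task in the fair class is unaffected and visits fair and ext.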
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1092955a7d6e..827e0dc78ea0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5834,7 +5834,19 @@ static void put_prev_task_balance(struct rq *rq, struct task_struct *prev,
struct rq_flags *rf)
{
#ifdef CONFIG_SMP
+ const struct sched_class *start_class = prev->sched_class;
const struct sched_class *class;
+
+#ifdef CONFIG_SCHED_CLASS_EXT
+ /*
+ * SCX requires a balance() call before every pick_next_task() including
+ * when waking up from SCHED_IDLE. If @start_class is below SCX, start
+ * from SCX instead.
+ */
+ if (sched_class_above(&ext_sched_class, start_class))
+ start_class = &ext_sched_class;
+#endif
+
/*
* We must do the balancing pass before put_prev_task(), such
* that when we release the rq->lock the task is in the same
@@ -5843,7 +5855,7 @@ static void put_prev_task_balance(struct rq *rq, struct task_struct *prev,
* We can terminate the balance pass as soon as we know there is
* a runnable task of @class priority or higher.
*/
- for_balance_class_range(class, prev->sched_class, &idle_sched_class) {
+ for_active_class_range(class, start_class, &idle_sched_class) {
if (class->balance(rq, prev, rf))
break;
}
diff --git a/kernel/sched/ext.h b/kernel/sched/ext.h
index 229007693504..1d7837bdfaba 100644
--- a/kernel/sched/ext.h
+++ b/kernel/sched/ext.h
@@ -68,14 +68,6 @@ static inline const struct sched_class *next_active_class(const struct sched_cla
#define for_each_active_class(class) \
for_active_class_range(class, __sched_class_highest, __sched_class_lowest)
-/*
- * SCX requires a balance() call before every pick_next_task() call including
- * when waking up from idle.
- */
-#define for_balance_class_range(class, prev_class, end_class) \
- for_active_class_range(class, (prev_class) > &ext_sched_class ? \
- &ext_sched_class : (prev_class), (end_class))
-
#ifdef CONFIG_SCHED_CORE
bool scx_prio_less(const struct task_struct *a, const struct task_struct *b,
bool in_fi);
@@ -100,7 +92,6 @@ static inline bool task_on_scx(const struct task_struct *p) { return false; }
static inline void init_sched_ext_class(void) {}
#define for_each_active_class for_each_class
-#define for_balance_class_range for_class_range
#endif /* CONFIG_SCHED_CLASS_EXT */
--
2.45.2