Message-ID: <CAKfTPtDnxDV8iykhdaEO8Cj1RKYbh-H+XndRyrGSuqaZrfr21Q@mail.gmail.com>
Date: Mon, 9 Feb 2026 14:21:58 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Qais Yousef <qyousef@...alina.io>
Cc: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com, linux-kernel@...r.kernel.org,
pierre.gondois@....com, kprateek.nayak@....com, hongyan.xia2@....com,
christian.loehle@....com, luis.machado@....com
Subject: Re: [PATCH 3/6 v8] sched/fair: Prepare select_task_rq_fair() to be
called for new cases
On Fri, 6 Feb 2026 at 19:03, Qais Yousef <qyousef@...alina.io> wrote:
>
> On 12/02/25 19:12, Vincent Guittot wrote:
> > Update select_task_rq_fair() so that it can be called outside of the 3
> > current cases, which are:
> > - wake up
> > - exec
> > - fork
> >
> > We want to select a rq in some new cases, like pushing a runnable task to
> > a better CPU than the local one. Such a case is neither a wakeup, nor an
> > exec, nor a fork. We make sure not to disturb these existing cases but
> > still go through EAS and the fast path.
>
> I'd add that we also have the fallback mechanism, where moving a task
> between cpusets can cause it to be placed on a random cpu. We have been
> carrying an out-of-tree hack in Android for a while to make this use the
> wakeup path. Especially on HMP systems, a random cpu can mean a bad
> placement decision, as not all cores are equal. And it seems the server
> market is catching up with quirky caching systems.
I will have a look at this case
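
For reference, if I read the affinity-change path right, the destination CPU
in that case currently comes from something like cpumask_any_and_distribute()
rather than from the class's select_task_rq(). Purely as an illustration (not
the Android change itself, and the helper name below is made up), routing it
through the scheduler could look roughly like this once CFS can handle a call
that is neither a wakeup, an exec nor a fork:

	/*
	 * Hypothetical helper: ask the sched class for a placement when the
	 * allowed mask changes, instead of taking an arbitrary allowed CPU.
	 * wake_flags == 0: neither WF_TTWU, WF_EXEC nor WF_FORK.
	 */
	static int pick_dest_cpu_after_affinity_change(struct task_struct *p,
						       const struct cpumask *new_mask)
	{
		int cpu = select_task_rq(p, task_cpu(p), 0);

		if (!cpumask_test_cpu(cpu, new_mask))
			cpu = cpumask_any_and_distribute(cpu_online_mask, new_mask);

		return cpu;
	}
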
>
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
>
> Reviewed-by: Qais Yousef <qyousef@...alina.io>
>
> > ---
> > kernel/sched/fair.c | 22 ++++++++++++++--------
> > 1 file changed, 14 insertions(+), 8 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index f430ec890b72..80c4131fb35b 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8518,6 +8518,7 @@ static int
> > select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
> > {
> > int sync = (wake_flags & WF_SYNC) && !(current->flags & PF_EXITING);
> > + int want_sibling = !(wake_flags & (WF_EXEC | WF_FORK));
> > struct sched_domain *tmp, *sd = NULL;
> > int cpu = smp_processor_id();
> > int new_cpu = prev_cpu;
> > @@ -8535,16 +8536,21 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
> > if ((wake_flags & WF_CURRENT_CPU) &&
> > cpumask_test_cpu(cpu, p->cpus_ptr))
> > return cpu;
> > + }
> >
> > - if (!is_rd_overutilized(this_rq()->rd)) {
> > - new_cpu = find_energy_efficient_cpu(p, prev_cpu);
> > - if (new_cpu >= 0)
> > - return new_cpu;
> > - new_cpu = prev_cpu;
> > - }
> > + /*
> > + * We don't want EAS to be called for exec or fork but it should be
> > + * called for any other case such as wake up or push callback.
> > + */
> > + if (!is_rd_overutilized(this_rq()->rd) && want_sibling) {
> > + new_cpu = find_energy_efficient_cpu(p, prev_cpu);
> > + if (new_cpu >= 0)
> > + return new_cpu;
> > + new_cpu = prev_cpu;
> > + }
> >
> > + if (wake_flags & WF_TTWU)
> > want_affine = !wake_wide(p) && cpumask_test_cpu(cpu, p->cpus_ptr);
> > - }
> >
> > rcu_read_lock();
> > for_each_domain(cpu, tmp) {
> > @@ -8575,7 +8581,7 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
> > if (unlikely(sd)) {
> > /* Slow path */
> > new_cpu = sched_balance_find_dst_cpu(sd, p, cpu, prev_cpu, sd_flag);
> > - } else if (wake_flags & WF_TTWU) { /* XXX always ? */
> > + } else if (want_sibling) {
> > /* Fast path */
> > new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);
> > }
> > --
> > 2.43.0
> >
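
For illustration only (not part of this patch, and the helper name is made
up): a new caller such as the push case mentioned in the changelog would pass
no WF_TTWU, WF_EXEC or WF_FORK flag. want_sibling then stays set, the
record_wakee()/want_affine handling is skipped, the task still goes through
EAS when the root domain is not overutilized, and otherwise (or when EAS does
not find a CPU) it takes the select_idle_sibling() fast path:

	/* Sketch of a non-wakeup, non-exec, non-fork user of the function. */
	static int pick_cpu_for_push(struct task_struct *p)
	{
		/* wake_flags == 0: not a wakeup, an exec or a fork */
		return select_task_rq_fair(p, task_cpu(p), 0);
	}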