Message-ID: <20251203090042.1804-1-hdanton@sina.com>
Date: Wed, 3 Dec 2025 17:00:40 +0800
From: Hillf Danton <hdanton@...a.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: peterz@...radead.org,
linux-kernel@...r.kernel.org,
pierre.gondois@....com,
kprateek.nayak@....com,
qyousef@...alina.io,
christian.loehle@....com,
luis.machado@....com
Subject: Re: [RFC PATCH 6/6 v7] sched/fair: Add EAS and idle cpu push trigger
On Tue, 2 Dec 2025 14:01:39 +0100 Vincent Guittot wrote:
>On Tue, 2 Dec 2025 at 10:45, Hillf Danton <hdanton@...a.com> wrote:
>> On Mon, 1 Dec 2025 10:13:08 +0100 Vincent Guittot wrote:
>> > EAS is based on wakeup events to efficiently place tasks on the system, but
>> > there are cases where a task doesn't have wakeup events anymore or at a far
>> > too low pace. For such cases, we check if it's worth pushing the task to
>> > another CPU instead of putting it back in the enqueued list.
>> >
>> > Wake up events remain the main way to migrate tasks but we now detect
>> > situations where a task is stuck on a CPU by checking that its utilization
>> > is larger than the max available compute capacity (max cpu capacity or
>> > uclamp max setting)
>> >
>> > When the system becomes overutilized and some CPUs are idle, we try to
>> > push tasks instead of waiting for periodic load balance.
>> >
>> > Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
>> > ---
>> > kernel/sched/fair.c | 65 +++++++++++++++++++++++++++++++++++++++++
>> > kernel/sched/topology.c | 3 ++
>> > 2 files changed, 68 insertions(+)
>> >
>> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> > index 9af8d0a61856..e9e1d0c05805 100644
>> > --- a/kernel/sched/fair.c
>> > +++ b/kernel/sched/fair.c
>> > @@ -6990,6 +6990,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>> > }
>> >
>> > static void fair_remove_pushable_task(struct rq *rq, struct task_struct *p);
>> > +
>> > /*
>> > * Basically dequeue_task_fair(), except it can deal with dequeue_entity()
>> > * failing half-way through and resume the dequeue later.
>> > @@ -8499,8 +8500,72 @@ static inline bool sched_push_task_enabled(void)
>> > return static_branch_unlikely(&sched_push_task);
>> > }
>> >
>> > +static inline bool task_stuck_on_cpu(struct task_struct *p, int cpu)
>> > +{
>> > + unsigned long max_capa, util;
>> > +
>> > + max_capa = min(get_actual_cpu_capacity(cpu),
>> > + uclamp_eff_value(p, UCLAMP_MAX));
>> > + util = max(task_util_est(p), task_runnable(p));
>> > +
>> > + /*
>> > + * Return true only if the task might not sleep/wakeup because of a low
>> > + * compute capacity. Tasks, which wake up regularly, will be handled by
>> > + * feec().
>> > + */
>> > + return (util > max_capa);
>> > +}
>> > +
>> > +static inline bool sched_energy_push_task(struct task_struct *p, struct rq *rq)
>> > +{
>> > + if (!sched_energy_enabled())
>> > + return false;
>> > +
>> > + if (is_rd_overutilized(rq->rd))
>> > + return false;
>> > +
>> > + if (task_stuck_on_cpu(p, cpu_of(rq)))
>> > + return true;
>> > +
>> > + if (!task_fits_cpu(p, cpu_of(rq)))
>> > + return true;
>> > +
>> > + return false;
>> > +}
>> > +
>> > +static inline bool sched_idle_push_task(struct task_struct *p, struct rq *rq)
>> > +{
>> > + if (rq->nr_running == 1)
>> > + return false;
>> > +
>> > + if (!is_rd_overutilized(rq->rd))
>> > + return false;
>> > +
>> > + /* If there are idle cpus in the llc then try to push the task on it */
>> > + if (test_idle_cores(cpu_of(rq)))
>> > + return true;
>> > +
>> > + return false;
>> > +}
>> > +
>> > +
>> > static bool fair_push_task(struct rq *rq, struct task_struct *p)
>> > {
>> > + if (!task_on_rq_queued(p))
>> > + return false;
>>
>> Task is queued on rq.
>> > +
>> > + if (p->se.sched_delayed)
>> > + return false;
>> > +
>> > + if (p->nr_cpus_allowed == 1)
>> > + return false;
>> > +
>> > + if (sched_energy_push_task(p, rq))
>> > + return true;
>>
>> If task is stuck on CPU, it could not be on rq. Weird.
>
> Maybe it comes from my description, and I should use task_stuck_on_rq.
> By stuck, I mean that the task doesn't have any opportunity to migrate
> to another cpu/rq and stays "forever" (at least until its next sleep) on
> this cpu/rq because load balancing is disabled/bypassed w/ EAS.
> Here, stuck does not mean blocked/sleeping.
>
Given the task is queued on the rq, I find the correct word, stack, in the
cover letter instead of stuck, and long-standing stacked tasks mean the load
balancer fails to cure that stacking. 1/7 fixes that failure, no?
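
For reference, the condition under discussion can be sketched in user space
as below. This is only an illustration of the comparison made by
task_stuck_on_cpu() in the quoted hunk, not kernel code: struct task_sample,
min_ul(), max_ul() and task_stuck_sketch() are hypothetical names, and the
plain integers stand in for the values the patch reads through
task_util_est(), task_runnable(), uclamp_eff_value() and
get_actual_cpu_capacity().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the per-task values the quoted patch reads. */
struct task_sample {
	unsigned long util_est;   /* stands in for task_util_est(p) */
	unsigned long runnable;   /* stands in for task_runnable(p) */
	unsigned long uclamp_max; /* stands in for uclamp_eff_value(p, UCLAMP_MAX) */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * Mirrors the comparison in the quoted hunk: the task counts as "stuck"
 * when the larger of its estimated utilization and runnable signal exceeds
 * the smaller of the CPU capacity and its uclamp max.
 */
static bool task_stuck_sketch(const struct task_sample *t,
			      unsigned long cpu_capacity)
{
	unsigned long max_capa = min_ul(cpu_capacity, t->uclamp_max);
	unsigned long util = max_ul(t->util_est, t->runnable);

	return util > max_capa;
}

/* Small self-check with made-up numbers on the usual 0..1024 scale. */
static void task_stuck_sketch_selftest(void)
{
	struct task_sample t = { .util_est = 300, .runnable = 600,
				 .uclamp_max = 1024 };

	assert(task_stuck_sketch(&t, 512));   /* 600 > min(512, 1024) */
	assert(!task_stuck_sketch(&t, 1024)); /* 600 <= min(1024, 1024) */
}
```

With these made-up numbers the task is queued and runnable the whole time;
the predicate only says that its demand exceeds what the CPU (or its uclamp
max) can provide, which is the sense of "stuck" explained in the quoted reply.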