linux-kernel - Re: [RFC PATCH 5/5] sched/fair: Add push task callback for EAS

Message-ID: <ff2be1d5-ace4-4d8a-9894-4ccc16c84d06@arm.com>
Date: Fri, 13 Sep 2024 18:08:21 +0200
From: Pierre Gondois <pierre.gondois@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>, mingo@...hat.com,
 peterz@...radead.org, juri.lelli@...hat.com, dietmar.eggemann@....com,
 rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
 vschneid@...hat.com, lukasz.luba@....com, rafael.j.wysocki@...el.com,
 linux-kernel@...r.kernel.org
Cc: qyousef@...alina.io, hongyan.xia2@....com
Subject: Re: [RFC PATCH 5/5] sched/fair: Add push task callback for EAS

Hello Vincent,

On 8/30/24 15:03, Vincent Guittot wrote:
> EAS is based on wakeup events to efficiently place tasks on the system, but
> there are cases where a task will not have wakeup events anymore or at a
> far too low pace. For such situations, we can take advantage of the task
> being put back in the enqueued list to check if it should be migrated to
> another CPU. When the task is the only one running on the CPU, the tick
> will check if the task is stuck on this CPU and should migrate to another
> one.
> 
> Wakeup events remain the main way to migrate tasks, but we now detect
> situations where a task is stuck on a CPU by checking that its utilization
> is larger than the max available compute capacity (max cpu capacity or
> uclamp max setting)
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> ---
>   kernel/sched/fair.c  | 211 +++++++++++++++++++++++++++++++++++++++++++
>   kernel/sched/sched.h |   2 +
>   2 files changed, 213 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e46af2416159..41fb18ac118b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c

[...]

> +
> +static inline void check_misfit_cpu(struct task_struct *p, struct rq *rq)
> +{
> +	int new_cpu, cpu = cpu_of(rq);
> +
> +	if (!sched_energy_enabled())
> +		return;
> +
> +	if (WARN_ON(!p))
> +		return;
> +
> +	if (WARN_ON(p != rq->curr))
> +		return;
> +
> +	if (is_migration_disabled(p))
> +		return;
> +
> +	if ((rq->nr_running > 1) || (p->nr_cpus_allowed == 1))

If the goal is to detect tasks that should be migrated to bigger CPUs,
couldn't the check be changed from:
- (p->nr_cpus_allowed == 1)
to
- (p->max_allowed_capacity == arch_scale_cpu_capacity(cpu))
to avoid the case where a task is bound to the little cluster, for instance?

Similar question for update_misfit_status(): doesn't the following check
- (arch_scale_cpu_capacity(cpu) == p->max_allowed_capacity)
already include this case?
- (p->nr_cpus_allowed == 1)
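
To illustrate the difference between the two conditions, here is a minimal
userspace sketch. It is not kernel code: struct task_sketch and both helper
names are hypothetical stand-ins, and it assumes that max_allowed_capacity
holds the capacity of the biggest CPU in the task's affinity mask.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two task_struct fields discussed above. */
struct task_sketch {
	int nr_cpus_allowed;                /* number of CPUs in the affinity mask */
	unsigned long max_allowed_capacity; /* assumed: capacity of the biggest allowed CPU */
};

/* Condition used in the patch: skip only tasks pinned to a single CPU. */
static bool skip_pinned(const struct task_sketch *p)
{
	return p->nr_cpus_allowed == 1;
}

/*
 * Suggested condition: skip tasks whose biggest allowed CPU is no bigger
 * than the current one. cpu_capacity stands in for
 * arch_scale_cpu_capacity(cpu).
 */
static bool skip_no_bigger_cpu(const struct task_sketch *p,
			       unsigned long cpu_capacity)
{
	return p->max_allowed_capacity == cpu_capacity;
}
```

With a task bound to four little CPUs of capacity 512 (an arbitrary value),
skip_pinned() lets the task through while skip_no_bigger_cpu() filters it
out. A task pinned to a single CPU is filtered out by both.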


> +		return;
> +
> +	if (!task_misfit_cpu(p, cpu))
> +		return;

task_misfit_cpu() intends to check whether the task will have an opportunity
to run feec() through wakeups/push-pull.

Shouldn't we also check whether the task fits the CPU with the 20% margin,
using task_fits_cpu()? This would allow migrating the task faster than the
load balancer does.
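
For reference, the 20% margin boils down to the arithmetic of the
fits_capacity() macro. The sketch below is a simplified userspace version:
fits_with_margin() is a hypothetical name, and the uclamp handling that
task_fits_cpu() performs is left out.

```c
#include <stdbool.h>

/*
 * Same arithmetic as the kernel's fits_capacity(cap, max):
 * util fits only if it stays below ~80% of the CPU capacity
 * (1024 / 1280 = 0.8).
 */
static bool fits_with_margin(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}
```

For a CPU of capacity 512 (an arbitrary value), a utilization of 400 fits,
while a utilization of 420 no longer does even though it is still below the
CPU capacity. Such a task would only be caught by the margin check.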


> +
> +	new_cpu = find_energy_efficient_cpu(p, cpu);
> +
> +	if (new_cpu == cpu)
> +		return;
> +
> +	/*
> +	 * ->active_balance synchronizes accesses to
> +	 * ->active_balance_work.  Once set, it's cleared
> +	 * only after active load balance is finished.
> +	 */
> +	if (!rq->active_balance) {
> +		rq->active_balance = 1;
> +		rq->push_cpu = new_cpu;
> +	} else
> +		return;
> +
> +	raw_spin_rq_unlock(rq);
> +	stop_one_cpu_nowait(cpu,
> +		active_load_balance_cpu_stop, rq,
> +		&rq->active_balance_work);
> +	raw_spin_rq_lock(rq);
> +}
> +

Regards,
Pierre