linux-kernel - Re: [PATCH] sched/fair: prefer available idle cpu in select_idle

Message-ID: <65c19534-6eaa-a7c5-2131-b9e6cea8e7c9@amd.com>
Date: Thu, 13 Jun 2024 10:10:26 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: <zhangwei123171@...il.com>
CC: <linux-kernel@...r.kernel.org>, zhangwei123171 <zhangwei123171@...com>,
	<mingo@...hat.com>, <peterz@...radead.org>, <juri.lelli@...hat.com>,
	<vincent.guittot@...aro.org>, <dietmar.eggemann@....com>,
	<rostedt@...dmis.org>
Subject: Re: [PATCH] sched/fair: prefer available idle cpu in select_idle_core

Hello there,

On 6/12/2024 5:24 PM, zhangwei123171@...il.com wrote:
> From: zhangwei123171 <zhangwei123171@...com>
> 
> When the idle core cannot be found, the first sched idle cpu
> or first available idle cpu will be used if it exists.
> 
> We can use the available idle cpu detected later to ensure it
> can be used if it exists.

Is there any particular advantage to doing so? Based on my understanding,
the check exists to prevent unnecessary calls to cpumask_test_cpu() once
an idle CPU has already been found. On a high core count system with a
large number of cores in the LLC domain, dropping the check may result in
a lot more calls to cpumask_test_cpu() if only one core is in fact idle
and there is a storm of wakeups.

For an SMT-2 system, I believe any idle thread on a busy core would be
the same (if we consider all tasks to have the same behavior). On a larger
SMT system, it takes more overhead to work out which core is the most
idle. Consider the following cases:

o CPUs of core: 0-7; Only CPU1 is busy (i is idle, b is busy)

   +---+---+---+---+---+---+---+---+
   | i | b | i | i | i | i | i | i |
   +---+---+---+---+---+---+---+---+
         ^
   select_idle_core() bails out at the first busy CPU, which is CPU1;
   however, this core is only 1/8th busy.

o CPUs of core: 8-15; CPU10 to CPU15 are busy (i is idle, b is busy)

   +---+---+---+---+---+---+---+---+
   | i | i | b | b | b | b | b | b |
   +---+---+---+---+---+---+---+---+
             ^
   select_idle_core() bails out at the first busy CPU, which is CPU10;
   however, this core is in fact 6/8th busy.
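
To make the two cases concrete, here is a minimal user-space sketch that
replays them with and without the proposed change. It is a simplified
model, not the kernel code: the busy[] array, the allowed() helper, the
mask_tests counter, and pick_idle_cpu() are all hypothetical names, the
sched_idle_cpu() path is left out, and the task is assumed to be allowed
on every CPU. The "keep looking if nothing is recorded yet" branch reflects
my reading of the surrounding loop and should be treated as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   16
#define SMT_WIDTH 8

/* true = busy. Layout copied from the two diagrams: CPU1 and CPU10-15. */
static const bool busy[NR_CPUS] = {
	false, true,  false, false, false, false, false, false,
	false, false, true,  true,  true,  true,  true,  true,
};

/* Hypothetical counter standing in for calls to cpumask_test_cpu(). */
int mask_tests;

/* Assumption: the task may run on every CPU, so the test always passes. */
static bool allowed(int cpu)
{
	(void)cpu;
	mask_tests++;
	return true;
}

/*
 * Simplified model of the per-core scan. Returns true if every sibling
 * is idle. "patched" drops the "*idle_cpu == -1" guard, as in the diff.
 */
static bool scan_core(int first, bool patched, int *idle_cpu)
{
	bool idle = true;

	assert(first >= 0 && first + SMT_WIDTH <= NR_CPUS);
	for (int cpu = first; cpu < first + SMT_WIDTH; cpu++) {
		if (busy[cpu]) {
			idle = false;
			if (*idle_cpu == -1)
				continue;	/* nothing recorded yet, keep looking */
			break;			/* bail out at the busy CPU */
		}
		if ((patched || *idle_cpu == -1) && allowed(cpu))
			*idle_cpu = cpu;
	}
	return idle;
}

/* Scan core 0-7, then core 8-15, and report the CPU that ends up chosen. */
int pick_idle_cpu(bool patched)
{
	int idle_cpu = -1;

	mask_tests = 0;
	for (int core = 0; core < NR_CPUS; core += SMT_WIDTH)
		if (scan_core(core, patched, &idle_cpu))
			return core;	/* fully idle core found */
	return idle_cpu;
}
```

In this model, pick_idle_cpu(false) settles on CPU0 after a single mask
test, while pick_idle_cpu(true) ends on CPU9 after three.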

Technically, the core with CPU0 is better, but with your change we'll
select the core of CPU8. The bottom line is that there does not seem to
be a good case where selecting the last idle thread is better than
selecting the first one. The best the scheduler can do is reduce the
number of calls to cpumask_test_cpu() once an idle CPU is found, unless
it decides to scan all the CPUs of every core to find the idlest core,
and on a large, busy system that is a big hammer.
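
For comparison, a rough sketch of what the "big hammer" could look like:
count the busy siblings of every core and keep the core with the fewest.
This is purely illustrative and not something the scheduler does today;
busy[], busy_threads(), idlest_core(), and the cpus_visited counter are
hypothetical names, and the layout is the one from the two diagrams.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   16
#define SMT_WIDTH 8

/* true = busy. Layout copied from the two diagrams: CPU1 and CPU10-15. */
static const bool busy[NR_CPUS] = {
	false, true,  false, false, false, false, false, false,
	false, false, true,  true,  true,  true,  true,  true,
};

/* Hypothetical counter of how many CPUs the scan had to inspect. */
int cpus_visited;

/* Count the busy SMT siblings of the core whose first CPU is "first". */
static int busy_threads(int first)
{
	int n = 0;

	assert(first >= 0 && first + SMT_WIDTH <= NR_CPUS);
	for (int cpu = first; cpu < first + SMT_WIDTH; cpu++) {
		cpus_visited++;
		if (busy[cpu])
			n++;
	}
	return n;
}

/*
 * Return the first CPU of the core with the fewest busy siblings.
 * Ties go to the lower-numbered core. No early exit is possible:
 * every sibling of every core has to be inspected.
 */
int idlest_core(void)
{
	int best = -1, best_busy = SMT_WIDTH + 1;

	cpus_visited = 0;
	for (int core = 0; core < NR_CPUS; core += SMT_WIDTH) {
		int n = busy_threads(core);

		if (n < best_busy) {
			best_busy = n;
			best = core;
		}
	}
	return best;
}
```

Counting the b entries in the diagrams gives one busy sibling for the
first core and six for the second, so idlest_core() picks the core of
CPU0, but only after visiting all 16 CPUs.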

Thoughts?

> 
> Signed-off-by: zhangwei123171 <zhangwei123171@...com>
> ---
>   kernel/sched/fair.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 41b58387023d..653ca3ea09b6 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7341,7 +7341,7 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
>   			}
>   			break;
>   		}
> -		if (*idle_cpu == -1 && cpumask_test_cpu(cpu, cpus))
> +		if (cpumask_test_cpu(cpu, cpus))
>   			*idle_cpu = cpu;
>   	}
>   

--
Thanks and Regards,
Prateek