[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <65c19534-6eaa-a7c5-2131-b9e6cea8e7c9@amd.com>
Date: Thu, 13 Jun 2024 10:10:26 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: <zhangwei123171@...il.com>
CC: <linux-kernel@...r.kernel.org>, zhangwei123171 <zhangwei123171@...com>,
<mingo@...hat.com>, <peterz@...radead.org>, <juri.lelli@...hat.com>,
<vincent.guittot@...aro.org>, <dietmar.eggemann@....com>,
<rostedt@...dmis.org>
Subject: Re: [PATCH] sched/fair: prefer available idle cpu in select_idle_core
Hello there,
On 6/12/2024 5:24 PM, zhangwei123171@...il.com wrote:
> From: zhangwei123171 <zhangwei123171@...com>
>
> When the idle core cannot be found, the first sched idle cpu
> or first available idle cpu will be used if exsit.
>
> We can use the available idle cpu detected later to ensure it
> can be used if exsit.
Is there any particular advantage of the same? Based on my understanding
the check exists to prevent unnecessary calls to cpumask_test_cpu() if
an idle CPU is already found. On a large core count system with a large
number of cores in the LLC domain, this may result in a lot more calls
to cpumask_test_cpu() if only one core is in fact idle and there is a
storm of wakeups.
For SMT-2 system, I believe any idle thread on a busy core would be the
same (if we consider all task to have same behavior). On a larger SMT
system, it takes more overhead to consider which core is the most idle.
Consider the following case:
o CPUs of core: 0-7; Only CPU1 is busy (i is idle, b is busy)
+---+---+---+---+---+---+---+---+
| i | b | i | i | i | i | i | i |
+---+---+---+---+---+---+---+---+
^
select idle core bails out at first busy CPU which is CPU1 however
this core is only 1/8th busy.
o CPUs of core: 8-15; CPU10 to CPU15 are busy (i is idle, b is busy)
+---+---+---+---+---+---+---+---+
| i | i | b | b | b | b | b | b |
+---+---+---+---+---+---+---+---+
^
select idle core bails out at first busy CPU which is CPU10 however
this core is in fact 5/8th busy.
Technically, core with CPU0 is better but with your change, we'll select
core of CPU8. Bottom line being, there does not seem to exist a good
case where selecting the last idle thread is better than selecting the
first one. The best the scheduler can do is reduce the number of calls
to cpumask_test_cpu() once an idle CPU is found unless it decides to
scan all the CPUs of the core to find the core which is the idlest and
in a large, busy system, that is a big hammer.
Thoughts?
>
> Signed-off-by: zhangwei123171 <zhangwei123171@...com>
> ---
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 41b58387023d..653ca3ea09b6 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7341,7 +7341,7 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
> }
> break;
> }
> - if (*idle_cpu == -1 && cpumask_test_cpu(cpu, cpus))
> + if (cpumask_test_cpu(cpu, cpus))
> *idle_cpu = cpu;
> }
>
--
Thanks and Regards,
Prateek
Powered by blists - more mailing lists