Message-ID: <ZPMVcTFmtvshJRYH@gmail.com>
Date: Sat, 2 Sep 2023 12:58:57 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>
Cc: mingo@...hat.com, peterz@...radead.org, vincent.guittot@...aro.org,
dietmar.eggemann@....com, vschneid@...hat.com,
linux-kernel@...r.kernel.org, srikar@...ux.vnet.ibm.com,
mgorman@...hsingularity.net, yu.c.chen@...el.com,
ricardo.neri-calderon@...ux.intel.com, iamjoonsoo.kim@....com,
tim.c.chen@...ux.intel.com, juri.lelli@...hat.com,
rocking@...ux.alibaba.com, joshdon@...gle.com
Subject: Re: [PATCH] sched/fair: optimize should_we_balance for higher SMT
systems
* Shrikanth Hegde <sshegde@...ux.vnet.ibm.com> wrote:
> should_we_balance is called in load_balance to find out if the CPU that
> is trying to do the load balance is the right one or not.
> Commit b1bfeab9b002 ("sched/fair: Consider the idle state of the whole
> core for load balance") tries to find an idle core to do the load
> balancing, and falls back on an idle sibling CPU if there is no idle core.
>
> However, on larger SMT systems, it could be needlessly iterating to find
> an idle CPU by scanning all the CPUs of a non-idle core. If the core is
> not idle and the first idle SMT sibling has already been found, there is
> no need to check the other SMT siblings for idleness.
>
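For illustration, the scan can be modeled in plain user-space C roughly as
below. This is only a sketch of the idea, not the kernel code: is_core_idle
and idle_smt follow the description above, while the array-based topology,
pick_balance_cpu() and the iteration counters are invented for the model.

#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS		16
#define SMT_WEIGHT	4	/* SMT4 */

static bool cpu_busy[NR_CPUS];

static int core_of(int cpu) { return cpu / SMT_WEIGHT; }

/* Models is_core_idle(): every sibling in the core's smt_mask must be idle. */
static bool core_is_idle(int cpu, int *iterations)
{
	int first = core_of(cpu) * SMT_WEIGHT;

	for (int s = first; s < first + SMT_WEIGHT; s++) {
		(*iterations)++;
		if (cpu_busy[s])
			return false;
	}
	return true;
}

/*
 * Models the should_we_balance() scan: prefer a CPU on a fully idle core,
 * remember the first idle SMT sibling (idle_smt) as a fallback. With
 * @skip_busy_core_siblings set, once a non-idle core has provided that
 * fallback, its remaining siblings are not re-checked.
 */
static int pick_balance_cpu(bool skip_busy_core_siblings, int *iterations)
{
	int idle_smt = -1, skip_core = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu_busy[cpu])
			continue;

		if (skip_busy_core_siblings && core_of(cpu) == skip_core)
			continue;

		if (!core_is_idle(cpu, iterations)) {
			if (idle_smt == -1)
				idle_smt = cpu;
			skip_core = core_of(cpu);
			continue;
		}
		return cpu;		/* found a CPU on an idle core */
	}
	return idle_smt;		/* no idle core: use the fallback */
}

int main(void)
{
	int old_iters = 0, new_iters = 0;

	/* Make the last sibling of every core busy, so no core is idle. */
	for (int core = 0; core < NR_CPUS / SMT_WEIGHT; core++)
		cpu_busy[core * SMT_WEIGHT + SMT_WEIGHT - 1] = true;

	int cpu_old = pick_balance_cpu(false, &old_iters);
	int cpu_new = pick_balance_cpu(true, &new_iters);

	printf("old scan: picked CPU%d, %d sibling checks\n", cpu_old, old_iters);
	printf("new scan: picked CPU%d, %d sibling checks\n", cpu_new, new_iters);
	return 0;
}

Both variants pick the same fallback CPU; the optimization only drops the
redundant is_core_idle() walks over siblings of a core already known to be
busy. (The kernel code of course works on cpumask-based scan masks rather
than a flat array.)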
> Let's say in SMT4, Core0 has CPUs 0,2,4,6 and CPU0 is busy while the rest
> are idle. The balancing domain is MC/DIE. CPU2 will be set as the first
> idle_smt, and the same process would be repeated for CPU4 and CPU6, but
> this is unnecessary. Since calling is_core_idle loops through all CPUs in
> the SMT mask, the effect is multiplied by the weight of smt_mask. For
> example, when 1 CPU is busy, we would skip the loop for 2 CPUs and thus
> skip iterating over 8 CPUs. The effect would be even larger in the
> DIE/NUMA domains, where there are more cores.
>
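Spelling out the arithmetic of that SMT4 scenario (illustrative only, using
the numbers quoted above):

#include <stdio.h>

int main(void)
{
	int smt_weight = 4;	/* Core0 siblings: CPUs 0,2,4,6 */
	int idle_siblings = 3;	/* CPU2, CPU4, CPU6 are idle; CPU0 is busy */

	/*
	 * CPU4 and CPU6 need not be re-checked once CPU2 is recorded as
	 * idle_smt; each skipped check would have walked the whole 4-CPU
	 * smt_mask inside is_core_idle().
	 */
	printf("skipped CPU iterations: %d\n", (idle_siblings - 1) * smt_weight);
	return 0;	/* prints 8, i.e. 2 * 4 */
}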
> Testing and performance evaluation:
> The test has been done on this system, which has 12 cores, i.e. 24 small
> cores with SMT=4.
> lscpu
> Architecture: ppc64le
> Byte Order: Little Endian
> CPU(s): 96
> On-line CPU(s) list: 0-95
> Model name: POWER10 (architected), altivec supported
> Thread(s) per core: 8
Ok, so the performance figures are pretty convincing, and the approach
is fairly simple - so I've applied your patch to tip:sched/urgent,
to address the performance regression caused by b1bfeab9b002.
Thanks,
Ingo