linux-kernel - Re: [PATCH] sched/fair: optimize should_we


Message-ID: <e96c04b2-0c63-74f7-b7ca-357177ef3eb7@linux.vnet.ibm.com>
Date:   Wed, 6 Sep 2023 07:18:09 +0530
From:   Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     mingo@...hat.com, peterz@...radead.org, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, vschneid@...hat.com,
        linux-kernel@...r.kernel.org, srikar@...ux.vnet.ibm.com,
        mgorman@...hsingularity.net, yu.c.chen@...el.com,
        ricardo.neri-calderon@...ux.intel.com, iamjoonsoo.kim@....com,
        tim.c.chen@...ux.intel.com, juri.lelli@...hat.com,
        rocking@...ux.alibaba.com, joshdon@...gle.com
Subject: Re: [PATCH] sched/fair: optimize should_we_balance for higher SMT
 systems



On 9/2/23 4:28 PM, Ingo Molnar wrote:
> 
> * Shrikanth Hegde <sshegde@...ux.vnet.ibm.com> wrote:
> 
>> should_we_balance is called in load_balance to find out if the CPU that
>> is trying to do the load balance is the right one or not.
>> Commit b1bfeab9b002 ("sched/fair: Consider the idle state of the whole
>> core for load balance") tries to find an idle core to do the load balancing
>> and falls back on an idle sibling CPU if there is no idle core.
>>
>> However, on larger SMT systems, it could be needlessly iterating to find an
>> idle CPU by scanning all the CPUs in a non-idle core. If the core is not
>> idle and the first idle SMT sibling has been found, then there is no need
>> to check the other SMT siblings for idleness.
>>
>> Let's say in SMT4, Core0 has CPUs 0,2,4,6, and CPU0 is BUSY and the rest
>> are IDLE. The balancing domain is MC/DIE. CPU2 will be set as the first
>> idle_smt, and the same process would be repeated for CPU4 and CPU6, but
>> this is unnecessary. Since calling is_core_idle loops through all CPUs in
>> the SMT mask, the effect is multiplied by the weight of smt_mask. For
>> example, when, say, 1 CPU is busy, we would skip the loop for 2 CPUs and
>> so skip iterating over 8 CPUs. That effect would be larger in the DIE/NUMA
>> domains, where there are more cores.
>>
>> Testing and performance evaluation
>> The test has been done on this system, which has 12 cores, i.e. 24 small
>> cores with SMT=4
>> lscpu
>> Architecture:            ppc64le
>>   Byte Order:            Little Endian
>> CPU(s):                  96
>>   On-line CPU(s) list:   0-95
>> Model name:              POWER10 (architected), altivec supported
>>   Thread(s) per core:    8
> 
> Ok, so the performance figures are pretty convincing, and the approach
> is fairly simple - so I've applied your patch to tip:sched/urgent,
> to address the performance regression caused by b1bfeab9b002.
> 
> Thanks,
> 
> 	Ingo

Thank you Ingo.
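
For readers following the thread, the idea in the quoted changelog can be
illustrated with a small user-space model. This is only a sketch under stated
assumptions, not the kernel code: the 8-CPU topology, the bitmask
representation, and the names smt_mask(), busy[], core_idle_calls and
pick_balancer() are hypothetical, and only is_core_idle and idle_smt are names
taken from the changelog. The model returns the chosen CPU and omits the other
checks the real function performs. With prune set, the siblings of a non-idle
core are dropped from the scan as soon as its first idle CPU is recorded.

```c
#include <stdbool.h>

#define NR_CPUS 8

/* Hypothetical SMT4 topology: Core0 = {0,2,4,6}, Core1 = {1,3,5,7}. */
static unsigned int smt_mask(int cpu)
{
	return (cpu % 2 == 0) ? 0x55u : 0xAAu;
}

static bool busy[NR_CPUS];	/* true if the CPU is running something */
static int core_idle_calls;	/* how many times is_core_idle() ran */

/* A core is idle only if every sibling of @cpu is idle. */
static bool is_core_idle(int cpu)
{
	core_idle_calls++;
	for (int s = 0; s < NR_CPUS; s++) {
		if (s == cpu || !(smt_mask(cpu) & (1u << s)))
			continue;
		if (busy[s])
			return false;
	}
	return true;
}

/*
 * Scan @group for a CPU that should do the balancing: prefer a CPU on a
 * fully idle core, otherwise fall back to the first idle SMT sibling.
 * Returns -1 if no CPU in @group is idle.
 */
static int pick_balancer(unsigned int group, bool prune)
{
	int idle_smt = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(group & (1u << cpu)) || busy[cpu])
			continue;

		if (!is_core_idle(cpu)) {
			if (idle_smt == -1)
				idle_smt = cpu;
			/* The core is known to be non-idle: skip its siblings. */
			if (prune)
				group &= ~smt_mask(cpu);
			continue;
		}
		return cpu;	/* first idle CPU on an idle core */
	}
	return idle_smt;
}
```

In the changelog's example (Core0 only, CPU0 busy, so busy[0] = true and
group = 0x55), both variants pick CPU2, but the unpruned scan calls
is_core_idle() for CPU2, CPU4 and CPU6, while the pruned scan calls it once.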