linux-kernel - Re: [RESEND PATCH v5 2/2] sched/fair: Scan cluster before scanning LLC in wake-up path

Message-ID: <0af67cbc-10a3-b5d4-d80a-e878706f8be5@huawei.com>
Date:   Thu, 21 Jul 2022 20:42:12 +0800
From:   Yicong Yang <yangyicong@...wei.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Barry Song <21cnbao@...il.com>
CC:     <yangyicong@...ilicon.com>, Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        "Gautham R. Shenoy" <gautham.shenoy@....com>,
        LKML <linux-kernel@...r.kernel.org>,
        LAK <linux-arm-kernel@...ts.infradead.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        <prime.zeng@...wei.com>,
        Jonathan Cameron <jonathan.cameron@...wei.com>,
        <ego@...ux.vnet.ibm.com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        Linuxarm <linuxarm@...wei.com>,
        Guodong Xu <guodong.xu@...aro.org>,
        <hesham.almatary@...wei.com>, <john.garry@...wei.com>,
        Yang Shen <shenyang39@...wei.com>, <kprateek.nayak@....com>,
        Chen Yu <yu.c.chen@...el.com>, <wuyun.abel@...edance.com>
Subject: Re: [RESEND PATCH v5 2/2] sched/fair: Scan cluster before scanning
 LLC in wake-up path

On 2022/7/21 18:33, Peter Zijlstra wrote:
> On Thu, Jul 21, 2022 at 09:38:04PM +1200, Barry Song wrote:
>> On Wed, Jul 20, 2022 at 11:15 PM Peter Zijlstra <peterz@...radead.org> wrote:
>>>
>>> On Wed, Jul 20, 2022 at 04:11:50PM +0800, Yicong Yang wrote:
>>>> +     /* TODO: Support SMT system with cluster topology */
>>>> +     if (!sched_smt_active() && sd) {
>>>> +             for_each_cpu_and(cpu, cpus, sched_domain_span(sd)) {
>>>
>>> So that's no SMT and no wrap iteration..
>>>
>>> Does something like this work?
>>>
>>> ---
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -6437,6 +6437,30 @@ static int select_idle_cpu(struct task_s
>>>                 }
>>>         }
>>>
>>> +       if (IS_ENABLED(CONFIG_SCHED_CLUSTER) &&
>>> +           static_branch_unlikely(&sched_cluster_active)) {
>>> +               struct sched_domain *sdc = rcu_dereference(per_cpu(sd_cluster, target));
>>> +               if (sdc) {
>>> +                       for_each_cpu_wrap(cpu, sched_domain_span(sdc), target + 1) {
>>> +                               if (!cpumask_test_cpu(cpu, cpus))
>>> +                                       continue;
>>> +
>>> +                               if (has_idle_core) {
>>> +                                       i = select_idle_core(p, cpu, cpus, &idle_cpu);
>>> +                                       if ((unsigned int)i < nr_cpumask_bits)
>>> +                                               return i;
>>> +                               } else {
>>> +                                       if (--nr <= 0)
>>> +                                               return -1;
>>> +                                       idle_cpu = __select_idle_cpu(cpu, p);
>>> +                                       if ((unsigned int)idle_cpu < nr_cpumask_bits)
>>> +                                               break;
>>
>> Guess here it should be "return idle_cpu" rather than "break", as "break"
>> will make us scan other CPUs outside the cluster even if we have already
>> found an idle CPU within the cluster.
>>

That would explain why performance regresses when the system is underloaded.
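
To make the difference concrete, here is a minimal userspace sketch of the
two behaviours. It is a toy model, not the kernel code: the names
(scan_wrap, select_idle_sketch, NR_CPUS_SKETCH) are hypothetical, idleness
is a plain bool array, and I assume the LLC-wide pass skips CPUs the cluster
pass already visited.

```c
#include <stdbool.h>

#define NR_CPUS_SKETCH 8   /* hypothetical LLC size for this model */

/* Visit [first, first + len) once, wrapping, starting at 'start'.
 * Returns the first idle CPU found, or -1. Counts visited CPUs. */
static int scan_wrap(const bool *idle, int first, int len, int start,
                     int *scanned)
{
    for (int n = 0; n < len; n++) {
        int cpu = first + (start - first + n) % len;

        (*scanned)++;
        if (idle[cpu])
            return cpu;
    }
    return -1;
}

/* Cluster pass first, then an LLC-wide pass.
 * return_on_hit == true models "return idle_cpu";
 * return_on_hit == false models "break" (fall through to the LLC pass). */
static int select_idle_sketch(const bool *idle, int target, int cl_first,
                              int cl_len, bool return_on_hit, int *scanned)
{
    int idle_cpu = scan_wrap(idle, cl_first, cl_len, target + 1, scanned);

    if (idle_cpu >= 0 && return_on_hit)
        return idle_cpu;

    for (int n = 0; n < NR_CPUS_SKETCH; n++) {
        int cpu = (target + 1 + n) % NR_CPUS_SKETCH;

        /* Assumption: cluster CPUs were already covered above. */
        if (cpu >= cl_first && cpu < cl_first + cl_len)
            continue;
        (*scanned)++;
        if (idle[cpu])
            return cpu;
    }
    return idle_cpu;   /* fall back to the cluster hit, if any */
}
```

With CPUs 0-3 as the cluster, target = 1, and only CPUs 2 and 6 idle, the
"return" variant picks CPU 2 after visiting one CPU, while the "break"
variant visits four CPUs and ends up on CPU 6, outside the cluster.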

>> Yicong,
>> Please test Peter's code with the above change.
> 
> Indeed. Sorry for that.
> 

With that change the performance is still positive relative to the
tip/sched/core base used for this patch, commit 70fb5ccf2ebb ("sched/fair:
Introduce SIS_UTIL to search idle CPU based on sum of util_avg").

On NUMA node 0:
                           tip/core                 patched
Hmean     1        345.89 (   0.00%)      398.43 *  15.19%*
Hmean     2        697.77 (   0.00%)      794.40 *  13.85%*
Hmean     4       1392.51 (   0.00%)     1577.60 *  13.29%*
Hmean     8       2800.61 (   0.00%)     3118.38 *  11.35%*
Hmean     16      5514.27 (   0.00%)     6124.51 *  11.07%*
Hmean     32     10869.81 (   0.00%)    10690.97 *  -1.65%*
Hmean     64      8315.22 (   0.00%)     8520.73 *   2.47%*
Hmean     128     6324.47 (   0.00%)     7253.65 *  14.69%*

On NUMA nodes 0-1:
                           tip/core                 patched
Hmean     1        348.68 (   0.00%)      397.74 *  14.07%*
Hmean     2        693.57 (   0.00%)      795.54 *  14.70%*
Hmean     4       1369.26 (   0.00%)     1548.72 *  13.11%*
Hmean     8       2772.99 (   0.00%)     3055.54 *  10.19%*
Hmean     16      4825.83 (   0.00%)     5936.64 *  23.02%*
Hmean     32     10250.32 (   0.00%)    11780.59 *  14.93%*
Hmean     64     16309.51 (   0.00%)    19864.38 *  21.80%*
Hmean     128    13022.32 (   0.00%)    16365.43 *  25.67%*
Hmean     256    11335.79 (   0.00%)    13991.33 *  23.43%*
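
For reading the tables: the percentage in the patched column appears to be
the relative change over the tip/core value in the same row. A small sketch
of that arithmetic, assuming this interpretation (gain_pct is a hypothetical
helper, not part of any benchmark tool):

```c
/* Relative change of 'patched' over 'base', in percent.
 * Assumes base is non-zero. */
static double gain_pct(double base, double patched)
{
    return (patched - base) / base * 100.0;
}
```

For example, gain_pct(345.89, 398.43) is about 15.19 and
gain_pct(10869.81, 10690.97) is about -1.65, matching the first and the
32-row entries of the NUMA node 0 table.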

Hi Peter,

Do you want me to respin a v6 based on your change?

Thanks,
Yicong