Message-ID: <b3c238bc-c094-bbdf-5273-0de9e55f7a15@oracle.com>
Date: Sun, 14 Jul 2019 06:46:01 +0530
From: Subhra Mazumdar <subhra.mazumdar@...cle.com>
To: Parth Shah <parth@...ux.ibm.com>, linux-kernel@...r.kernel.org
Cc: peterz@...radead.org, mingo@...hat.com, tglx@...utronix.de,
steven.sistare@...cle.com, dhaval.giani@...cle.com,
daniel.lezcano@...aro.org, vincent.guittot@...aro.org,
viresh.kumar@...aro.org, tim.c.chen@...ux.intel.com,
mgorman@...hsingularity.net
Subject: Re: [PATCH v3 5/7] sched: SIS_CORE to disable idle core search
On 7/4/19 6:04 PM, Parth Shah wrote:
> The same experiment with hackbench, analyzed with perf, shows an increase
> in the cache miss rates with these patches
> (Lower is better)
>                           Baseline(%)   Patch(%)
>  ----------------------   -----------   -----------
>  Total Cache miss rate    17.01         19   (-11%)
>  L1 icache miss rate      5.45          6.7  (-22%)
>
> So is it possible for the idle_cpu search to try checking target_cpu first
> and then go to the sliding window if no idle CPU is found there?
> The diff below works as expected on an IBM POWER9 system and resolves the
> problem of far wakeups to a large extent.
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index ff2e9b5c3ac5..fae035ce1162 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6161,6 +6161,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>  	u64 time, cost;
>  	s64 delta;
>  	int cpu, limit, floor, target_tmp, nr = INT_MAX;
> +	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>  
>  	this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
>  	if (!this_sd)
> @@ -6198,16 +6199,22 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>  
>  	time = local_clock();
>  
> -	for_each_cpu_wrap(cpu, sched_domain_span(sd), target_tmp) {
> +	cpumask_and(cpus, sched_domain_span(sd), &p->cpus_allowed);
> +	for_each_cpu_wrap(cpu, cpu_smt_mask(target), target) {
> +		__cpumask_clear_cpu(cpu, cpus);
> +		if (available_idle_cpu(cpu))
> +			goto idle_cpu_exit;
> +	}
> +
> +	for_each_cpu_wrap(cpu, cpus, target_tmp) {
>  		per_cpu(next_cpu, target) = cpu;
>  		if (!--nr)
>  			return -1;
> -		if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
> -			continue;
>  		if (available_idle_cpu(cpu))
>  			break;
>  	}
>  
> +idle_cpu_exit:
>  	time = local_clock() - time;
>  	cost = this_sd->avg_scan_cost;
>  	delta = (s64)(time - cost) / 8;
>
> Best,
> Parth
How about calling select_idle_smt before select_idle_cpu from
select_idle_sibling? That should have the same effect.
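
For reference, a minimal sketch of that reordering in select_idle_sibling()
(written against mainline fair.c of this era, untested; select_idle_smt()'s
exact signature and the surrounding context may differ once this series is
applied):

 	i = select_idle_core(p, sd, target);
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;
 
+	/* Probe target's SMT siblings before the LLC-wide sliding window. */
+	i = select_idle_smt(p, target);
+	if ((unsigned)i < nr_cpumask_bits)
+		return i;
+
 	i = select_idle_cpu(p, sd, target);
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;
 
-	i = select_idle_smt(p, target);
-	if ((unsigned)i < nr_cpumask_bits)
-		return i;
-
 	return target;

That keeps the sibling probe out of select_idle_cpu() itself, so no extra
cpumask bookkeeping is needed there; the trade-off is that the sliding
window may re-check siblings of target that were already scanned.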