Message-ID: <b3c238bc-c094-bbdf-5273-0de9e55f7a15@oracle.com>
Date: Sun, 14 Jul 2019 06:46:01 +0530
From: Subhra Mazumdar <subhra.mazumdar@...cle.com>
To: Parth Shah <parth@...ux.ibm.com>, linux-kernel@...r.kernel.org
Cc: peterz@...radead.org, mingo@...hat.com, tglx@...utronix.de,
steven.sistare@...cle.com, dhaval.giani@...cle.com,
daniel.lezcano@...aro.org, vincent.guittot@...aro.org,
viresh.kumar@...aro.org, tim.c.chen@...ux.intel.com,
mgorman@...hsingularity.net
Subject: Re: [PATCH v3 5/7] sched: SIS_CORE to disable idle core search
On 7/4/19 6:04 PM, Parth Shah wrote:
> The same experiment with hackbench, analyzed with perf, shows an increase
> in the cache miss rates with these patches
> (Lower is better)
>                           Baseline(%)   Patch(%)
>  ----------------------   -----------   -----------
>  Total Cache miss rate    17.01         19   (-11%)
>  L1 icache miss rate      5.45          6.7  (-22%)
>
> So is it possible for the idle_cpu search to try checking target_cpu first
> and then go to the sliding window if no idle CPU is found there?
> The diff below works as expected on an IBM POWER9 system and resolves the
> problem of far wakeups to a large extent.
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index ff2e9b5c3ac5..fae035ce1162 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6161,6 +6161,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>  	u64 time, cost;
>  	s64 delta;
>  	int cpu, limit, floor, target_tmp, nr = INT_MAX;
> +	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>  
>  	this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
>  	if (!this_sd)
> @@ -6198,16 +6199,22 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>  
>  	time = local_clock();
>  
> -	for_each_cpu_wrap(cpu, sched_domain_span(sd), target_tmp) {
> +	cpumask_and(cpus, sched_domain_span(sd), &p->cpus_allowed);
> +	for_each_cpu_wrap(cpu, cpu_smt_mask(target), target) {
> +		__cpumask_clear_cpu(cpu, cpus);
> +		if (available_idle_cpu(cpu))
> +			goto idle_cpu_exit;
> +	}
> +
> +	for_each_cpu_wrap(cpu, cpus, target_tmp) {
>  		per_cpu(next_cpu, target) = cpu;
>  		if (!--nr)
>  			return -1;
> -		if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
> -			continue;
>  		if (available_idle_cpu(cpu))
>  			break;
>  	}
>  
> +idle_cpu_exit:
>  	time = local_clock() - time;
>  	cost = this_sd->avg_scan_cost;
>  	delta = (s64)(time - cost) / 8;
>
> Best,
> Parth
How about calling select_idle_smt before select_idle_cpu from
select_idle_sibling? That should have the same effect.
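
For reference, a minimal sketch of that reordering in select_idle_sibling()
(written against mainline fair.c of this era, untested; select_idle_smt()'s
exact signature and the surrounding context may differ once this series is
applied):

 	i = select_idle_core(p, sd, target);
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;
 
+	/* Probe target's SMT siblings before the LLC-wide sliding window. */
+	i = select_idle_smt(p, target);
+	if ((unsigned)i < nr_cpumask_bits)
+		return i;
+
 	i = select_idle_cpu(p, sd, target);
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;
 
-	i = select_idle_smt(p, target);
-	if ((unsigned)i < nr_cpumask_bits)
-		return i;
-
 	return target;

That keeps the sibling probe out of select_idle_cpu() itself, so no extra
cpumask bookkeeping is needed there; the trade-off is that the sliding
window may re-check siblings of target that were already scanned.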