[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0e4151ea-4952-8384-4877-ce149577f2ef@oracle.com>
Date: Tue, 3 Oct 2017 17:21:30 -0700
From: Rohit Jain <rohit.k.jain@...cle.com>
To: Joel Fernandes <joelaf@...gle.com>
Cc: LKML <linux-kernel@...r.kernel.org>, eas-dev@...ts.linaro.org,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Atish Patra <atish.patra@...cle.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Morten Rasmussen <morten.rasmussen@....com>
Subject: Re: [PATCH 2/3] sched/fair: Introduce scaled capacity awareness in
select_idle_sibling code path
Hi Joel,
On 10/02/2017 09:52 PM, Joel Fernandes wrote:
> Hi Rohit,
>
> On Thu, Sep 28, 2017 at 8:09 AM, Rohit Jain <rohit.k.jain@...cle.com> wrote:
> [..]
>>>> With this case, because we know from the past avg, one of the strands is
>>>> running low on capacity, I am trying to return a better strand for the
>>>> thread to start on.
>>>>
>>> I know what you're trying to do but they way you've retrofitted it into
>>> the
>>> core looks weird (to me) and makes the code unreadable and ugly IMO.
>>>
>>> Why not do something simpler like skip the core if any SMT thread has been
>>> running at lesser capacity? I'm not sure if this works great or if the
>>> maintainers
>>> will prefer your or my below approach, but I find the below diff much
>>> cleaner
>>> for the select_idle_core bit. It also makes more sense since resources are
>>> shared at SMT level so makes sense to me to skip the core altogether for
>>> this:
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 6ee7242dbe0a..f324a84e29f1 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -5738,14 +5738,17 @@ static int select_idle_core(struct task_struct *p,
>>> struct sched_domain *sd, int
>>> for_each_cpu_wrap(core, cpus, target) {
>>> bool idle = true;
>>> + bool full_cap = true;
>>> for_each_cpu(cpu, cpu_smt_mask(core)) {
>>> cpumask_clear_cpu(cpu, cpus);
>>> if (!idle_cpu(cpu))
>>> idle = false;
>>> + if (!full_capacity(cpu))
>>> + full_cap = false;
>>> }
>>> - if (idle)
>>> + if (idle && full_cap)
>>> return core;
>>> }
>>>
>>
>>
>> Well, with your changes you will skip over fully idle cores which is not
>> an ideal thing either. I see that you were advocating for select
>> idle+lowest capacity core, whereas I was stopping at the first idlecore.
>>
>> Since the whole philosophy till now in this patch is "Don't spare an
>> idle CPU", I think the following diff might look better to you. Please
>> note this is only for discussion sakes, I haven't fully tested it yet.
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index ec15e5f..c2933eb 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6040,7 +6040,9 @@ void __update_idle_core(struct rq *rq)
>> static int select_idle_core(struct task_struct *p, struct sched_domain *sd,
>> int target)
>> {
>> struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>> - int core, cpu;
>> + int core, cpu, rcpu, backup_core;
>> +
>> + rcpu = backup_core = -1;
>>
>> if (!static_branch_likely(&sched_smt_present))
>> return -1;
>> @@ -6052,15 +6054,34 @@ static int select_idle_core(struct task_struct *p,
>> struct sched_domain *sd, int
>>
>> for_each_cpu_wrap(core, cpus, target) {
>> bool idle = true;
>> + bool full_cap = true;
>>
>> for_each_cpu(cpu, cpu_smt_mask(core)) {
>> cpumask_clear_cpu(cpu, cpus);
>> if (!idle_cpu(cpu))
>> idle = false;
>> +
>> + if (!full_capacity(cpu)) {
>> + full_cap = false;
>> + }
>> }
>>
>> - if (idle)
>> + if (idle && full_cap)
>> return core;
>> + else if (idle && backup_core == -1)
>> + backup_core = core;
>> + }
>> +
>> + if (backup_core != -1) {
>> + for_each_cpu(cpu, cpu_smt_mask(backup_core)) {
>> + if (full_capacity(cpu))
>> + return cpu;
>> + else if ((rcpu == -1) ||
>> + (capacity_of(cpu) > capacity_of(rcpu)))
>> + rcpu = cpu;
>> + }
>> +
>> + return rcpu;
>> }
>>
>>
>> Do let me know what you think.
> I think that if there isn't a benefit in your tests in doing the above
> vs the simpler approach, then I prefer the simpler approach especially
> since there's no point/benefit in complicating the code for
> select_idle_core.
Fair enough!
If there are no more concerns in this version, then I will go ahead and
try out all that is discussed in this version and send an updated
version. Please let me know if there are any other concerns/feedback.
Thanks,
Rohit
>
> thanks,
>
> - Joel
Powered by blists - more mailing lists