Message-ID: <CAKfTPtCD9CXBhO5UivqYcLvFHGYS6+8_S6BkDsMG-o7kGirsFQ@mail.gmail.com>
Date: Tue, 24 Feb 2015 13:18:24 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Morten Rasmussen <morten.rasmussen@....com>
Cc: Peter Zijlstra <peterz@...radead.org>,
"mingo@...nel.org" <mingo@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"preeti@...ux.vnet.ibm.com" <preeti@...ux.vnet.ibm.com>,
"kamalesh@...ux.vnet.ibm.com" <kamalesh@...ux.vnet.ibm.com>,
"riel@...hat.com" <riel@...hat.com>,
"efault@....de" <efault@....de>,
"nicolas.pitre@...aro.org" <nicolas.pitre@...aro.org>,
Dietmar Eggemann <Dietmar.Eggemann@....com>,
"linaro-kernel@...ts.linaro.org" <linaro-kernel@...ts.linaro.org>
Subject: Re: [PATCH RESEND v9 00/10] sched: consolidation of CPU capacity and usage
On 24 February 2015 at 12:29, Morten Rasmussen <morten.rasmussen@....com> wrote:
> On Tue, Feb 24, 2015 at 10:38:29AM +0000, Vincent Guittot wrote:
>> On 23 February 2015 at 16:45, Morten Rasmussen <morten.rasmussen@....com> wrote:
>> > On Fri, Feb 20, 2015 at 02:54:09PM +0000, Vincent Guittot wrote:
>> >> On 20 February 2015 at 15:35, Morten Rasmussen <morten.rasmussen@....com> wrote:
>> >> > On Fri, Feb 20, 2015 at 02:13:21PM +0000, Vincent Guittot wrote:
>> >> >> On 20 February 2015 at 12:52, Morten Rasmussen <morten.rasmussen@....com> wrote:
>> >> >> > On Fri, Feb 20, 2015 at 11:34:47AM +0000, Peter Zijlstra wrote:
>> >> >> >> On Thu, Feb 19, 2015 at 12:49:40PM +0000, Morten Rasmussen wrote:
>> >> >> >>
>> >> >> >> > Also, it is still not clear why patch 10 uses relative capacity reduction
>> >> >> >> > instead of absolute capacity available to CFS tasks.
>> >> >> >>
>> >> >> >> As present in your asymmetric big and small systems? Yes it would be
>> >> >> >> unfortunate to migrate a task to an idle small core when the big core is
>> >> >> >> still faster, even if reduced by rt/irq work.
>> >> >> >
>> >> >> > Yes, exactly. I don't think it would cause any harm for symmetric cases
>> >> >> > to use absolute capacity instead. Am I missing something?
>> >> >>
>> >> >> If absolute capacity is used, we will trigger an active load balance from
>> >> >> little to big core each time a little core has 1 task and a big core is
>> >> >> idle, whereas we only want to trigger an active migration if the src_cpu's
>> >> >> capacity that is available for the cfs task is significantly reduced
>> >> >> by rt tasks.
>> >> >>
>> >> >> I can mix absolute and relative tests by first testing that the capacity
>> >> >> of the src is reduced and then ensuring that the dst_cpu has more
>> >> >> absolute capacity than src_cpu.
>> >> >
>> >> > If we use absolute capacity and check if the source cpu is fully
>> >> > utilized, wouldn't that work? We want to migrate the task if it is
>> >>
>> >> we want to trigger the migration before the cpu is fully utilized by
>> >> rt/irq (which almost never occurs)
>> >
>> > I meant fully utilized by rt/irq and cfs tasks, sorry. Essentially,
>> > get_cpu_usage() ~= capacity_of(). If get_cpu_usage() is significantly
>> > smaller than capacity_of(), which may be reduced by rt/irq
>> > utilization, there are still spare cycles and it is not strictly
>> > required to migrate tasks away using active LB. But, tasks would be
>> > moved away if the tasks are being allowed less cpu time due to rt/irq
>> > (get_cpu_usage() >= capacity_of()). Wouldn't that work? Or, do you want
>> > to migrate tasks regardless of whether there are still spare cycles
>> > available on the cpu doing rt/irq work?
>>
>> In fact, we can see a perf improvement even if the cpu is not fully used
>> by threads and interrupts, because the task gets significantly
>> preempted by interrupts.
>
> Unless the tasks are the consumers of those interrupts, then it would
> harm performance to migrate them away :) I get your point though. Could
> we have a short comment stating the intentions so we don't forget in a
> couple of months?
I will add more details in the commit log
>
>>
>> >
>> > The advantage of comparing get_cpu_usage() with capacity_of() is that it
>> > would work for migrating cpu-intensive tasks away from little cpu on
>> > big.LITTLE as well. Then we don't need another almost identical check
>> > for that purpose :)
>>
>> I understand your point but the patch becomes inefficient for part of
>> the issue that it was originally trying to solve if we compare
>> get_cpu_usage() with capacity_of(). So we will probably need to add a few
>> more tests for the issue you point out above.
>
> Right. If your goal is to avoid preemptions and not just make sure that
> cpus aren't fully utilized then my proposal isn't sufficient. We will
> have to add another condition to solve the big.LITTLE capacity thing
> later. In fact we already have that somewhere deep down in the pile of
> patches I posted some weeks ago.
>
>> >> > currently being restricted by the available capacity (due to rt/irq
>> >> > work, being a little cpu, or both) and if there is a destination cpu
>> >> > with more absolute capacity available. No?
>> >>
>> >> yes, so the relative capacity (cpu_capacity vs cpu_capacity_orig)
>> >> enables us to know if the cpu is significantly used by irq/rt, so it's
>> >> worth doing an active load balance of the task. Then the absolute
>> >> comparison of cpu_capacity of src_cpu vs cpu_capacity of dst_cpu
>> >> checks that the dst_cpu is a better choice
>> >>
>> >> something like :
>> >> if (check_cpu_capacity(src_rq, sd) &&
>> >>     (capacity_of(src_cpu) * sd->imbalance_pct < capacity_of(dst_cpu) * 100))
>> >>         return 1;
>> >
>> > It should solve the big.LITTLE issue. Though I would prefer the
>> > get_cpu_usage() ~= capacity_of() approach as it could even improve
>> > performance on big.LITTLE.
>>
>> ok. IMHO, it's worth having a dedicated patch for this issue
>
> Fine by me as long as we get the extra check you proposed above to fix
> the big.LITTLE issue.
ok
>
> Morten
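
To make the combined check discussed above concrete, here is a minimal
userspace sketch of it. This is not the patch itself: struct cpu_model
and the helper names are hypothetical, the relative test is only an
assumption about what check_cpu_capacity() does (cpu_capacity compared
with cpu_capacity_orig, scaled by imbalance_pct), and the capacities
used in the note below (430, 600, 1024) and imbalance_pct = 125 are
example values, not measurements.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified per-CPU model for illustration only.
 * It is NOT the kernel's struct rq.
 */
struct cpu_model {
	unsigned long capacity_orig;	/* capacity with no rt/irq pressure */
	unsigned long capacity;		/* capacity left for cfs tasks */
};

/*
 * Relative test (assumed behaviour of check_cpu_capacity()):
 * true when rt/irq activity has significantly reduced the capacity
 * available to cfs tasks compared with the CPU's original capacity.
 */
static inline bool capacity_reduced(const struct cpu_model *cpu,
				    unsigned int imbalance_pct)
{
	return cpu->capacity * imbalance_pct < cpu->capacity_orig * 100;
}

/*
 * Absolute test: true when dst offers significantly more capacity
 * to cfs tasks than src does.
 */
static inline bool dst_has_more_capacity(const struct cpu_model *src,
					 const struct cpu_model *dst,
					 unsigned int imbalance_pct)
{
	return src->capacity * imbalance_pct < dst->capacity * 100;
}

/* Combined test: src is under rt/irq pressure AND dst is a better choice. */
static inline bool want_active_balance(const struct cpu_model *src,
				       const struct cpu_model *dst,
				       unsigned int imbalance_pct)
{
	return capacity_reduced(src, imbalance_pct) &&
	       dst_has_more_capacity(src, dst, imbalance_pct);
}
```

With these example numbers, a little cpu (430/430) next to an idle big
cpu (1024/1024) does not trigger an active balance, because the little
cpu's capacity is not reduced. A big cpu reduced to 600 by rt/irq does
not push its task to an idle little cpu (430), but it does push it to
an idle big cpu (1024).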