linux-kernel - Re: [PATCH v3 16/22] sched: add power aware scheduling in fork/exec/wake


Message-ID: <87fw20l5sr.fsf@sejong.aot.lge.com>
Date:	Thu, 17 Jan 2013 14:47:16 +0900
From:	Namhyung Kim <namhyung@...nel.org>
To:	Morten Rasmussen <morten.rasmussen@....com>
Cc:	Alex Shi <alex.shi@...el.com>,
	"mingo\@redhat.com" <mingo@...hat.com>,
	"peterz\@infradead.org" <peterz@...radead.org>,
	"tglx\@linutronix.de" <tglx@...utronix.de>,
	"akpm\@linux-foundation.org" <akpm@...ux-foundation.org>,
	"arjan\@linux.intel.com" <arjan@...ux.intel.com>,
	"bp\@alien8.de" <bp@...en8.de>, "pjt\@google.com" <pjt@...gle.com>,
	"efault\@gmx.de" <efault@....de>,
	"vincent.guittot\@linaro.org" <vincent.guittot@...aro.org>,
	"gregkh\@linuxfoundation.org" <gregkh@...uxfoundation.org>,
	"preeti\@linux.vnet.ibm.com" <preeti@...ux.vnet.ibm.com>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 16/22] sched: add power aware scheduling in fork/exec/wake

On Wed, 16 Jan 2013 14:27:30 +0000, Morten Rasmussen wrote:
> On Wed, Jan 16, 2013 at 06:02:21AM +0000, Alex Shi wrote:
>> On 01/15/2013 12:09 AM, Morten Rasmussen wrote:
>> > On Fri, Jan 11, 2013 at 07:08:45AM +0000, Alex Shi wrote:
>> >> On 01/10/2013 11:01 PM, Morten Rasmussen wrote:
>> >> For power consideration scenario, it ask task number less than Lcpu
>> >> number, don't care the load weight, since whatever the load weight, the
>> >> task only can burn one LCPU.
>> >>
>> > 
>> > True, but you miss the opportunities for power saving when you have many
>> > light tasks (> LCPU). Currently, the sd_utils < threshold check will go
>> > for SCHED_POLICY_PERFORMANCE if the number tasks (sd_utils) is greater
>> > than the domain weight/capacity irrespective of the actual load caused
>> > by those tasks.
>> > 
>> > If you used tracked task load weight for sd_utils instead you would be
>> > able to go for power saving in scenarios with many light tasks as well.
>> 
>> yes, that's right on power consideration. but for performance consider,
>> it's better to spread tasks on different LCPU to save CS cost. And if
>> the cpu usage is nearly full, we don't know if some tasks real want more
>> cpu time.
>
> If the cpu is nearly full according to its tracked load it should not be
> used for packing more tasks. It is the nearly idle scenario that I am
> more interested in. If you have lots of task with tracked load <10% then
> why not pack them. The performance impact should be minimal.
>
> Furthermore, nr_running is just a snapshot of the current runqueue
> status. The combination of runnable and blocked load should give a
> better overall view of the cpu loads.

I have a feeling that a power aware scheduling policy should deal only
with utilization.  Of course that only works under a certain threshold,
and once it's exceeded we must switch to another policy which cares
about the load weight/average.  Just throwing out an idea. :)
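
To make the idea concrete, here is a rough sketch of such a switch.  It
is not code from the patch: effective_policy() and its parameters are
hypothetical names, and only the SCHED_POLICY_* constants and the
"sd_utils < threshold" check come from the discussion above.

```c
#include <assert.h>

/* Only the two policy names quoted in this thread are modelled here. */
enum sched_policy {
	SCHED_POLICY_PERFORMANCE,
	SCHED_POLICY_POWERSAVING,
};

/*
 * Hypothetical helper: keep the requested power policy while the
 * utilization of the domain stays under the threshold, otherwise fall
 * back to the policy that balances on load weight/average.
 */
static enum sched_policy effective_policy(enum sched_policy requested,
					   unsigned int sd_utils,
					   unsigned int threshold)
{
	assert(threshold > 0);	/* a zero threshold would never allow packing */

	if (requested == SCHED_POLICY_POWERSAVING && sd_utils < threshold)
		return SCHED_POLICY_POWERSAVING;

	return SCHED_POLICY_PERFORMANCE;
}
```

With a threshold of 2, one running task keeps the powersaving policy,
while two or more tasks fall back to the performance policy.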

>
>> Even in the power sched policy, we still want to get better performance
>> if it's possible. :)
>
> I agree if it comes for free in terms of power. In my opinion it is
> acceptable to sacrifice a bit of performance to save power when using a
> power sched policy as long as the performance regression can be
> justified by the power savings. It will of course depend on the system
> and its usage how trade-off power and performance. My point is just that
> with multiple sched policies (performance, balance and power as you
> propose) it should be acceptable to focus on power for the power policy
> and let users that only/mostly care about performance use the balance or
> performance policy.

Agreed.

>
>> > 
>> >>>> +
>> >>>> +		if (sched_policy == SCHED_POLICY_POWERSAVING)
>> >>>> +			threshold = sgs.group_weight;
>> >>>> +		else
>> >>>> +			threshold = sgs.group_capacity;
>> >>>
>> >>> Is group_capacity larger or smaller than group_weight on your platform?
>> >>
>> >> Guess most of your confusing come from the capacity != weight here.
>> >>
>> >> In most of Intel CPU, a cpu core's power(with 2 HT) is usually 1178, it
>> >> just bigger than a normal cpu power - 1024. but the capacity is still 1,
>> >> while the group weight is 2.
>> >>
>> > 
>> > Thanks for clarifying. To the best of my knowledge there are no
>> > guidelines for how to specify cpu power so it may be a bit dangerous to
>> > assume that capacity < weight when capacity is based on cpu power.
>> 
>> Sure. I also just got them from code. and don't know other arch how to
>> different them.
>> but currently, seems this cpu power concept works fine.
>
> Yes, it seems to work fine for your test platform. I just want to
> highlight that the assumption you make might not be valid for other
> architectures. I know that cpu power is not widely used, but that may
> change with the increasing focus on power aware scheduling.

AFAIK on ARM big.LITTLE, a big cpu will have a cpu power of more than
1024.  I'm sure Morten knows way more than I do about this. :)
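
A small sketch of why that matters for the threshold above.  It assumes
capacity is the group's cpu power divided by 1024 and rounded to the
nearest integer, which is how I read the current load balancer code.
group_capacity_of() and pick_threshold() are made-up names, and the
1536 used below for a big cpu is a hypothetical value, not a measured
one.

```c
#include <assert.h>

#define SCHED_POWER_SCALE 1024UL	/* cpu power of a "normal" cpu */

/* Assumed derivation: group power / 1024, rounded to the closest integer. */
static unsigned long group_capacity_of(unsigned long group_power)
{
	return (group_power + SCHED_POWER_SCALE / 2) / SCHED_POWER_SCALE;
}

/*
 * Mirrors the threshold selection quoted from the patch: the powersaving
 * policy uses the number of cpus in the group (weight), the other
 * policies use the capacity derived from cpu power.
 */
static unsigned long pick_threshold(int powersaving,
				    unsigned long group_weight,
				    unsigned long group_power)
{
	assert(group_weight > 0);	/* a group always has at least one cpu */

	return powersaving ? group_weight : group_capacity_of(group_power);
}
```

For the Intel core with 2 HT mentioned above (power 1178, weight 2) this
gives capacity 1, so capacity < weight.  For a single cpu with a
hypothetical power of 1536 it gives capacity 2 with weight 1, so the
relation is reversed.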

Thanks,
Namhyung