Message-Id: <7858EF87-0E5A-49FB-994E-16DF0727C625@antoniou-consulting.com>
Date:	Fri, 18 May 2012 19:46:09 +0300
From:	Pantelis Antoniou <panto@...oniou-consulting.com>
To:	Morten Rasmussen <Morten.Rasmussen@....com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	"smuckle@...cinc.com" <smuckle@...cinc.com>,
	Juri Lelli <juri.lelli@...il.com>,
	"mingo@...e.hu" <mingo@...e.hu>,
	"linaro-sched-sig@...ts.linaro.org" 
	<linaro-sched-sig@...ts.linaro.org>,
	"rostedt@...dmis.org" <rostedt@...dmis.org>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	"geoff@...radead.org" <geoff@...radead.org>,
	"efault@....de" <efault@....de>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: Plumbers: Tweaking scheduler policy micro-conf RFP


On May 18, 2012, at 7:24 PM, Morten Rasmussen wrote:

> On Fri, May 18, 2012 at 05:18:17PM +0100, Morten Rasmussen wrote:
>> On Tue, May 15, 2012 at 04:35:41PM +0100, Peter Zijlstra wrote:
>>> On Tue, 2012-05-15 at 17:05 +0200, Vincent Guittot wrote:
>>>> On 15 May 2012 15:00, Peter Zijlstra <peterz@...radead.org> wrote:
>>>>> On Tue, 2012-05-15 at 14:57 +0200, Vincent Guittot wrote:
>>>>>> 
>>>>>> Not sure that nobody cares; it's much more that the scheduler,
>>>>>> load_balance and sched_mc are sensitive enough that it's difficult to
>>>>>> ensure that a modification will not break everything for someone
>>>>>> else.
>>>>> 
>>>>> Thing is, it's already broken, there's nothing else to break :-)
>>>>> 
>>>> 
>>>> sched_mc is the only power-aware knob in the current scheduler. It's
>>>> far from being perfect but it seems to work on some ARM platforms at
>>>> least. You mentioned at the scheduler mini-summit that we need a
>>>> cleaner replacement and everybody has agreed on that point. Is anybody
>>>> working on it yet?
>>> 
>>> Apparently not.. 
>>> 
>>>> And can we discuss at Plumbers what this replacement would look like?
>>> 
>>> one knob: sched_balance_policy with tri-state {performance, power, auto}
>> 
>> Interesting. What would the power policy look like? Would performance
>> and power be the two extremes of the power/performance trade-off? In
>> that case I would assume that most embedded systems would be using auto.
>> 
>>> 
>>> Where auto should likely look at things like are we on battery and
>>> co-ordinate with cpufreq muck or whatever.
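The single tri-state knob described above could be modelled roughly as follows. This is only a sketch: the enum, the function name and the rule that "auto" follows the battery state are assumptions for illustration, not existing kernel interfaces, and the cpufreq coordination is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state knob; these names are illustrative, not kernel API. */
enum sched_balance_policy {
	SCHED_POLICY_PERFORMANCE,
	SCHED_POLICY_POWER,
	SCHED_POLICY_AUTO,
};

/*
 * Resolve the knob to one of the two concrete behaviours.
 * Assumption: "auto" picks power on battery and performance otherwise.
 */
static enum sched_balance_policy
effective_policy(enum sched_balance_policy knob, bool on_battery)
{
	assert(knob == SCHED_POLICY_PERFORMANCE ||
	       knob == SCHED_POLICY_POWER ||
	       knob == SCHED_POLICY_AUTO);

	if (knob != SCHED_POLICY_AUTO)
		return knob;
	return on_battery ? SCHED_POLICY_POWER : SCHED_POLICY_PERFORMANCE;
}
```

The point of the design is that the administrator sees three choices in total, and everything else is derived from them.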
>>> 
>>> Per domain knobs are insane, large multi-state knobs are insane, the
>>> existing scheme is therefore insane^2. Can you find a sysad who'd like
>>> to explore 3^3=27 states for optimal power/perf for his workload on a
>>> simple 2 socket hyper-threaded machine and 3^4=81 state space for 8
>>> sockets etc..?
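The state-space figures quoted above follow from giving each domain level its own three-valued knob. A small sketch of the arithmetic, where the function name is made up for illustration:

```c
#include <assert.h>

/*
 * Number of distinct settings when each of `levels` domain levels
 * carries its own knob with `states` possible values: states^levels.
 */
static unsigned long knob_state_space(unsigned int states, unsigned int levels)
{
	unsigned long total = 1;

	assert(states > 0);
	while (levels--)
		total *= states;
	return total;
}
```

With three levels this gives 27 combinations and with four levels 81, against 3 for a single global tri-state knob.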
>>> 
>>> As to the exact policy, I think the current 2 (load-balance + wakeup) is
>>> the sensible one..
>>> 
>>> Also, I still have this pending email from you asking about the topology
>>> setup stuff I really need to reply to.. but people keep sending me bug
>>> reports :/
>>> 
>>> But really short, look at kernel/sched/core.c:default_topology[]
>>> 
>>> I'd like to get rid of sd_init_* into a single function like
>>> sd_numa_init(), this would mean all archs would need to do is provide a
>>> simple list of ever increasing masks that match their topology.
>>> 
>>> To aid this we can add some SDTL_flags, initially I was thinking of:
>>> 
>>> SDTL_SHARE_CORE	-- aka SMT
>>> SDTL_SHARE_CACHE	-- LLC cache domain (typically multi-core)
>>> SDTL_SHARE_MEMORY	-- NUMA-node (typically socket)
>>> 
>>> The 'performance' policy is typically to spread over shared resources so
>>> as to minimize contention on these.
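As a rough illustration of the "list of ever increasing masks" idea, an architecture might describe its topology like this. The SDTL_* names come from the proposal above; the bit values, the struct layout, the helper and the example machine (2 sockets x 2 cores x 2 threads, seen from CPU 0) are assumptions made up for this sketch, and "ever increasing" is read here as each mask being a strict superset of the previous one.

```c
#include <assert.h>
#include <stddef.h>

/* Flag names are from the proposal; the bit values are assumed. */
#define SDTL_SHARE_CORE   0x01 /* SMT siblings */
#define SDTL_SHARE_CACHE  0x02 /* LLC domain */
#define SDTL_SHARE_MEMORY 0x04 /* NUMA node */

/* Hypothetical level descriptor: a CPU mask plus what those CPUs share. */
struct topo_level {
	unsigned long cpu_mask; /* one bit per CPU */
	unsigned int flags;
};

/* Return 1 if every level is a strict superset of the level below it. */
static int topo_is_increasing(const struct topo_level *lv, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++) {
		unsigned long prev = lv[i - 1].cpu_mask;
		unsigned long cur = lv[i].cpu_mask;

		if ((prev & cur) != prev || prev == cur)
			return 0;
	}
	return 1;
}

/* Example machine: the socket is both the LLC domain and the NUMA node. */
static const struct topo_level example_topology[] = {
	{ 0x03, SDTL_SHARE_CORE },
	{ 0x0f, SDTL_SHARE_CACHE | SDTL_SHARE_MEMORY },
	{ 0xff, 0 },
};
```

A generic init function could then walk such a list and derive the per-level behaviour from the flags rather than from per-level sd_init_* variants.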
>>> 
>> 
>> Would it be worth extending this architecture specification to contain
>> more information like CPU_POWER for each core? After having experimented
>> a bit with scheduling on big.LITTLE my experience is that more
>> information about the platform is needed to make proper scheduling
>> decisions. So if the topology definition is going to be more generic and
>> be set up by the architecture it could be worth adding all the bits of
>> information that the scheduler would need to that data structure.
>> 
>> With such a data structure, the scheduler would only need one knob to
>> adjust the power/performance trade-off. Any thoughts?
>> 
> 
> One more thing. I have experimented with PJT's load-tracking patchset
> and found it very useful for big.LITTLE scheduling. Are there any plans
> for including it?
> 
> 	Morten
> 

One more vote for speedy integration of PJT's patches. They are working fine
as far as I can tell, and they are absolutely needed for the power-aware
scheduler work.

-- Pantelis

>>> If you want to add some power we need some extra flags, maybe something
>>> like:
>>> 
>>> SDTL_SHARE_POWERLINE	-- power domain (typically socket)
>>> 
>>> so you know where the boundaries are where you can turn stuff off so you
>>> know what/where to pack bits.
>>> 
>>> Possibly we also add something like:
>>> 
>>> SDTL_PERF_SPREAD	-- spread on performance mode
>>> SDTL_POWER_PACK	-- pack on power mode
>>> 
>>> To over-ride the defaults. But ideally I'd leave those until after we've
>>> got the basics working and there is a clear need for them (with a
>>> spread/pack default for perf/power aware).
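One possible reading of the override flags above, sketched in C. The flag names are from the proposal; the bit values, the SDTL_DEFAULT combination and the interpretation (a level spreads in performance mode only if SDTL_PERF_SPREAD is set, and packs in power mode only if SDTL_POWER_PACK is set) are assumptions, not something the thread settles.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names are from the proposal; the bit values are assumed. */
#define SDTL_PERF_SPREAD 0x10 /* spread on performance mode */
#define SDTL_POWER_PACK  0x20 /* pack on power mode */

/* Assumed default: spread for performance, pack for power. */
#define SDTL_DEFAULT (SDTL_PERF_SPREAD | SDTL_POWER_PACK)

enum balance_mode { MODE_PERFORMANCE, MODE_POWER };

/* Return true if tasks should be spread over this level, false to pack. */
static bool level_should_spread(unsigned int flags, enum balance_mode mode)
{
	assert(mode == MODE_PERFORMANCE || mode == MODE_POWER);

	if (mode == MODE_PERFORMANCE)
		return (flags & SDTL_PERF_SPREAD) != 0;
	return (flags & SDTL_POWER_PACK) == 0;
}
```

Under this reading a level that clears SDTL_POWER_PACK keeps spreading even in power mode, which is the kind of per-level exception the flags would exist for.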
>> 
>> In my experience power-optimized scheduling is quite tricky, especially
>> if you still want some level of performance. For heterogeneous
>> architectures, packing might not be the best solution. Some indication of
>> the power/performance profile of each core could be useful.
>> 
>> Best regards,
>> Morten
>> 
> 
