linux-kernel - Re: sched: ARM: arch_scale_freq


Message-ID: <1318319852.14400.65.camel@laptop>
Date:	Tue, 11 Oct 2011 09:57:32 +0200
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Amit Kucheria <amit.kucheria@...aro.org>
Cc:	Vincent Guittot <vincent.guittot@...aro.org>,
	linux-kernel@...r.kernel.org,
	LAK <linux-arm-kernel@...ts.infradead.org>,
	linaro-dev@...ts.linaro.org
Subject: Re: sched: ARM: arch_scale_freq_power

On Tue, 2011-10-11 at 12:46 +0530, Amit Kucheria wrote:
> Adding Peter to the discussion..

Right, CCing the folks who actually wrote the code you're asking
questions about always helps ;-)

> On Thu, Oct 6, 2011 at 5:06 PM, Vincent Guittot
> <vincent.guittot@...aro.org> wrote:
> > I work to link the cpu_power of ARM cores to their frequency by using
> > arch_scale_freq_power. 

Why and how? In particular note that if you're using something like the
on-demand cpufreq governor this isn't going to work.

> > It's explained in the kernel that cpu_power is
> > used to distribute load on cpus and a cpu with more cpu_power will
> > pick up more load. The default value is SCHED_POWER_SCALE and I
> > increase the value if I want a cpu to have more load than another one.
> > Is there an advised range for cpu_power value as well as some time
> > scale constraints for updating the cpu_power value ?

Basically 1024 is the unit and denotes the capacity of a full core at
'normal' speed. 

Typically cpufreq would down-clock a core and thus you'd end up with a
smaller number (linearly proportional to the freq ratio etc. although if
you want to go really fancy you could determine the actual
throughput/freq curves).

Things like x86 turbo mode would result in a >1024 value.
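
As a minimal sketch of the linear case (not existing kernel code;
freq_to_cpu_power() and its kHz parameters are names made up for
illustration), scaling by the ratio of current to 'normal' frequency
would look something like:

```c
#include <assert.h>

/* 1024 is the unit: the capacity of a full core at 'normal' speed. */
#define SCHED_POWER_SCALE 1024UL

/*
 * Hypothetical helper, not a kernel function: scale cpu_power linearly
 * with the ratio of the current frequency to the nominal one.
 */
static unsigned long freq_to_cpu_power(unsigned long cur_khz,
				       unsigned long nominal_khz)
{
	assert(nominal_khz != 0);

	/* Widen the intermediate so 32-bit unsigned long cannot overflow. */
	return (unsigned long)((unsigned long long)SCHED_POWER_SCALE *
			       cur_khz / nominal_khz);
}
```

With a 1 GHz nominal clock, a core down-clocked to 500 MHz gets 512, and
a core boosted to 1.2 GHz gets 1228, i.e. the >1024 turbo case.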

Things like SMT would typically result in <1024 and the SMT sum over the
core >1024 (if you're lucky).
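
A rough sketch of the SMT arithmetic, assuming the whole core delivers
about 15% more than a single full core when all siblings are busy. The
SMT_GAIN value and smt_thread_power() are assumptions for illustration
only; real gains depend on the hardware and the workload:

```c
#include <assert.h>

#define SCHED_POWER_SCALE 1024UL

/* Assumed throughput of the whole SMT core: roughly 1.15 * 1024. */
#define SMT_GAIN 1178UL

/*
 * Hypothetical helper: split the assumed core throughput evenly across
 * the hardware threads that share the core.
 */
static unsigned long smt_thread_power(unsigned long nr_siblings)
{
	assert(nr_siblings > 0);

	return SMT_GAIN / nr_siblings;
}
```

For two siblings each thread gets 589 (<1024), while the sum over the
core is 1178 (>1024).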

> > I'm also wondering why this scheduler feature is currently disabled by default?

Because the only implementation in existence (x86) is broken and I
haven't gotten around to fixing it. Arguably we should disable that for
the time being, see below.
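
To make the aperf/mperf ratio concrete, here is a simplified sketch of
turning two counter samples into a cpu_power value. This is not the
actual x86 code; struct perf_sample and aperfmperf_power() are
hypothetical names, and no idle-time filtering is attempted:

```c
#include <assert.h>
#include <stdint.h>

#define SCHED_POWER_SCALE 1024ULL

/* Hypothetical snapshot of the APERF and MPERF counters. */
struct perf_sample {
	uint64_t aperf;	/* cycles at the actual frequency */
	uint64_t mperf;	/* cycles at the reference frequency */
};

/*
 * Scale 1024 by delta(aperf) / delta(mperf) between two samples.
 * The result tracks what the cpu actually did over the interval,
 * which is not the same thing as what it could have done.
 */
static unsigned long aperfmperf_power(const struct perf_sample *old,
				      const struct perf_sample *now)
{
	uint64_t da, dm;

	assert(old && now);

	da = now->aperf - old->aperf;
	dm = now->mperf - old->mperf;

	/* No reference cycles elapsed: fall back to the default unit. */
	if (dm == 0)
		return (unsigned long)SCHED_POWER_SCALE;

	return (unsigned long)(SCHED_POWER_SCALE * da / dm);
}
```

A ratio of 1:2 over the sample interval yields 512, so whatever lowers
the measured ratio lowers the reported capacity as well.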

> In discussions with Vincent regarding this, I've wondered whether
> cpu_power wouldn't be better renamed to cpu_capacity since that is
> what it really seems to describe.

Possibly, but it's been cpu_power for ages and we use capacity to
describe something else.

---
 arch/x86/kernel/cpu/sched.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/sched.c b/arch/x86/kernel/cpu/sched.c
index a640ae5..90ae68c 100644
--- a/arch/x86/kernel/cpu/sched.c
+++ b/arch/x86/kernel/cpu/sched.c
@@ -6,7 +6,14 @@
 #include <asm/cpufeature.h>
 #include <asm/processor.h>
 
-#ifdef CONFIG_SMP
+#if 0 /* def CONFIG_SMP */
+
+/*
+ * Currently broken, we need to filter out idle time because the aperf/mperf
+ * ratio measures actual throughput, not capacity. This means that if a logical
+ * cpu idles it will report less capacity and receive less work, which isn't
+ * what we want.
+ */
 
 static DEFINE_PER_CPU(struct aperfmperf, old_perf_sched);
 

