Message-ID: <000801d1f261$d69fc740$83df55c0$@net>
Date: Tue, 9 Aug 2016 10:16:56 -0700
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Rafael J. Wysocki'" <rjw@...ysocki.net>
Cc: "'Peter Zijlstra'" <peterz@...radead.org>,
"'Srinivas Pandruvada'" <srinivas.pandruvada@...ux.intel.com>,
"'Viresh Kumar'" <viresh.kumar@...aro.org>,
"'Linux Kernel Mailing List'" <linux-kernel@...r.kernel.org>,
"'Steve Muckle'" <steve.muckle@...aro.org>,
"'Juri Lelli'" <juri.lelli@....com>,
"'Ingo Molnar'" <mingo@...nel.org>,
"'Linux PM list'" <linux-pm@...r.kernel.org>
Subject: RE: [RFC][PATCH 7/7] cpufreq: intel_pstate: Change P-state selection algorithm for Core
On 2016.08.05 17:02 Rafael J. Wysocki wrote:
> On 2016.08.03 21:19 Doug Smythies wrote:
>> On 2016.07.31 16:49 Rafael J. Wysocki wrote:
>>
>>
>>> +static inline int32_t get_target_pstate_default(struct cpudata *cpu)
>>> +{
>>> +        struct sample *sample = &cpu->sample;
>>> +        int32_t busy_frac;
>>> +        int pstate;
>>> +
>>> +        busy_frac = div_fp(sample->mperf, sample->tsc);
>>> +        sample->busy_scaled = busy_frac * 100;
>>> +
>>> +        if (busy_frac < cpu->iowait_boost)
>>> +                busy_frac = cpu->iowait_boost;
>>> +
>>> +        cpu->iowait_boost >>= 1;
>>> +
>>> +        pstate = cpu->pstate.turbo_pstate;
>>> +        return fp_toint((pstate + (pstate >> 2)) * busy_frac);
>>> +}
>>> +
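(Aside, in case it helps anyone reading along: busy_frac above is a fixed-point
fraction. From memory, and so only approximately quoted, the helpers in
intel_pstate.c are along these lines:

#define FRAC_BITS 8
#define int_tofp(X) ((int64_t)(X) << FRAC_BITS)
#define fp_toint(X) ((X) >> FRAC_BITS)

/* div_fp(x, y) returns x/y with FRAC_BITS fractional bits. */

So busy_frac is mperf/tsc with 8 fractional bits, busy_scaled is that fraction
times 100 (a fixed-point percentage), and the final fp_toint() drops the
fractional bits of the scaled pstate.)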
>>
>> The response curve is not normalized on the lower end to the minimum
>> pstate for the processor, meaning the overall response will vary
>> between processors as a function of minimum pstate.
> But that's OK IMO.
>
> Mapping busy_frac = 0 to the minimum P-state would over-provision workloads
> with small values of busy_frac.
Agreed, mapping busy_frac = 0 to the minimum pstate would be a bad thing to do.
However, that is not what I meant. I meant that the load, call it "N", at which the
mapping first rises above the minimum pstate for the processor should be the same
(within granularity) regardless of the processor.
Example: my processor, an i7-2600K: max pstate = 38; min pstate = 16.
Load at which the target first reaches pstate 17 (one above the minimum): 17 = (38 + 38/4) * load
Load = N = 35.8%
Example: something like an i5-3337U (I think; I don't actually have one):
max pstate = 27; min pstate = 8.
Load at which the target first reaches pstate 9 (one above the minimum): 9 = (27 + 27/4) * load
Load = N = 26.7%
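To make that comparison easy to repeat, here is a small throw-away user-space
program (not kernel code) that computes both knee points for a given processor.
It mirrors the arithmetic above and uses max/divisor exactly; the kernel's
integer shift (pstate >> 2) rounds down slightly, so its numbers differ by a
fraction of a percent. The divisor parameter also covers the /2 variant I test
further down.

#include <stdio.h>

/*
 * Knee points of target = (max + max/divisor) * load:
 *   N = load at which the target first exceeds the minimum pstate
 *   M = load at which the target reaches the maximum pstate
 * "max" here is cpu->pstate.turbo_pstate in the patch.
 */
static void knee_points(const char *name, int min_pstate, int max_pstate,
                        double divisor)
{
        double scale = max_pstate + max_pstate / divisor;
        double n = 100.0 * (min_pstate + 1) / scale;
        double m = 100.0 * max_pstate / scale;

        printf("%-10s /%g: N ~= %.1f%%  M ~= %.1f%%\n", name, divisor, n, m);
}

int main(void)
{
        knee_points("i7-2600K", 16, 38, 4.0);   /* N ~= 35.8%, M ~= 80.0% */
        knee_points("i5-3337U",  8, 27, 4.0);   /* N ~= 26.7%, M ~= 80.0% */
        return 0;
}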
It was a couple of years ago, so I should re-do the sensitivity analysis/testing,
but at the time I concluded that the performance / energy tradeoff was somewhat
sensitive to "N".
I am suggesting that the response curve, or transfer function,
needs to be normalized, for any processor, to:
Max pstate |               __________
           |              /
           |             /
           |            /
           |           /
           |          /
Min pstate | ________/
           |___________________________
             |       |      |         |
             0%      N      M         100%
                      CPU load
Currently M ~= 80%
At one time I did have a proposed solution along these lines [1] (not re-based since
kernel 4.3), but it was expensive in terms of extra multiplies and divides.
[1]: http://marc.info/?l=linux-pm&m=142881187323474&w=2
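Purely to illustrate the shape of that curve (this is NOT the code from [1],
just a user-space sketch with made-up knee points), a normalized mapping could
look something like:

#include <stdio.h>

#define KNEE_N  30      /* load (%) at which we leave the minimum pstate */
#define KNEE_M  80      /* load (%) at which we reach the maximum pstate */

static int target_pstate(int load_pct, int min_pstate, int max_pstate)
{
        if (load_pct <= KNEE_N)
                return min_pstate;
        if (load_pct >= KNEE_M)
                return max_pstate;

        /* Linear interpolation between the two knee points. */
        return min_pstate + (max_pstate - min_pstate) *
               (load_pct - KNEE_N) / (KNEE_M - KNEE_N);
}

int main(void)
{
        int load;

        for (load = 0; load <= 100; load += 10)
                printf("load %3d%% -> pstate %d\n",
                       load, target_pstate(load, 16, 38));
        return 0;
}

The per-processor min/max normalization is where the extra multiply and divide
per evaluation come from, which is the cost I was referring to above.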
>> The clamping at maximum pstate at about 80% load seems a little high
>> to me. Work I have done in various attempts to bring back the use of actual load
>> has always ended up achieving maximum pstate before 80% load for best results.
>> Even the get_target_pstate_cpu_load people reach the max pstate faster, and they
>> are more about energy than performance.
>> What were the criteria for the decision here? Are test results available for review
>> and/or duplication by others?
> This follows the coefficient used by the schedutil governor, but then the
> metric is different, so quite possibly a different value may work better here.
>
> We'll test other values before applying this for sure. :-)
I am now testing this change to the code (for M ~= 67%; N ~= 30% (my CPU); N ~= 22% (i5-3337U)):
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 8b2bdb7..909d441 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1313,7 +1313,7 @@ static inline int32_t get_target_pstate_default(struct cpudata *cpu)
         cpu->iowait_boost >>= 1;
 
         pstate = cpu->pstate.turbo_pstate;
-        return fp_toint((pstate + (pstate >> 2)) * busy_frac);
+        return fp_toint((pstate + (pstate >> 1)) * busy_frac);
 }
 
 static inline void intel_pstate_update_pstate(struct cpudata *cpu, int pstate)
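For what it is worth, plugging a divisor of 2 into the little knee-point sketch
earlier in this message reproduces those numbers: scale = 38 + 38/2 = 57 on the
i7-2600K, so M = 38/57 ~= 67% and N = 17/57 ~= 30%, and scale = 40.5 on the
i5-3337U, so N = 9/40.5 ~= 22%.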
>
> Several tests were done with this patch set.
...[cut]...
>> Questions:
>> Is there a migration plan?
>
> Not yet. We have quite a lot of testing to do first.
>
>> i.e. will there be an attempt to merge the current cpu_load method
>> and this method into one method?
>
> Quite possibly if the results are good enough.
>
>> Then possibly the PID controller could be eliminated.
>
> Right.
I think this change is important, and I'll help with it as best I can.
... Doug
A related CPU frequency vs. load graph will be sent to Rafael and Srinivas off-list.