Message-ID: <002601d17005$e6369820$b2a3c860$@net>
Date: Thu, 25 Feb 2016 11:51:18 -0800
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Stephane Gasparini'" <stephane.gasparini@...ux.intel.com>
Cc: "'Mel Gorman'" <mgorman@...hsingularity.net>,
"'Rafael Wysocki'" <rjw@...ysocki.net>,
"'Ingo Molnar'" <mingo@...nel.org>,
"'Peter Zijlstra'" <peterz@...radead.org>,
"'Matt Fleming'" <matt@...eblueprint.co.uk>,
"'Mike Galbraith'" <umgwanakikbuti@...il.com>,
"'Linux-PM'" <linux-pm@...r.kernel.org>,
"'LKML'" <linux-kernel@...r.kernel.org>,
"'Srinivas Pandruvada'" <srinivas.pandruvada@...ux.intel.com>
Subject: RE: [PATCH 1/1] intel_pstate: Increase hold-off time before busyness is scaled
Hi Steph,
On 2016.02.24 08:20 Stephane Gasparini wrote:
>> On Feb 19, 2016, at 5:38 PM, Doug Smythies <dsmythies@...us.net> wrote:
>>> On 2016.02.19 03:12 Stephane Gasparini wrote:
>>>
>>> The issue you are reporting looks like one we improved on android by using
>>> the average pstate instead of using the last requested pstate
>>>
>>> We know that this is improving the ffmpeg encoding performance when using the
>>> load algorithm.
>>>
>>> see patch attached
>>>
>>> This patch is only applied on get_target_pstate_use_cpu_load however you can give
>>> it a try on get_target_pstate_use_performance
>>
>> Yes, that type of patch works on the load based approach.
>
> I’m not talking about using average p-state in the scaled_busy computation.
> I’m talking about adding the output of the PID (the number of pstates to add or subtract)
> to the average pstate rather than adding it to the current p-state.
For the situation we are dealing with here, that would actually make it worse,
wouldn't it?
Let's work through a real, very low load example from the Mel V2 patch, where
the target pstate is increased when it should have been decreased:
Mel patch version 2 (12X hold-off, added to the rjw 3-patch v10 set, on kernel 4.5-rc4):
CPU: 3
Core busy: 105
Scaled busy: 143
Old pstate: 25
New pstate: 34
mperf: 52039
aperf: 55097
tsc: 335265689
freq: 3599750 kHz
Load: 0.02%
Duration (ms): 98.293
New pstate = old pstate + (scaled_busy-setpoint) * p_gain
= 25 + (143 - 97) * 0.2
= 34 (as above)
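The same step can be checked with a few lines of Python. This is only a sketch of the arithmetic shown here, not the kernel code: the function name `next_pstate` is made up for illustration, and it assumes a proportional-only response with the setpoint (97) and gain (0.2) used above.

```python
def next_pstate(old_pstate, scaled_busy, setpoint=97, p_gain=0.2):
    """Proportional-only step: old pstate plus gain times the error.

    Hypothetical helper for illustration; the real driver uses
    fixed-point math and a full PID structure.
    """
    return old_pstate + (scaled_busy - setpoint) * p_gain


# Values from the CPU 3 sample above.
new_pstate = next_pstate(old_pstate=25, scaled_busy=143)
print(int(new_pstate))  # whole-number part of 34.2
```

Because scaled busy (143) is above the setpoint (97), the error term is positive and the pstate goes up, even though the load is only 0.02%.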
Ave pstate = max_pstate * aperf / mperf
= 34 * 55097 / 52039
= 36
Steph average pstate method added to the above:
New pstate = ave pstate + (scaled_busy-setpoint) * p_gain
= 36 + (143 - 97) * 0.2
= 45 (before clamping)
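To compare the two starting points side by side, here is a small Python sketch of the arithmetic above. The helper names (`average_pstate`, `pid_output`) are assumptions for illustration and do not correspond to driver functions; the constants are the ones used in this example.

```python
MAX_PSTATE = 34   # max_pstate as used in the ave pstate formula above
SETPOINT = 97
P_GAIN = 0.2


def average_pstate(aperf, mperf, max_pstate=MAX_PSTATE):
    """Average pstate over the sample: max_pstate * aperf / mperf."""
    return max_pstate * aperf / mperf


def pid_output(scaled_busy):
    """Proportional-only adjustment: number of pstates to add or subtract."""
    return (scaled_busy - SETPOINT) * P_GAIN


ave = average_pstate(aperf=55097, mperf=52039)
from_old = 25 + pid_output(143)    # adjustment added to the old pstate
from_ave = ave + pid_output(143)   # adjustment added to the average pstate

print(f"ave pstate      : {ave:.2f}")
print(f"new (from old)  : {from_old:.2f}")
print(f"new (from ave)  : {from_ave:.2f}")
```

Since the average pstate (about 36) is already above the old pstate (25), adding the same positive adjustment to it pushes the unclamped target further in the wrong direction.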
Now, just for completeness, here is the math without the Mel patch:
Scaled busy = Core busy * max_pstate / old pstate * sample time / duration
= 105 * 34 / 25 * 10 / 98.293
= 14.53
New pstate = old pstate + (scaled_busy-setpoint) * p_gain
= 25 + (14.53 - 97) * 0.2
= 8.5
= 16 (clamped to the minimum)
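The duration scaling and the minimum clamp can be sketched in Python as follows. This is an illustration of the arithmetic only: the function name is hypothetical, the 10 ms sample time and minimum pstate of 16 are taken from the numbers above, and no upper clamp is applied because the limit is not stated here.

```python
def scaled_busy(core_busy, max_pstate, old_pstate, sample_time_ms, duration_ms):
    """Scale core busy by the pstate ratio and by sample time / duration."""
    return core_busy * max_pstate / old_pstate * sample_time_ms / duration_ms


MIN_PSTATE = 16   # "clamped minimum" from the example above
SETPOINT = 97
P_GAIN = 0.2

busy = scaled_busy(core_busy=105, max_pstate=34, old_pstate=25,
                   sample_time_ms=10, duration_ms=98.293)
raw_target = 25 + (busy - SETPOINT) * P_GAIN
target = max(MIN_PSTATE, raw_target)   # lower clamp only (assumption)

print(f"scaled busy : {busy:.2f}")
print(f"raw target  : {raw_target:.2f}")
print(f"target      : {target}")
```

With the long duration (98.293 ms against a 10 ms sample time), scaled busy drops well below the setpoint, so the raw target falls under the minimum and the clamp takes over.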
Regardless, I coded the average pstate method and, with limited testing,
observed little difference between it and the Mel V2 patch.
... Doug