Message-ID: <002601d17005$e6369820$b2a3c860$@net>
Date:	Thu, 25 Feb 2016 11:51:18 -0800
From:	"Doug Smythies" <dsmythies@...us.net>
To:	"'Stephane Gasparini'" <stephane.gasparini@...ux.intel.com>
Cc:	"'Mel Gorman'" <mgorman@...hsingularity.net>,
	"'Rafael Wysocki'" <rjw@...ysocki.net>,
	"'Ingo Molnar'" <mingo@...nel.org>,
	"'Peter Zijlstra'" <peterz@...radead.org>,
	"'Matt Fleming'" <matt@...eblueprint.co.uk>,
	"'Mike Galbraith'" <umgwanakikbuti@...il.com>,
	"'Linux-PM'" <linux-pm@...r.kernel.org>,
	"'LKML'" <linux-kernel@...r.kernel.org>,
	"'Srinivas Pandruvada'" <srinivas.pandruvada@...ux.intel.com>
Subject: RE: [PATCH 1/1] intel_pstate: Increase hold-off time before busyness is scaled

Hi Steph,

On 2016.02.24 08:20 Stephane Gasparini wrote:
>> On Feb 19, 2016, at 5:38 PM, Doug Smythies <dsmythies@...us.net> wrote: 
>>> On 2016.02.19 03:12 Stephane Gasparini wrote:
>>> 
>>> The issue you are reporting looks like one we improved on android by using 
>>> the average pstate instead of using the last requested pstate
>>> 
>>> We know that this is improving the ffmpeg encoding performance when using the
>>> load algorithm.
>>> 
>>> see patch attached
>>> 
>>> This patch is only applied on get_target_pstate_use_cpu_load however you can give
>>> it a try on get_target_pstate_use_performance
>> 
>> Yes, that type of patch works on the load based approach.
>
> I’m not talking about using the average p-state in the scaled_busy computation.
> I’m talking about adding the output of the PID (the number of p-states to add or
> subtract) to the average p-state rather than adding it to the current p-state.

For the situation we are dealing with here, that would actually make it worse,
wouldn't it?

Let's work through a real, very low load example from the Mel V2 patch, where
the target pstate was increased when it should have been decreased:

Mel patch version 2 (12X hold-off, applied on top of the rjw 3-patch v10 set, on kernel 4.5-rc4):

CPU: 3
Core busy: 105
Scaled busy: 143
Old pstate: 25
New pstate: 34
mperf: 52039
aperf: 55097
tsc: 335265689
freq: 3599750 kHz
Load: 0.02%
Duration (ms): 98.293

New pstate = old pstate + (scaled_busy-setpoint) * p_gain
           = 25 + (143 - 97) * 0.2
           = 34 (as above)

Ave pstate = max_pstate * aperf / mperf
           = 34 * 55097 / 52039
           = 36

Steph's average pstate method applied to the above:
New pstate = ave pstate + (scaled_busy-setpoint) * p_gain
           = 36 + (143 - 97) * 0.2
           = 45 (before clamping)
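
For anyone who wants to re-check the arithmetic above, here is a minimal
Python sketch. The function names and the round-to-nearest steps are
assumptions made for illustration; the driver uses fixed-point math, so this
is not the intel_pstate code, only the same numbers in floating point.

```python
# Illustrative re-creation of the worked numbers above (not driver code).
MAX_PSTATE = 34   # value used in the average pstate formula above
SETPOINT = 97
P_GAIN = 0.2

def pid_step(scaled_busy):
    """Proportional-only adjustment, in pstate units (hypothetical helper)."""
    return (scaled_busy - SETPOINT) * P_GAIN

def average_pstate(aperf, mperf):
    """Average pstate over the sample, from the aperf/mperf ratio."""
    return MAX_PSTATE * aperf / mperf

old_pstate = 25
scaled_busy = 143
aperf, mperf = 55097, 52039

# Mel V2 result: PID output added to the old (last requested) pstate.
new_from_old = old_pstate + pid_step(scaled_busy)

# Average pstate method: PID output added to the rounded average pstate.
ave = average_pstate(aperf, mperf)
new_from_ave = round(ave) + pid_step(scaled_busy)

print(round(new_from_old), round(ave), round(new_from_ave))  # prints: 34 36 45
```

Since aperf is greater than mperf in this sample, the average pstate is
above the old pstate, so the same PID output lands on a higher target.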

Now, just for completeness, here is the math without the Mel patch:
Scaled busy = Core busy * max_pstate / old pstate * sample time / duration
            = 105 * 34 / 25 * 10 / 98.293
            = 14.53
New pstate = old pstate + (scaled_busy-setpoint) * p_gain
           = 25 + (14.53 - 97) * 0.2
           = 8.5
           = 16 clamped minimum
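
The same calculation as a short Python sketch, for checking the numbers. The
helper names are hypothetical, and using 34 as the upper clamp is an
assumption (only the minimum of 16 is stated above); this is floating-point
arithmetic, not the driver's fixed-point code.

```python
# Illustrative only: duration-scaled busyness, without the hold-off patch.
def scaled_busy(core_busy, max_pstate, old_pstate, sample_ms, duration_ms):
    """Scale core busy by the pstate ratio and by sample time / duration."""
    return core_busy * max_pstate / old_pstate * sample_ms / duration_ms

def clamp(value, lowest, highest):
    """Limit value to the range [lowest, highest]."""
    return max(lowest, min(highest, value))

SETPOINT = 97
P_GAIN = 0.2
MIN_PSTATE = 16   # clamped minimum from the example above
MAX_PSTATE = 34   # assumed upper clamp, taken from the formula above

sb = scaled_busy(core_busy=105, max_pstate=34, old_pstate=25,
                 sample_ms=10, duration_ms=98.293)
raw = 25 + (sb - SETPOINT) * P_GAIN
new_pstate = clamp(raw, MIN_PSTATE, MAX_PSTATE)

print(f"{sb:.2f} {raw:.1f} {new_pstate}")  # prints: 14.53 8.5 16
```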

Regardless, I coded the average pstate method and, with limited testing,
observed little difference between it and the Mel V2 patch.

... Doug

