Date:   Fri, 9 Aug 2019 18:37:46 +0100
From:   Douglas Raillard <douglas.raillard@....com>
To:     Patrick Bellasi <patrick.bellasi@....com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        mingo@...hat.com, rjw@...ysocki.net, viresh.kumar@...aro.org,
        quentin.perret@....com, dietmar.eggemann@....com
Subject: Re: [RFC PATCH v2 0/5] sched/cpufreq: Make schedutil energy aware

Hi Patrick,

On 7/9/19 11:37 AM, Patrick Bellasi wrote:
> On 08-Jul 14:46, Douglas Raillard wrote:
>> Hi Patrick,
>>
>> On 7/8/19 12:09 PM, Patrick Bellasi wrote:
>>> On 03-Jul 17:36, Douglas Raillard wrote:
>>>> On 7/2/19 4:51 PM, Peter Zijlstra wrote:
>>>>> On Thu, Jun 27, 2019 at 06:15:58PM +0100, Douglas RAILLARD wrote:
> 
> [...]
> 
>>> You are also correct in pointing out that in the steady state
>>> ramp_boost will not be triggered in that steady state.
>>>
>>> IMU, that's for two main reasons:
>>>    a) it's very likely that enqueued <= util_avg
>>>    b) even in case enqueued should turn out to be _slightly_ bigger than
>>>       util_avg, the corresponding (proportional) ramp_boost would be so
>>>       tiny to not have any noticeable effect on OPP selection.
>>>
>>> Am I correct on point b) above?
>>
>> Assuming you meant "util_avg slightly bigger than enqueued" (which is when boosting triggers),
>> then yes since ramp_boost effect is proportional to "task_ue.enqueue - task_u". It makes it robust
>> against that.
> 
> Right :)
> 
>>> Could you maybe come up with some experimental numbers related to that
>>> case specifically?
>>
>> With:
>> * an rt-app task ramping up from 5% to 75% util in one big step. The
>> whole cycle is 0.6s long (0.3s at 5% followed by 0.3s at 75%). This
>> cycle is repeated 20 times and the average of boosting is taken.
>>
>> * a hikey 960 (this impacts the frequency at which the test runs at
>> the beginning of the 75% phase, which impacts the number of missed
>> activations before the util has ramped up).
>>
>> * assuming an OPP exists for each util value (i.e. 1024 OPPs, so the
>> effect of boost on consumption is not impacted by OPP capacities
>> granularity)
>>
>> Then the boosting feature would increase the average power
>> consumption by 3.1%, out of which 0.12% can be considered "spurious
>> boosting" due to the util taking some time to really converge to its
>> steady state value.
>>
>> In practice, the impact of small boosts will be even lower since
>> they will less likely trigger the selection of a high OPP due to OPP
>> capacity granularity > 1 util unit.
> 
> That's ok for the energy side: you estimate a ~3% worst case more
> energy on that specific target.
> 
> By boosting I expect the negative slack to improve.
> Do you have also numbers/stats related to the negative slack?
> Can you share a percentage figure for that improvement?

I'm now testing on a Google Pixel 3 (Qualcomm Snapdragon 845) phone, with the same workload pinned on a big core.
It has many more OPPs than a hikey 960, so gradations in boosting are better reflected in frequency selection.

avg slack (higher=better):
     Average time between task sleep and its next periodic activation.

avg negative slack (lower in absolute value=better):
     Same as avg slack, but only taking into account negative values.
     Negative slack means a task activation did not have enough time to complete before the next
     periodic activation fired, which is what we want to avoid.

boost energy overhead (lower=better):
     Extra power consumption induced by ramp boost, assuming a continuous OPP space (infinite number of OPPs)
     and single-CPU policies. In practice, a finite number of OPPs decreases this value, and more CPUs per policy
     increase it, since boost(policy) = max(boost(cpu of policy)).
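
For reference, the three definitions above can be sketched in a few lines of Python. This is only an illustration, not the tooling used to produce the numbers below: the function names and the (sleep time, next activation time) inputs are assumptions made for the example.

```python
from statistics import mean


def slacks(sleep_times_us, next_activation_times_us):
    """Slack of each activation: time left between the task going to sleep
    and its next periodic activation. Negative if the task overran."""
    return [nxt - slp for slp, nxt in zip(sleep_times_us, next_activation_times_us)]


def avg_slack(values):
    """Average over all activations (higher is better)."""
    return mean(values)


def avg_negative_slack(values):
    """Average over negative values only (closer to zero is better)."""
    negative = [v for v in values if v < 0]
    return mean(negative) if negative else 0.0


def policy_boost(cpu_boosts):
    """boost(policy) = max(boost(cpu of policy))."""
    return max(cpu_boosts)


# Hypothetical timestamps in us, made up for the example.
sleep = [8000, 26000, 29000]
next_activation = [16000, 24000, 32000]

s = slacks(sleep, next_activation)
print(s, avg_slack(s), avg_negative_slack(s))
print(policy_boost([0, 12, 5]))
```

The second activation in the example goes to sleep 2000 us after its next activation was due, so it is the only one contributing to the negative slack average.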

Without ramp boost:
+--------------------+--------------------+
|avg slack (us)      |avg negative slack  |
|                    |(us)                |
+--------------------+--------------------+
|6598.72             |-10217.13           |
|6595.49             |-10200.13           |
|6613.72             |-10401.06           |
|6600.29             |-9860.872           |
|6605.53             |-10057.64           |
|6612.05             |-10267.50           |
|6599.01             |-9939.60            |
|6593.79             |-9445.633           |
|6613.56             |-10276.75           |
|6595.44             |-9751.770           |
+--------------------+--------------------+
|average                                  |
+--------------------+--------------------+
|6602.76             |-10041.81           |
+--------------------+--------------------+


With ramp boost enabled:
+--------------------+--------------------+--------------------+
|boost energy        |avg slack (us)      |avg negative slack  |
|overhead (%)        |                    |(us)                |
+--------------------+--------------------+--------------------+
|3.05                |7148.93             |-5664.26            |
|3.04                |7144.69             |-5667.77            |
|3.05                |7149.05             |-5698.31            |
|2.97                |7126.71             |-6040.23            |
|3.02                |7140.28             |-5826.78            |
|3.03                |7135.11             |-5749.62            |
|3.05                |7140.24             |-5750.0             |
|3.05                |7144.84             |-5667.04            |
|3.07                |7157.30             |-5656.65            |
|3.06                |7154.65             |-5653.76            |
+--------------------+--------------------+--------------------+
|average                                                       |
+--------------------+--------------------+--------------------+
|3.04                |7144.18             |-5737.44            |
+--------------------+--------------------+--------------------+


The negative slack is due to missed activations while the utilization signals
increase during the big utilization step. Ramp boost is designed to boost frequency during
that phase, which results in 1.75 times less negative slack on average (-5737.44 us instead
of -10041.81 us), for an extra power consumption of about 3%.
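
As a quick cross-check of the averages and of the 1.75 figure, the per-run values from the two tables above can be recomputed with a short Python snippet. The variable names are made up for this example; only the numbers come from the tables.

```python
from statistics import mean

# avg negative slack (us) per run, without ramp boost
neg_slack_without = [
    -10217.13, -10200.13, -10401.06, -9860.872, -10057.64,
    -10267.50, -9939.60, -9445.633, -10276.75, -9751.770,
]

# avg negative slack (us) per run, with ramp boost
neg_slack_with = [
    -5664.26, -5667.77, -5698.31, -6040.23, -5826.78,
    -5749.62, -5750.0, -5667.04, -5656.65, -5653.76,
]

# boost energy overhead (%) per run, with ramp boost
overhead_pct = [3.05, 3.04, 3.05, 2.97, 3.02, 3.03, 3.05, 3.05, 3.07, 3.06]

avg_without = mean(neg_slack_without)
avg_with = mean(neg_slack_with)
ratio = avg_without / avg_with      # how many times smaller the negative slack gets
avg_overhead = mean(overhead_pct)

print(f"avg negative slack without boost: {avg_without:.2f} us")
print(f"avg negative slack with boost:    {avg_with:.2f} us")
print(f"reduction factor:                 {ratio:.2f}")
print(f"avg boost energy overhead:        {avg_overhead:.2f} %")
```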

> Best,
> Patrick
> 

Thanks,
Douglas
