linux-kernel - Re: [PATCH 5/5] powercap/drivers/dtpm: Scale the power with the load

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <c863dad9-66d5-77b1-c1e2-53364dcbc805@linaro.org>
Date:   Tue, 9 Mar 2021 20:22:27 +0100
From:   Daniel Lezcano <daniel.lezcano@...aro.org>
To:     Lukasz Luba <lukasz.luba@....com>
Cc:     rafael@...nel.org, linux-kernel@...r.kernel.org,
        linux-pm@...r.kernel.org
Subject: Re: [PATCH 5/5] powercap/drivers/dtpm: Scale the power with the load

On 09/03/2021 11:01, Lukasz Luba wrote:
> Hi Daniel,
> 
> I've started reviewing the series, please find some comments below.
> 
> On 3/1/21 9:21 PM, Daniel Lezcano wrote:
>> Currently the power consumption is based on the current OPP power
>> assuming the entire performance domain is fully loaded.
>>
>> That gives very gross power estimation and we can do much better by
>> using the load to scale the power consumption.
>>
>> Use the utilization to normalize and scale the power usage over the
>> max possible power.
>>
>> Tested on a rock960 with 2 big CPUS, the power consumption estimation
>> conforms with the expected one.
>>
>> Before this change:
>>
>> ~$ ~/dhrystone -t 1 -l 10000&
>> ~$ cat
>> /sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
>>
>> 2260000
>>
>> After this change:
>>
>> ~$ ~/dhrystone -t 1 -l 10000&
>> ~$ cat
>> /sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
>>
>> 1130000
>>
>> ~$ ~/dhrystone -t 2 -l 10000&
>> ~$ cat
>> /sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
>>
>> 2260000
>>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@...aro.org>
>> ---
>>   drivers/powercap/dtpm_cpu.c | 21 +++++++++++++++++----
>>   1 file changed, 17 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
>> index e728ebd6d0ca..8379b96468ef 100644
>> --- a/drivers/powercap/dtpm_cpu.c
>> +++ b/drivers/powercap/dtpm_cpu.c
>> @@ -68,27 +68,40 @@ static u64 set_pd_power_limit(struct dtpm *dtpm,
>> u64 power_limit)
>>       return power_limit;
>>   }
>>   +static u64 scale_pd_power_uw(struct cpumask *cpus, u64 power)
> 
> renamed 'cpus' into 'pd_mask', see below
> 
>> +{
>> +    unsigned long max, util;
>> +    int cpu, load = 0;
> 
> IMHO 'int load' looks odd when used with 'util' and 'max'.
> I would put in the line above to have them all the same type and
> renamed to 'sum_util'.
> 
>> +
>> +    for_each_cpu(cpu, cpus) {
> 
> I would avoid the temporary CPU mask in the get_pd_power_uw()
> with this modified loop:
> 
> for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
> 
> 
>> +        max = arch_scale_cpu_capacity(cpu);
>> +        util = sched_cpu_util(cpu, max);
>> +        load += ((util * 100) / max);
> 
> Below you can find 3 optimizations. Since we are not in the hot
> path here, it's up to if you would like to use all/some of them
> or just ignore.
> 
> 1st optimization.
> If we use 'load += (util << 10) / max' in the loop, then
> we could avoid div by 100 and use a right shift:
> (power * load) >> 10
> 
> 2nd optimization.
> Since we use EM CPU mask, which span all CPUs with the same
> arch_scale_cpu_capacity(), you can avoid N divs inside the loop
> and do it once, below the loop.
> 
> 3rd optimization.
> If we just simply add all 'util' into 'sum_util' (no mul or div in
> the loop), then we might just have simple macro
> 
> #define CALC_POWER_USAGE(power, sum_util, max) \
>     (((power * (sum_util << 10)) / max) >> 10)

static u64 scale_pd_power_uw(struct cpumask *pd_mask, u64 power)
{
        unsigned long max, sum_max = 0, sum_util = 0;
        int cpu;

        for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
                max = arch_scale_cpu_capacity(cpu);
                sum_util += sched_cpu_util(cpu, max);
                sum_max += max;
        }

        return (power * ((sum_util << 10) / sum_max)) >> 10;
}

??

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog