linux-kernel - Re: [PATCH v3 4/5] cpufreq: qcom-cpufreq-hw: Use new thermal pressure update function


Message-ID: <e4907877-6cfe-57fe-74b4-6d4efeb1d25a@arm.com>
Date:   Tue, 9 Nov 2021 08:46:41 +0000
From:   Lukasz Luba <lukasz.luba@....com>
To:     Thara Gopinath <thara.gopinath@...aro.org>
Cc:     linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        linux-arm-msm@...r.kernel.org, sudeep.holla@....com,
        will@...nel.org, catalin.marinas@....com, linux@...linux.org.uk,
        gregkh@...uxfoundation.org, rafael@...nel.org,
        viresh.kumar@...aro.org, amitk@...nel.org,
        daniel.lezcano@...aro.org, amit.kachhap@...il.com,
        bjorn.andersson@...aro.org, agross@...nel.org,
        Steev Klimaszewski <steev@...i.org>
Subject: Re: [PATCH v3 4/5] cpufreq: qcom-cpufreq-hw: Use new thermal pressure
 update function



On 11/8/21 9:23 PM, Thara Gopinath wrote:
> 
> 
> On 11/8/21 9:12 AM, Lukasz Luba wrote:
> ...snip
> 
>>>
>>>
>>
>> Well, I think the issue is broader. Look at the code which
>> calculates this 'capacity'. It's just a multiplication & division:
>>
>> max_capacity = arch_scale_cpu_capacity(cpu); // =1024 in our case
>> capacity = mult_frac(max_capacity, throttled_freq,
>>          policy->cpuinfo.max_freq);
>>
>> From the sysfs cpufreq output reported by Steev, we know
>> that the value of 'policy->cpuinfo.max_freq' is:
>> /sys/devices/system/cpu/cpu5/cpufreq/cpuinfo_max_freq:2956800
>>
>> so when we put the values into the equation we get:
>> capacity = 1024 * 2956800 / 2956800; // =1024
>> The 'capacity' will always be <= 1024 and this check won't
>> be triggered:
>>
>> /* Don't pass boost capacity to scheduler */
>> if (capacity > max_capacity)
>>      capacity = max_capacity;
>>
>>
>> IIUC your original code, you don't want to have this boost
>> frequency to be treated as 1024 capacity. The reason is that
>> the whole capacity machinery in arch_topology.c is calculated based
>> on max freq value = 2841600,
>> so the max capacity 1024 would be pinned to that frequency
>> (according to Steev's log:
>> [   22.552273] THERMAL_PRESSURE: max_freq(2841) < capped_freq(2956) for CPUs [4-7] )
> 
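
To illustrate the arithmetic quoted above, here is a minimal user-space
sketch (not the kernel code). mult_frac_sketch() is a hypothetical
stand-in that mirrors what the kernel's mult_frac() macro computes, and
capacity_from_freq() and clamp_capacity() are made-up helper names.
With cpuinfo.max_freq (2956800) as the divisor the result can never
exceed 1024, while with 2841600 as the divisor the boost frequency maps
to a value above 1024, which is the only case where the clamp matters.

```c
/* User-space stand-in for the kernel's mult_frac(x, n, d):
 * computes x * n / d while limiting intermediate overflow. */
static unsigned long long mult_frac_sketch(unsigned long long x,
                                           unsigned long long n,
                                           unsigned long long d)
{
    unsigned long long q = x / d;
    unsigned long long r = x % d;

    return q * n + r * n / d;
}

/* Hypothetical helper: scale max_capacity by throttled_khz / ref_khz. */
static unsigned long long capacity_from_freq(unsigned long long max_capacity,
                                             unsigned long long throttled_khz,
                                             unsigned long long ref_khz)
{
    return mult_frac_sketch(max_capacity, throttled_khz, ref_khz);
}

/* Hypothetical helper: "Don't pass boost capacity to scheduler". */
static unsigned long long clamp_capacity(unsigned long long capacity,
                                         unsigned long long max_capacity)
{
    return capacity > max_capacity ? max_capacity : capacity;
}
```

For example, capacity_from_freq(1024, 2956800, 2956800) gives 1024,
while capacity_from_freq(1024, 2956800, 2841600) gives 1065, which
clamp_capacity() brings back to 1024.
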
> Hi Lukasz,
> 
> Yes, you are right in that I was using policy->cpuinfo.max_freq whereas 
> I should have used freq_factor. So now that you are using freq_factor, 
> it makes sense to cap the capacity at the max capacity calculated by the 
> scheduler.
> 
> I agree that the problem is complex because at some point we should look 
> at rebuilding the topology based on changes to policy->cpuinfo.max_freq.
> 

I probably cannot fix your driver easily right now. What I can do, and
what is actually required for the new arch_update_thermal_pressure()
API, is to accept boost frequencies (values higher than 'freq_factor')
without triggering a warning, and simply set the thermal pressure to 0
(since we are told that the frequency capping is completely removed,
even for boost values).
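
A minimal sketch of that behaviour, written as stand-alone C rather
than as the actual patch. The function name and parameters are
hypothetical, and I am assuming here that 'freq_factor' is kept in MHz
while the capped frequency arrives in kHz (which is what the
max_freq(2841) vs capped_freq(2956) log line suggests).

```c
/* Hypothetical user-space model of the thermal pressure calculation.
 * Returns the capacity "lost" to frequency capping, 0 meaning no cap. */
static unsigned long thermal_pressure_sketch(unsigned long max_capacity,
                                             unsigned long capped_freq_khz,
                                             unsigned long freq_factor_mhz)
{
    /* Assumption: bring the capped frequency to the freq_factor scale. */
    unsigned long capped_mhz = capped_freq_khz / 1000;
    unsigned long capacity;

    if (freq_factor_mhz <= capped_mhz) {
        /* Boost (or max) frequency allowed: no warning, no pressure. */
        capacity = max_capacity;
    } else {
        /* Plain proportional scaling for a real cap. */
        capacity = max_capacity * capped_mhz / freq_factor_mhz;
    }

    return max_capacity - capacity;
}
```

With max_capacity = 1024 and freq_factor = 2841, a capped frequency of
2956800 yields a pressure of 0, and a cap at 1420800 yields 513.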

The next step would be a longer investigation into how the boost
frequencies are accepted and then triggered/used by the scheduler and
the other machinery involved.

I've asked Steev for help with setting up the Rockchip RK3399, whose
new boost frequency is actually used. I want to understand why that
platform is able to use the boost freq and this Qcom SoC is not.

I agree with you that at some point we might need to try rebuilding the
topology information based on these policy->cpuinfo.max_freq changes.

I hope it will take only a few steps to fix these issues completely,
without destroying a lot of existing code...

Regards,
Lukasz