Message-ID: <2fcd9a10-ae9e-480f-87a1-5b49e5082ef5@linaro.org>
Date: Wed, 8 Jan 2025 10:15:34 +0100
From: Neil Armstrong <neil.armstrong@...aro.org>
To: Bjorn Andersson <andersson@...nel.org>
Cc: Konrad Dybcio <konradybcio@...nel.org>, Rob Herring <robh@...nel.org>,
Krzysztof Kozlowski <krzk+dt@...nel.org>, Conor Dooley
<conor+dt@...nel.org>, linux-arm-msm@...r.kernel.org,
devicetree@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] arm64: dts: qcom: sm8650: setup cpu thermal with idle on high temperatures

On 08/01/2025 04:11, Bjorn Andersson wrote:
> On Tue, Jan 07, 2025 at 09:13:18AM +0100, Neil Armstrong wrote:
>> Hi,
>>
>> On 07/01/2025 00:39, Bjorn Andersson wrote:
>>> On Fri, Jan 03, 2025 at 03:38:26PM +0100, Neil Armstrong wrote:
>>>> On the SM8650, the dynamic clock and voltage scaling (DCVS) is done in a
>>>> hardware-controlled loop using the LMH and EPSS blocks, with constraints and
>>>> OPPs programmed in the board firmware.
>>>>
>>>> Since the hardware does a better job at maintaining the CPU temperature
>>>> in an acceptable range by taking into account more parameters like the die
>>>> characteristics or other factory-fused values, it makes no sense to try
>>>> and reproduce a similar set of constraints with the Linux cpufreq thermal
>>>> core.
>>>>
>>>> In addition, the tsens IP is responsible for monitoring the temperature
>>>> across the SoC, and the current settings will heavily trigger the tsens
>>>> UP/LOW interrupts if the CPU temperatures reach the hardware thermal
>>>> constraints, which are currently defined in the DT. And since the CPUs
>>>> are not hooked to the thermal trip points, the potential interrupts and
>>>> calculations are a waste of system resources.
>>>>
>>>> Instead, set higher temperatures in the CPU trip points, and hook a CPU
>>>> idle injector with a 100% duty cycle at the highest trip point in case
>>>> the hardware DCVS cannot handle the temperature surge, and try our best to
>>>> avoid reaching the critical temperature trip point, which should trigger an
>>>> inevitable thermal shutdown.
>>>>
>>>
>>> Are you able to hit these higher temperatures? Do you have some test
>>> case where the idle-injection shows to be successful in blocking us from
>>> reaching the critical temp?
>>
>> No. I've been able to test idle-injection and observed a noticeable effect,
>> but I had to set lower trip points. Do you know how I can easily "block" LMH/EPSS
>> from scaling down and let the temperature go higher?
>>
>
> I don't know how to override that configuration.
>
>>>
>>> E.g. in X13s (SC8280XP) we opted for relying on LMH/EPSS and define only
>>> the critical trip for when the hardware fails us.
>>
>> It's the goal here as well
>>
>
> How about simplifying the patch by removing the idle-injection step and
> just rely on LMH/EPSS and the "critical" trip (at least until someone
> can prove that there's value in the extra mitigation)?

OK, but I still see value in the idle-injection mitigation: if LMH/EPSS
fails, the only lever left to the HLOS is to stop scheduling tasks, since
the frequency won't be able to scale anymore.

Anyway, I agree it can be added later on, so should I drop the two trip points
and leave only the critical one?
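
For illustration, a zone reduced that way might look roughly like the sketch
below. This is only an untested sketch using cpu0-thermal as the example; it
assumes the 115000 critical temperature from this patch is kept, and the
other properties of the zone are left out because they would stay unchanged.

```dts
/* Illustrative sketch only, not a tested change. */
cpu0-thermal {
	/* existing sensor and polling properties unchanged */

	trips {
		/*
		 * Only the critical trip remains; the two passive trips
		 * and the cooling-maps node are dropped. The value is the
		 * one proposed in this patch and is an assumption here.
		 */
		cpu0-critical {
			temperature = <115000>;
			hysteresis = <1000>;
			type = "critical";
		};
	};
};
```

The same shape would apply to the other per-CPU zones, and the thermal-idle
nodes under the CPU nodes would no longer be referenced.
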
>
> Regards,
> Bjorn
>
>>>
>>>
>>> I have no concerns at all about "removing" the 90C trip point, that
>>> makes total sense to me - let the hardware keep the cores as close to
>>> max as possible, and then use some slower sensor for keeping the system
>>> temperature in check (such as the x13s skin sensor).
>>>
>>>
>>> PS. The described behavior should apply to anything SDM845 and newer, so
>>> I'd like to see this set/document precedence for other platforms.
>>>
>>> Regards,
>>> Bjorn
>>>
>>>> Signed-off-by: Neil Armstrong <neil.armstrong@...aro.org>
>>>> ---
>>>> arch/arm64/boot/dts/qcom/sm8650.dtsi | 274 +++++++++++++++++++++++++++--------
>>>> 1 file changed, 214 insertions(+), 60 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
>>>> index 25e47505adcb790d09f1d2726386438487255824..448374a32e07151e35727d92fab77356769aea8a 100644
>>>> --- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
>>>> +++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
>>>> @@ -99,6 +99,13 @@ l3_0: l3-cache {
>>>> cache-unified;
>>>> };
>>>> };
>>>> +
>>>> + cpu0_idle: thermal-idle {
>>>> + #cooling-cells = <2>;
>>>> + duration-us = <800000>;
>>>> + exit-latency-us = <10000>;
>>>> + };
>>>> +
>>>> };
>>>> cpu1: cpu@100 {
>>>> @@ -119,6 +126,12 @@ cpu1: cpu@100 {
>>>> qcom,freq-domain = <&cpufreq_hw 0>;
>>>> #cooling-cells = <2>;
>>>> +
>>>> + cpu1_idle: thermal-idle {
>>>> + #cooling-cells = <2>;
>>>> + duration-us = <800000>;
>>>> + exit-latency-us = <10000>;
>>>> + };
>>>> };
>>>> cpu2: cpu@200 {
>>>> @@ -146,6 +159,12 @@ l2_200: l2-cache {
>>>> cache-unified;
>>>> next-level-cache = <&l3_0>;
>>>> };
>>>> +
>>>> + cpu2_idle: thermal-idle {
>>>> + #cooling-cells = <2>;
>>>> + duration-us = <800000>;
>>>> + exit-latency-us = <10000>;
>>>> + };
>>>> };
>>>> cpu3: cpu@300 {
>>>> @@ -166,6 +185,12 @@ cpu3: cpu@300 {
>>>> qcom,freq-domain = <&cpufreq_hw 3>;
>>>> #cooling-cells = <2>;
>>>> +
>>>> + cpu3_idle: thermal-idle {
>>>> + #cooling-cells = <2>;
>>>> + duration-us = <800000>;
>>>> + exit-latency-us = <10000>;
>>>> + };
>>>> };
>>>> cpu4: cpu@400 {
>>>> @@ -193,6 +218,12 @@ l2_400: l2-cache {
>>>> cache-unified;
>>>> next-level-cache = <&l3_0>;
>>>> };
>>>> +
>>>> + cpu4_idle: thermal-idle {
>>>> + #cooling-cells = <2>;
>>>> + duration-us = <800000>;
>>>> + exit-latency-us = <10000>;
>>>> + };
>>>> };
>>>> cpu5: cpu@500 {
>>>> @@ -220,6 +251,12 @@ l2_500: l2-cache {
>>>> cache-unified;
>>>> next-level-cache = <&l3_0>;
>>>> };
>>>> +
>>>> + cpu5_idle: thermal-idle {
>>>> + #cooling-cells = <2>;
>>>> + duration-us = <800000>;
>>>> + exit-latency-us = <10000>;
>>>> + };
>>>> };
>>>> cpu6: cpu@600 {
>>>> @@ -247,6 +284,12 @@ l2_600: l2-cache {
>>>> cache-unified;
>>>> next-level-cache = <&l3_0>;
>>>> };
>>>> +
>>>> + cpu6_idle: thermal-idle {
>>>> + #cooling-cells = <2>;
>>>> + duration-us = <800000>;
>>>> + exit-latency-us = <10000>;
>>>> + };
>>>> };
>>>> cpu7: cpu@700 {
>>>> @@ -274,6 +317,12 @@ l2_700: l2-cache {
>>>> cache-unified;
>>>> next-level-cache = <&l3_0>;
>>>> };
>>>> +
>>>> + cpu7_idle: thermal-idle {
>>>> + #cooling-cells = <2>;
>>>> + duration-us = <800000>;
>>>> + exit-latency-us = <10000>;
>>>> + };
>>>> };
>>>> cpu-map {
>>>> @@ -5752,23 +5801,30 @@ cpu2-top-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu2_top_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu2-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu2_top_alert1>;
>>>> + cooling-device = <&cpu2_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu2-bottom-thermal {
>>>> @@ -5776,23 +5832,30 @@ cpu2-bottom-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu2_bottom_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu2-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu2_bottom_alert1>;
>>>> + cooling-device = <&cpu2_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu3-top-thermal {
>>>> @@ -5800,23 +5863,30 @@ cpu3-top-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu3_top_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu3-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu3_top_alert1>;
>>>> + cooling-device = <&cpu3_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu3-bottom-thermal {
>>>> @@ -5824,23 +5894,30 @@ cpu3-bottom-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu3_bottom_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu3-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu3_bottom_alert1>;
>>>> + cooling-device = <&cpu3_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu4-top-thermal {
>>>> @@ -5848,23 +5925,30 @@ cpu4-top-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu4_top_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu4-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu4_top_alert1>;
>>>> + cooling-device = <&cpu4_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu4-bottom-thermal {
>>>> @@ -5872,23 +5956,30 @@ cpu4-bottom-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu4_bottom_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu4-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu4_bottom_alert1>;
>>>> + cooling-device = <&cpu4_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu5-top-thermal {
>>>> @@ -5896,23 +5987,30 @@ cpu5-top-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu5_top_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu5-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu5_top_alert1>;
>>>> + cooling-device = <&cpu5_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu5-bottom-thermal {
>>>> @@ -5920,23 +6018,30 @@ cpu5-bottom-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu5_bottom_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu5-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu5_bottom_alert1>;
>>>> + cooling-device = <&cpu5_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu6-top-thermal {
>>>> @@ -5944,23 +6049,30 @@ cpu6-top-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu6_top_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu6-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu6_top_alert1>;
>>>> + cooling-device = <&cpu6_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu6-bottom-thermal {
>>>> @@ -5968,23 +6080,30 @@ cpu6-bottom-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu6_bottom_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu6-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu6_bottom_alert1>;
>>>> + cooling-device = <&cpu6_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> aoss1-thermal {
>>>> @@ -6010,23 +6129,30 @@ cpu7-top-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu7_top_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu7-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu7_top_alert1>;
>>>> + cooling-device = <&cpu7_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu7-middle-thermal {
>>>> @@ -6034,23 +6160,30 @@ cpu7-middle-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu7_middle_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu7-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu7_middle_alert1>;
>>>> + cooling-device = <&cpu7_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu7-bottom-thermal {
>>>> @@ -6058,23 +6191,30 @@ cpu7-bottom-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu7_bottom_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu7-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu7_bottom_alert1>;
>>>> + cooling-device = <&cpu7_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu0-thermal {
>>>> @@ -6082,23 +6222,30 @@ cpu0-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu0_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu0-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu0_alert1>;
>>>> + cooling-device = <&cpu0_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> cpu1-thermal {
>>>> @@ -6106,23 +6253,30 @@ cpu1-thermal {
>>>> trips {
>>>> trip-point0 {
>>>> - temperature = <90000>;
>>>> + temperature = <108000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> - trip-point1 {
>>>> - temperature = <95000>;
>>>> + cpu1_alert1: trip-point1 {
>>>> + temperature = <110000>;
>>>> hysteresis = <2000>;
>>>> type = "passive";
>>>> };
>>>> cpu1-critical {
>>>> - temperature = <110000>;
>>>> + temperature = <115000>;
>>>> hysteresis = <1000>;
>>>> type = "critical";
>>>> };
>>>> };
>>>> +
>>>> + cooling-maps {
>>>> + map0 {
>>>> + trip = <&cpu1_alert1>;
>>>> + cooling-device = <&cpu1_idle 100 100>;
>>>> + };
>>>> + };
>>>> };
>>>> nsphvx0-thermal {
>>>>
>>>> --
>>>> 2.34.1
>>>>
>>