lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <173e6bab-9d21-eb28-9b91-a5f80c01fd03@linaro.org>
Date:   Thu, 24 Jun 2021 12:21:10 +0200
From:   Daniel Lezcano <daniel.lezcano@...aro.org>
To:     Eric Woudstra <ericwouds@...il.com>
Cc:     Rob Herring <robh+dt@...nel.org>,
        Matthias Brugger <matthias.bgg@...il.com>,
        devicetree@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-mediatek@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Fix mt7622.dtsi thermal cpu

On 24/06/2021 11:59, Eric Woudstra wrote:
> 
> For Marvell:
> 
> https://www.google.com/url?sa=t&source=web&rct=j&url=https://wiki.kobol.io/helios4/files/som/brochure_a38x_microsom_2017-09-05.pdf
> 
> Armada38x maximum die temperature 115 degrees Celcius. They really get hotter then 100.
> 
> But for mt7622 I cannot find this value

Found that:

https://download.kamami.pl/p579344-MT7622A_Datasheet_for_BananaPi_Only%281%29.pdf

Chapter 3.3 - Thermal Characteristics

Given the values I suggest:

 - Passive - 80°C

 - Hot - 90°C

 - Critical - 100°C

And passive polling set to 250ms.

It sounds like the sensor is not supporting the interrupt mode yet, so a
big gap is needed with the Tj IMO to give the time to detect the trip
point crossing with the polling.

> ⁣Get BlueMail for Android ​
> 
> On Jun 23, 2021, 10:08 PM, at 10:08 PM, Daniel Lezcano <daniel.lezcano@...aro.org> wrote:
>> On 23/06/2021 20:43, Eric Woudstra wrote:
>>>
>>> I choose "hot" before, because 87 degrees seems ok to start frequency
>>> throttling. But, yes, it should be passive.
>>>
>>> 87 is still quite low if I compare this temperature with the
>>> wrt3200acm Marvell dual core arm soc. They even went above 100
>>> degrees so I feel for an arm processor inside a router box it is fine
>>> to use 87 degrees But maybe someone at Mediatek can give some more
>>> details about operating temperatures.
>>
>> Sometimes, the SoC vendor puts a high temperature in the DT just to
>> export the thermal zone and deal with it from userspace. So putting the
>> high temp allow the userspace (usually a thermal engine - Android
>> stuff)
>> to deal with the mitigation without a kernel interaction.
>>
>> Having more than 100°C could be this kind of setup. Only the operating
>> temperature from the hardware documentation will tell the safe
>> temperature for the silicon.
>>
>> IMO, 77°C is a good compromise until getting the documented temp. 87°C
>> sounds to me a bit too hot.
>>
>>> It may be possible to leave the active map in the device tree as some
>>> users of the bananapi might choose to install a fan as it is one of
>>> the options.
>>
>> The active trip only makes sense if the cooling device is a fan (or any
>> active device), so the mapping points to a fan node, like:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git/tree/arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi#n192
>>
>> If there is no such [pwm] fan output on the board, no active trip point
>> should be added.
>>
>>> ⁣Get BlueMail for Android ​
>>>
>>> On Jun 23, 2021, 5:58 PM, at 5:58 PM, Daniel Lezcano
>>> <daniel.lezcano@...aro.org> wrote:
>>>> On 23/06/2021 17:35, Eric Woudstra wrote:
>>>>> It is only useful to set 1 map with the regulated temperature for
>>>>> cpu frequency throttling. Same as in the kernel document
>>>>> example.
>>>>>
>>>>>
>>>>> It has no use to set frequency scaling on 2 different
>>>>> temperature trip points, as the lowest one makes sure the higher
>>>>> one(s) are never reached.
>>>>
>>>> I looked more closely the DT and there is a misunderstanding of
>>>> the thermal framework in the definition.
>>>>
>>>> There is one trip point with the passive type and the cpu cooling 
>>>> device, followed by a second trip point with the active type *but*
>>>> the same cpu cooling device. That is wrong.
>>>>
>>>> And finally, there is the hot trip point as a third mapping and
>>>> the same cooling device.
>>>>
>>>> The hot trip point is only there to notify userspace and let it
>>>> take an immediate action to prevent an emergency shutdown when
>>>> reaching the critical temperature.
>>>>
>>>>> It can be applied only at 1 trip point. Multiple trip points is
>>>>> only usefully for fan control to make sure the fan is not too 
>>>>> noisy when it is not necessary to be noisy.
>>>>>
>>>>>
>>>>> The CPU will almost come to a dead stop when it starts to pass
>>>>> the lowest thermal map with frequency throttling.
>>>>>
>>>>> This is why it is a bug and needs a fix, not only adjustment.
>>>>
>>>> Yes, you are right. It should be something like (verbatim copy):
>>>>
>>>> diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi 
>>>> b/arch/arm64/boot/dts/mediatek/mt7622.dtsi index
>>>> 890a942ec608..88c81d24f4ff 100644 ---
>>>> a/arch/arm64/boot/dts/mediatek/mt7622.dtsi +++
>>>> b/arch/arm64/boot/dts/mediatek/mt7622.dtsi @@ -136,24 +136,18 @@
>>>> secmon_reserved: secmon@...00000 {
>>>>
>>>> thermal-zones { cpu_thermal: cpu-thermal { -
>>>> polling-delay-passive = <1000>; +			polling-delay-passive = <250>; 
>>>> polling-delay = <1000>;
>>>>
>>>> thermal-sensors = <&thermal 0>;
>>>>
>>>> trips { cpu_passive: cpu-passive { -					temperature = <47000>; +
>>>> temperature = <77000>; hysteresis = <2000>; type = "passive"; };
>>>>
>>>> -				cpu_active: cpu-active { -					temperature = <67000>; -
>>>> hysteresis = <2000>; -					type = "active"; -				}; - cpu_hot:
>>>> cpu-hot { temperature = <87000>; hysteresis = <2000>; @@ -173,18
>>>> +167,6 @@ map0 { cooling-device = <&cpu0 THERMAL_NO_LIMIT
>>>> THERMAL_NO_LIMIT>, <&cpu1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; }; - 
>>>> -				map1 { -					trip = <&cpu_active>; -					cooling-device =
>>>> <&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, -							 <&cpu1
>>>> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; -				}; - -				map2 { -
>>>> trip = <&cpu_hot>; -					cooling-device = <&cpu0 THERMAL_NO_LIMIT
>>>> THERMAL_NO_LIMIT>, -							 <&cpu1 THERMAL_NO_LIMIT
>>>> THERMAL_NO_LIMIT>; -				}; }; }; };
>>>>
>>>>
>>>> -- <http://www.linaro.org/> Linaro.org │ Open source software for
>>>> ARM SoCs
>>>>
>>>> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook | 
>>>> <http://twitter.com/#!/linaroorg> Twitter | 
>>>> <http://www.linaro.org/linaro-blog/> Blog
>>>
>>
>>
>> -- 
>> <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
>>
>> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
>> <http://twitter.com/#!/linaroorg> Twitter |
>> <http://www.linaro.org/linaro-blog/> Blog
> 


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ