Message-ID: <01e42e08965e58a337b9b531c10446fd@manjaro.org>
Date: Fri, 11 Oct 2024 11:04:38 +0200
From: Dragan Simic <dsimic@...jaro.org>
To: Jonas Karlman <jonas@...boo.se>
Cc: heiko@...ech.de, linux-rockchip@...ts.infradead.org,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 devicetree@...r.kernel.org, robh@...nel.org, krzk+dt@...nel.org,
 conor+dt@...nel.org, stable@...r.kernel.org
Subject: Re: [PATCH] arm64: dts: rockchip: Prevent thermal runaways in RK3308
 SoC dtsi

Hello Jonas,

On 2024-10-11 10:52, Jonas Karlman wrote:
> On 2024-10-10 12:19, Dragan Simic wrote:
>> Until the TSADC, thermal zones, thermal trips and cooling maps are 
>> defined
>> in the RK3308 SoC dtsi, none of the CPU OPPs except the slowest one 
>> may be
>> enabled under any circumstances.  Allowing the DVFS to scale the CPU 
>> cores
>> up without even just the critical CPU thermal trip in place can rather 
>> easily
>> result in thermal runaways and damaged SoCs, which is bad.
>> 
>> Thus, leave only the lowest available CPU OPP enabled for now.
> 
> This feels like a very aggressive limitation, to only allow the
> opp-suspend rate, which is not even used under normal load.
> 
> I let my Rock Pi S board with a RK3308B variant run "stress -c 8" for
> around 10 hours and the reported temp only reached around 50-55 deg c,
> ambient temp around 20 deg c and board laying flat on a table without
> any enclosure or heat sink.
> 
> This was running with performance as scaling_governor and cpu running
> the 1008000 opp.

Thanks for testing all that!  That's a very low CPU temperature under
stress testing indeed.  Maybe the cooling gets worse and the CPU
temperature goes higher if the board is installed into some small
enclosure with no natural or forced airflow?

> Most RK3308 variants datasheets list 1.3 GHz as max rate for CPU,
> the K-variant lists 1.2 GHz, and the -S-variants seem to have both
> reduced voltage and max rate.
> 
> The OPPs for this SoC already limits max rate to 1 GHz and is more than
> likely good enough to not reach the max temperature of 115-125 deg c as
> rated in datasheets and vendor DTs.
> 
> Adding the tsadc and trips (same/similar as px30) will probably allow 
> us
> to add/use the "missing" 1.2 and 1.3 GHz OPPs.

With these insights, I agree that the patch might have been a bit
too extreme, but it also promotes good practices when it comes to
upstreaming.  The general rule is not to add CPU or GPU OPPs with
no proper thermal configuration already in place.
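
For illustration only, here is a rough sketch of the shape such a
thermal configuration usually takes, modeled on the generic
thermal-zones binding.  It is not the actual RK3308 configuration:
the "tsadc" label is hypothetical (no such node exists in rk3308.dtsi
yet), the trip temperatures are placeholder assumptions rather than
datasheet values, and the CPU nodes are assumed to be labeled cpu0
to cpu3 and to carry #cooling-cells = <2>.

```dts
#include <dt-bindings/thermal/thermal.h>

/ {
	thermal-zones {
		cpu_thermal: cpu-thermal {
			/* Polling intervals in milliseconds, assumed values */
			polling-delay-passive = <100>;
			polling-delay = <1000>;
			/* Hypothetical TSADC node, channel 0 assumed */
			thermal-sensors = <&tsadc 0>;

			trips {
				/* Passive trip: DVFS starts throttling here */
				cpu_alert: cpu-alert {
					temperature = <85000>; /* assumed, millicelsius */
					hysteresis = <2000>;
					type = "passive";
				};
				/* Critical trip: the system shuts down here */
				cpu_crit: cpu-crit {
					temperature = <115000>; /* assumed, millicelsius */
					hysteresis = <2000>;
					type = "critical";
				};
			};

			cooling-maps {
				/* Tie the passive trip to CPU frequency scaling */
				map0 {
					trip = <&cpu_alert>;
					cooling-device =
						<&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
						<&cpu1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
						<&cpu2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
						<&cpu3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
				};
			};
		};
	};
};
```

The critical trip is the bare minimum that prevents a runaway, while
the passive trip and the cooling map are what let the higher OPPs be
used safely.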

The patch has already been merged, and as I already noted [1], I'll
try to implement, test and submit the proper thermal configuration
ASAP.  It's up to Heiko to decide whether to drop this patch or not.

[1] 
https://lore.kernel.org/linux-rockchip/df92710498f66bcb4580cb2cd1573fb2@manjaro.org/

>> Fixes: 6913c45239fd ("arm64: dts: rockchip: Add core dts for RK3308 
>> SOC")
>> Cc: stable@...r.kernel.org
>> Signed-off-by: Dragan Simic <dsimic@...jaro.org>
>> ---
>>  arch/arm64/boot/dts/rockchip/rk3308.dtsi | 3 +++
>>  1 file changed, 3 insertions(+)
>> 
>> diff --git a/arch/arm64/boot/dts/rockchip/rk3308.dtsi 
>> b/arch/arm64/boot/dts/rockchip/rk3308.dtsi
>> index 31c25de2d689..a7698e1f6b9e 100644
>> --- a/arch/arm64/boot/dts/rockchip/rk3308.dtsi
>> +++ b/arch/arm64/boot/dts/rockchip/rk3308.dtsi
>> @@ -120,16 +120,19 @@ opp-600000000 {
>>  			opp-hz = /bits/ 64 <600000000>;
>>  			opp-microvolt = <950000 950000 1340000>;
>>  			clock-latency-ns = <40000>;
>> +			status = "disabled";
>>  		};
>>  		opp-816000000 {
>>  			opp-hz = /bits/ 64 <816000000>;
>>  			opp-microvolt = <1025000 1025000 1340000>;
>>  			clock-latency-ns = <40000>;
>> +			status = "disabled";
>>  		};
>>  		opp-1008000000 {
>>  			opp-hz = /bits/ 64 <1008000000>;
>>  			opp-microvolt = <1125000 1125000 1340000>;
>>  			clock-latency-ns = <40000>;
>> +			status = "disabled";
>>  		};
>>  	};
