Message-ID: <124e8c2a984d3c2775264fae85cd41b7@manjaro.org>
Date: Tue, 20 Aug 2024 05:26:15 +0200
From: Dragan Simic <dsimic@...jaro.org>
To: Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: Icenowy Zheng <uwu@...nowy.me>, linux-sunxi@...ts.linux.dev,
 wens@...e.org, jernej.skrabec@...il.com, samuel@...lland.org,
 linux-arm-kernel@...ts.infradead.org, devicetree@...r.kernel.org,
 robh@...nel.org, krzk+dt@...nel.org, conor+dt@...nel.org,
 linux-kernel@...r.kernel.org, wenst@...omium.org, broonie@...nel.org
Subject: Re: [PATCH] arm64: dts: allwinner: Add GPU thermal trips to the SoC
 dtsi for A64

Hello Daniel,

On 2024-08-19 17:42, Daniel Lezcano wrote:
> On 12/08/2024 04:46, Dragan Simic wrote:
>> On 2024-08-12 04:40, Icenowy Zheng wrote:
>>> On Monday, 2024-08-12 at 04:00 +0200, Dragan Simic wrote:
>>>> Add thermal trips for the two GPU thermal sensors found in the
>>>> Allwinner A64.  There's only one GPU OPP defined since the commit
>>>> 1428f0c19f9c ("arm64: dts: allwinner: a64: Run GPU at 432 MHz"),
>>>> so defining only the critical thermal trips makes sense for the
>>>> A64's two GPU thermal zones.
>>>> 
>>>> Having these critical thermal trips defined ensures that no hot
>>>> spots develop inside the SoC die that exceed the maximum junction
>>>> temperature.  That might have been possible before, although quite
>>>> unlikely, because the CPU and GPU portions of the SoC are packed
>>>> closely inside the SoC, so the overheating GPU would inevitably
>>>> result in the heat soaking into the CPU portion of the SoC, causing
>>>> the CPU thermal sensor to return high readings and trigger the CPU
>>>> critical thermal trips.  However, it's better not to rely on the
>>>> heat soak and have the critical GPU thermal trips properly defined
>>>> instead.
>>>> 
>>>> While there, remove a few spotted comments that are rather
>>>> redundant, because it's pretty much obvious what units are used in
>>>> those places.
>>> 
>>> This should be another individual patch, I think.
>> 
>> Perhaps.  I already thought about that, but it might also be best
>> to simply drop the removal of those redundant comments entirely.
>> Let's also see what other people will say.
>> 
>>>> Signed-off-by: Dragan Simic <dsimic@...jaro.org>
>>>> ---
>>>>  arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 22 ++++++++++++++-----
>>>>  1 file changed, 16 insertions(+), 6 deletions(-)
>>>> 
>>>> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
>>>> index e868ca5ae753..bc5d3a2e6c98 100644
>>>> --- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
>>>> +++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
>>>> @@ -212,7 +212,6 @@ timer {
>>>> 
>>>>         thermal-zones {
>>>>                 cpu_thermal: cpu0-thermal {
>>>> -                       /* milliseconds */
>>> 
>>> The unit of a 0 isn't so obvious, I think, so I suggest keeping
>>> this.
>> 
>> Quite frankly, I think it should be obvious to anyone tackling
>> the thermal zones and trips.
> 
> You can also remove polling-delay-passive and polling-delay when
> they are equal to zero. If they are absent, they will be set to zero
> by default.

Good point, thanks!  Though, I'd rather leave those "... = <0>;"
removals for a small follow-up series, because those changes touch
more actual code than just the comments, so it's better to keep them
as separate changes for easier bisection later, if it's ever needed.
Hopefully never. :)

I just made a note for myself to create and submit those follow-up
cleanup patches later, for all affected Allwinner and Rockchip SoC
dtsi files.
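
For illustration only, here's roughly what a GPU thermal zone could look
like once the zero-valued polling properties are dropped and a critical
trip is in place.  This is just a sketch: the labels, the sensor
reference and the temperature value are assumptions made for the
example, not the contents of the actual patches.

```dts
/* Sketch only; labels, sensor index and values are illustrative. */
gpu0_thermal: gpu0-thermal {
	/*
	 * polling-delay-passive and polling-delay are left out;
	 * when absent, they default to zero.
	 */
	thermal-sensors = <&ths 1>;	/* assumes a "ths" sensor node */

	trips {
		gpu0_crit: gpu0-crit {
			temperature = <110000>;	/* assumed value */
			hysteresis = <2000>;	/* assumed value */
			type = "critical";
		};
	};
};
```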

> That said, I'll take the opportunity to point out some inconsistencies
> in this DT that are not related to this change.
> 
> 1. There is a passive trip point and one cooling device mapped to it.
> With polling-delay-passive = <0>, the mitigation will fail.

Huh, how is the CPU throttling working then?  Thanks for pointing it
out, I'll address this issue in the follow-up patches.
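
One possible shape of that fix, purely as a sketch, would be to give the
CPU thermal zone a non-zero polling-delay-passive, so the zone keeps
being re-evaluated while the temperature is above the passive trip.  The
delay, the temperatures and the labels below are assumptions made for
the example, and none of it has been tested.

```dts
/* Sketch only; the delay, temperatures and labels are assumptions. */
#include <dt-bindings/thermal/thermal.h>

cpu_thermal: cpu0-thermal {
	polling-delay-passive = <100>;	/* assumed, not a tested value */
	thermal-sensors = <&ths 0>;	/* assumes a "ths" sensor node */

	trips {
		cpu_alert0: cpu-alert0 {
			temperature = <75000>;	/* assumed value */
			hysteresis = <2000>;	/* assumed value */
			type = "passive";
		};
	};

	cooling-maps {
		map0 {
			trip = <&cpu_alert0>;
			/* assumes a "cpu0" node that is a cooling device */
			cooling-device = <&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
		};
	};
};
```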

> 2. There is a second mapping for the hot trip point. That does not
> make sense, because there is no mitigation for 'hot' and 'critical'
> trip points.

Yup, I see no special handling of tz->ops.hot, so having the hot trip
point makes no sense.  Thanks again for pointing it out, I'll address
this issue in the follow-up patches as well.

I'll send the v2 soon, as a small patch series, and I'll send a few
follow-up patches later.
