Message-ID: <124e8c2a984d3c2775264fae85cd41b7@manjaro.org>
Date: Tue, 20 Aug 2024 05:26:15 +0200
From: Dragan Simic <dsimic@...jaro.org>
To: Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: Icenowy Zheng <uwu@...nowy.me>, linux-sunxi@...ts.linux.dev,
wens@...e.org, jernej.skrabec@...il.com, samuel@...lland.org,
linux-arm-kernel@...ts.infradead.org, devicetree@...r.kernel.org,
robh@...nel.org, krzk+dt@...nel.org, conor+dt@...nel.org,
linux-kernel@...r.kernel.org, wenst@...omium.org, broonie@...nel.org
Subject: Re: [PATCH] arm64: dts: allwinner: Add GPU thermal trips to the SoC
 dtsi for A64

Hello Daniel,

On 2024-08-19 17:42, Daniel Lezcano wrote:
> On 12/08/2024 04:46, Dragan Simic wrote:
>> On 2024-08-12 04:40, Icenowy Zheng wrote:
>>> On Monday, 2024-08-12 at 04:00 +0200, Dragan Simic wrote:
>>>> Add thermal trips for the two GPU thermal sensors found in the
>>>> Allwinner A64. There's only one GPU OPP defined since the commit
>>>> 1428f0c19f9c ("arm64: dts: allwinner: a64: Run GPU at 432 MHz"),
>>>> so defining only the critical thermal trips makes sense for the
>>>> A64's two GPU thermal zones.
>>>>
>>>> Having these critical thermal trips defined ensures that no hot
>>>> spots develop inside the SoC die that exceed the maximum junction
>>>> temperature. That might have been possible before, although quite
>>>> unlikely, because the CPU and GPU portions of the SoC are packed
>>>> closely inside the SoC, so the overheating GPU would inevitably
>>>> result in the heat soaking into the CPU portion of the SoC, causing
>>>> the CPU thermal sensor to return high readings and trigger the CPU
>>>> critical thermal trips. However, it's better not to rely on the
>>>> heat soak and have the critical GPU thermal trips properly defined
>>>> instead.
>>>>
>>>> While there, remove a few spotted comments that are rather
>>>> redundant, because it's pretty much obvious what units are used in
>>>> those places.
>>>
>>> This should be another individual patch, I think.
>>
>> Perhaps. I already thought about that, but it might also be best
>> to simply drop the removal of those redundant comments entirely.
>> Let's also see what other people say.
>>
>>>> Signed-off-by: Dragan Simic <dsimic@...jaro.org>
>>>> ---
>>>>  arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 22 ++++++++++++++-----
>>>>  1 file changed, 16 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
>>>> index e868ca5ae753..bc5d3a2e6c98 100644
>>>> --- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
>>>> +++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
>>>> @@ -212,7 +212,6 @@ timer {
>>>>
>>>> thermal-zones {
>>>> cpu_thermal: cpu0-thermal {
>>>> - /* milliseconds */
>>>
>>> The unit of a 0 isn't so obvious, I think, so I suggest keeping
>>> this.
>>
>> Quite frankly, I think it should be obvious to anyone tackling
>> the thermal zones and trips.
>
> You can also remove polling-delay-passive and polling-delay when
> they are equal to zero. If they are absent they will be set to zero by
> default.

Good point, thanks! Though, I'd rather leave those "... = <0>;"
removals for a small follow-up series, because those changes touch
more actual code than just the comments, so it's better to keep them
as separate changes for easier bisection later, if it's ever needed.
Hopefully never. :)

I just made a note for myself to create and submit those follow-up
cleanup patches later, for all affected Allwinner and Rockchip SoC
dtsi files.

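
To illustrate what such a cleanup might look like, here is a rough
sketch only. The zone name, trip label and temperature value are
made up for the example and aren't taken from any actual dtsi file,
and the "&ths 0" sensor reference is an assumption as well. It also
assumes, as Daniel describes above, that absent delays default to zero.

```dts
/* Hypothetical zone, for illustration only */
thermal-zones {
	example_thermal: example-thermal {
		/*
		 * Before the cleanup, these two lines would be present:
		 *
		 *   polling-delay-passive = <0>;
		 *   polling-delay = <0>;
		 *
		 * After the cleanup they are simply dropped, relying on
		 * the default of zero described above.
		 */
		thermal-sensors = <&ths 0>;	/* assumed sensor reference */

		trips {
			example_crit: example-crit {
				temperature = <110000>;	/* assumed value */
				hysteresis = <2000>;	/* assumed value */
				type = "critical";
			};
		};
	};
};
```
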
> That said, I take the opportunity to spot some inconsistency in this
> DT not related to this change.
>
> 1. There is a passive trip point and one cooling device mapped to it.
> With a polling-delay-passive=0, the mitigation will fail

Huh, how is the CPU throttling working then? Thanks for pointing it
out, I'll address this issue in the follow-up patches.

> 2. There is a second mapping for the hot trip point. That does not
> make sense, it is not possible because there is no mitigation for
> 'hot' and 'critical' trip points.

Yup, I see no special handling of tz->ops.hot, so having the hot trip
point makes no sense. Thanks again for pointing it out, I'll address
this issue in the follow-up patches as well.

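
As a rough sketch of the direction those two points suggest, and not
the actual follow-up patch, a zone could end up looking something like
the fragment below. The 250 ms delay, the trip labels, the temperatures
and the "&ths 0" and "&cpu0" references are all assumptions made up
for the example.

```dts
/* THERMAL_NO_LIMIT comes from <dt-bindings/thermal/thermal.h> */
cpu_thermal: cpu0-thermal {
	/* nonzero, so the passive trip gets polled; value is assumed */
	polling-delay-passive = <250>;
	thermal-sensors = <&ths 0>;	/* assumed sensor reference */

	cooling-maps {
		/* the only map, bound to the passive trip */
		map0 {
			trip = <&cpu_alert0>;
			cooling-device = <&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
		};
		/* no map for a hot or critical trip */
	};

	trips {
		cpu_alert0: cpu-alert0 {
			temperature = <75000>;	/* assumed value */
			hysteresis = <2000>;
			type = "passive";
		};

		cpu_crit: cpu-crit {
			temperature = <110000>;	/* assumed value */
			hysteresis = <2000>;
			type = "critical";
		};
	};
};
```
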
I'll send the v2 soon, as a small patch series, and I'll send a few
follow-up patches later.