[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <trinity-fc0082c7-cf37-4458-ac72-3c5774702f5d-1694625368681@3c-app-gmx-bs38>
Date: Wed, 13 Sep 2023 19:16:08 +0200
From: Frank Wunderlich <frank-w@...lic-files.de>
To: AngeloGioacchino Del Regno
<angelogioacchino.delregno@...labora.com>
Cc: Frank Wunderlich <linux@...web.de>,
linux-mediatek@...ts.infradead.org,
"Rafael J. Wysocki" <rafael@...nel.org>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Amit Kucheria <amitk@...nel.org>,
Zhang Rui <rui.zhang@...el.com>,
Rob Herring <robh+dt@...nel.org>,
Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
Conor Dooley <conor+dt@...nel.org>,
Matthias Brugger <matthias.bgg@...il.com>,
linux-pm@...r.kernel.org, devicetree@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
Daniel Golle <daniel@...rotopia.org>
Subject: Aw: Re: [RFC v1 3/3] thermal/drivers/mediatek/lvts_thermal: add
mt7988 support
Hi,
> Gesendet: Mittwoch, 13. September 2023 um 13:43 Uhr
> Von: "AngeloGioacchino Del Regno" <angelogioacchino.delregno@...labora.com>
> An: frank-w@...lic-files.de, "Frank Wunderlich" <linux@...web.de>, linux-mediatek@...ts.infradead.org
> Il 13/09/23 12:52, Frank Wunderlich ha scritto:
> > Am 13. September 2023 10:16:51 MESZ schrieb AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>:
> > Hi angelo,
> >
> > thanks for first look
> >
> >> Il 11/09/23 20:33, Frank Wunderlich ha scritto:
> >>> From: Frank Wunderlich <frank-w@...lic-files.de>
> >>>
> >>> Add Support for mediatek fologic 880/MT7988.
> >>>
> >>> Signed-off-by: Frank Wunderlich <frank-w@...lic-files.de>
> >>> ---
> >>> drivers/thermal/mediatek/lvts_thermal.c | 73 +++++++++++++++++++++++++
> >>> 1 file changed, 73 insertions(+)
> >>>
> >>> diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
> >>> index c1004b4da3b6..48b257a3c80e 100644
> >>> --- a/drivers/thermal/mediatek/lvts_thermal.c
> >>> +++ b/drivers/thermal/mediatek/lvts_thermal.c
> >>> @@ -82,6 +82,8 @@
> >>> #define LVTS_GOLDEN_TEMP_DEFAULT 50
> >>> #define LVTS_COEFF_A_MT8195 -250460
> >>> #define LVTS_COEFF_B_MT8195 250460
> >>> +#define LVTS_COEFF_A_MT7988 -204650
> >>> +#define LVTS_COEFF_B_MT7988 204650
> >>> #define LVTS_MSR_IMMEDIATE_MODE 0
> >>> #define LVTS_MSR_FILTERED_MODE 1
> >>> @@ -1272,6 +1274,67 @@ static int lvts_remove(struct platform_device *pdev)
> >>> return 0;
> >>> }
> >>> +/*
> >>> + * LVTS MT7988
> >>> + */
> >>> +#define LVTS_HW_SHUTDOWN_MT7988 117000
> >>
> >> Are you sure that this chip's Tj is >117°C ?!
> >>
> >> Looks a bit high... if it is exactly 117°C, I would suggest cutting earlier,
> >> either at 110 (safe side) or 115: after all, this is a life-saver feature and
> >> the chip is actually never meant to *constantly* work at 110°C (as it would
> >> degrade fast and say goodbye earlier than "planned").
> >
> > I took values from SDK
> >
> > https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/refs/heads/master/target/linux/mediatek/files-5.4/drivers/thermal/mediatek/soc_temp_lvts.c#1483
> >
>
> That kernel also defines 117°C for MT8195, which leaves me a bit dubious.
>
> For safety, I would recommend using the same 105°C hw shutdown temperature
> for 7988 as well: after all if you think about it, reaching that kind of
> temperature means that there's a real problem (fan stopped working and/or
> heatsink detached) that needs attention.
ack, will change to same value and put the define on top, abve the mt8195
> This is because you'll have trip points in devicetree that should throttle
> the CPU in case it reaches at least a dangerously high temperature (read:
> a temperature outside the recommended continuous operation range), bringing
> the temperatures down because of the throttling action; I would imagine
> throttling the CPU a bit down at 80°C, then a bit more at 90°C - but then,
> if the temps won't drop like that, there's clearly a HW-related issue that
> must be addressed (like the fan/heatsink scenario that I just described).
>
> Though, take this as an advice; I'll respect whichever decision you'll take.
>
> >>> +//enum mt7988_lvts_domain { MT7988_AP_DOMAIN, MT7988_NUM_DOMAIN };
> >>> +
> >>> +enum mt7988_lvts_sensor_enum {
> >>> + MT7988_TS3_0,
> >>> + MT7988_TS3_1,
> >>> + MT7988_TS3_2,
> >>> + MT7988_TS3_3,
> >>> + MT7988_TS4_0,
> >>> + MT7988_TS4_1,
> >>> + MT7988_TS4_2,
> >>> + MT7988_TS4_3,
> >>> + MT7988_NUM_TS
> >>> +};
> >
> >> This enumeration should be definitions in bindings (mediatek,lvts-thermal.h).
> >>
> >> Besides, the LVTS is about internal temperatures, so those TS3_x and 4_x can
> >> be renamed like what was done for MT8192 and MT8195: this is because you will
> >> never see TS3_2 being CPU2 on a board and CPU4 on another, being those - again -
> >> internal to the SoC, hence unchangeable.
> >
> > Right these sensors are internally only and i took naming from sdk to avoid confusion. And i have not more information about these internal sensors (special meaning),but their values are packed together to get the resulting (average) temperature.
> >
> >> Another reason is that you'll anyway have to refer to those sensors in the
> >> devicetree to configure thermal trips and such, so... :-)
> >
> > In device tree it will look like this:
> >
> > https://github.com/frank-w/BPI-Router-Linux/blob/6.5-lvts/arch/arm64/boot/dts/mediatek/mt7988a.dtsi#L771
> >
> > Daniel has also defined thermal trips there,but these are untested atm. I only verified temperature itself i get from sysfs as far as i can (start at ~40°C and reaching ~70 while running).
> >
>
> Check how it's done in mt8192.dtsi and mt8195.dtsi: we're using the definitions
> from the bindings for thermal zones.
> At least for consistency (apart from other even more valid considerations), that's
> how it should be done: please do it like that.
>
> Besides, I think that you can definitely guess what `TS` is CPU related: checking
> the driver and devicetree from the downstream kernel that you provided, what I
> understand is:
>
> 1. Your naming TS3,4 corresponds to TS2,3 downstream (so it's wrong?)
> 2. Downstream TS2 is related to the CPU cores, so it should be
> - TS2_0 - CPU0
> - TS2_1 - CPU1
> - TS2_2 - CPU2
> - TS2_3 - CPU3
i took the names from efuse names (which are listed as comment above), had 2/3 before.
need to ask mtk here. same for the second controller and if it is really a "lvts-ap".
> The other set of thermal sensors seem to be completely unused, so we cannot guess
> just by looking at the code. Since you have the hardware, you may be able to take
> a position about what they are by checking what sensor heats up for each "action"
> (could be ethernet controller or infra or whatever else); if in doubt, just leave
> them out, but add a comment saying that there are more sensors and say where, so
> that if anyone finds what they're for, it'll be easy to add them.
>
> In any case, adding thermal sensors that you don't know about is meaningless, as
> the sense of all this is:
> - Monitoring temperatures of the hardware
> - Taking action to prevent temperature overrun
> but if you don't know what you're reading, you can't interpret temperatures for
> something unknown, hence you can't as well take action to prevent overruns, as
> you won't know what's the maximum temperature for "unknown thing" :-)
yes i have only the downstream kernel and some additional information like interupt
number and the efuse offsets for calibration. And for the 8 sensors i get this information:
"8 sensors are distributed across the whole soc in order to get the correct average temperature.
chip designer didn’t disclose more detailed info to us."
so i thought all 8 sensors are taken into account with both controllers to calculate 1
resulting temperature. In downstream i saw also only 1 controller in dts and i have
only 1 interrupt.
https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/refs/heads/master/target/linux/mediatek/files-5.4/arch/arm64/boot/dts/mediatek/mt7988.dtsi#579
and the compatible for both (mt7988 and mt8195) are without "ap" there, so i took the same.
Where have you seen the mt7988-AP ?
i have only seen 1 sensor which should be the SOC itself and this chip has metal surface
which makes it hard to get real temperature (infrared thermometer gives wrong temp).
> Regards,
> Angelo
Powered by blists - more mailing lists