Message-ID: <CAHCN7xKfeh-cqJVfbW_km27cgee2MEBdPM3edACRi0fCaohxvw@mail.gmail.com>
Date: Fri, 13 Sep 2019 10:01:55 -0500
From: Adam Ford <aford173@...il.com>
To: "H. Nikolaus Schaller" <hns@...delico.com>
Cc: Linux-OMAP <linux-omap@...r.kernel.org>,
Tony Lindgren <tony@...mide.com>,
André Roth <neolynx@...il.com>,
Discussions about the Letux Kernel
<letux-kernel@...nphoenux.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andreas Kemnade <andreas@...nade.info>,
Nishanth Menon <nm@...com>, Adam Ford <adam.ford@...icpd.com>,
kernel@...a-handheld.com
Subject: Re: [RFC] ARM: dts: omap36xx: Enable thermal throttling
On Fri, Sep 13, 2019 at 9:24 AM H. Nikolaus Schaller <hns@...delico.com> wrote:
>
>
> > On 13.09.2019 at 16:05, Adam Ford <aford173@...il.com> wrote:
> >
> > On Fri, Sep 13, 2019 at 8:32 AM H. Nikolaus Schaller <hns@...delico.com> wrote:
> >>
> >> Hi Adam,
> >>
> >>> On 13.09.2019 at 13:07, Adam Ford <aford173@...il.com> wrote:
> >>
> >>>>> + cpu_cooling_maps: cooling-maps {
> >>>>> + map0 {
> >>>>> + trip = <&cpu_alert0>;
> >>>>> + /* Only allow OPP50 and OPP100 */
> >>>>> + cooling-device = <&cpu 0 1>;
> >>>>
> >>>> omap4-cpu-thermal.dtsi uses THERMAL_NO_LIMIT constants but I do not
> >>>> understand their meaning (and how it relates to the opp list).
> >>>
> >>> I read through the documentation, but it wasn't completely clear to
> >>> me. AFAICT, the numbers after &cpu represent the min and max index in
> >>> the OPP table when the condition is hit.
> >>
> >> Ok. It seems to use "cooling state" for those and the first is minimum
> >> and the last is maximum. Using THERMAL_NO_LIMIT (-1UL) means to have
> >> no limits.
> >>
> >> Since here we use the &cpu node it is likely that the "cooling state"
> >> is the same as the OPP index currently in use.
> >>
> >> I have looked through the .dts files which use cpu_crit and the picture is
> >> not consistent...
> >>
> >> omap4 seems to only define it
> >> am57xx has two different grade dtsi files
> >> dra7 overwrites critical temperature value
> >> am57xx-beagle defines a gpio to control a fan
> >
I am going to push a separate but related RFC with two patches in the
series. The first patch will set up the alerts and maps, without any
throttling, for all omap3 variants. The second patch will consolidate the
thermal references into omap3.dtsi so omap34, omap36 and am35 can all use
them without duplicating the entries.
That will make the omap36xx changes simpler to manage, because we can
just modify a portion of the entries instead of carrying the whole
table.
Once this parallel RFC gets comments/feedback, I'll re-integrate the
omap36xx throttling.
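To illustrate the idea, here is a rough sketch of how the split could
look. This is not the pending patch: the node layout, the temperatures
and the limits in the shared map are placeholders I made up for
illustration. Only the cpu_alert0 and cpu_cooling_maps labels and the
<&cpu 0 1> limits come from the RFC quoted above.

```dts
#include <dt-bindings/thermal/thermal.h>

/* Shared file (assumed: omap3.dtsi): trips and maps defined once */
cpu_thermal: cpu_thermal {
	cpu_trips: trips {
		cpu_alert0: cpu_alert {
			temperature = <80000>;	/* millicelsius, placeholder */
			hysteresis = <2000>;	/* millicelsius, placeholder */
			type = "passive";
		};
		cpu_crit: cpu_crit {
			temperature = <90000>;	/* millicelsius, placeholder */
			hysteresis = <2000>;	/* millicelsius, placeholder */
			type = "critical";
		};
	};

	cpu_cooling_maps: cooling-maps {
		map0 {
			trip = <&cpu_alert0>;
			/* placeholder limits, not the values from the patch */
			cooling-device = <&cpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
		};
	};
};

/* SoC file (assumed: omap36xx.dtsi): override only what differs */
&cpu_cooling_maps {
	map0 {
		/* limits from the RFC, still under discussion */
		cooling-device = <&cpu 0 1>;
	};
};
```

Because the maps carry a label, the SoC file only has to restate the
one property it changes; the trips and the rest of the zone are
inherited from the shared file.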
adam
> > Check out rk3288-veyron-mickey.dts
> >
> > They have almost_warm, warm, almost_hot, hot, hotter, very_hot, and
> > critical for trips, and they have as many corresponding cooling maps
> > which appear to limit the CPU speeds, but their index references are
> > still confusing to me.
>
> Seems to be quite sophisticated.
>
> The arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi also has a lot
> of trip points. So there may be very different needs...
>
> But it has potentially helpful comments...
>
> /*
>  * When reaching cpu0_alert3, reduce CPU
>  * by 2 steps. On Exynos5422/5800 that would
>  * be: 1600 MHz and 1100 MHz.
>  */
> map3 {
> 	trip = <&cpu0_alert3>;
> 	cooling-device = <&cpu0 0 2>;
> };
> map4 {
> 	trip = <&cpu0_alert3>;
> 	cooling-device = <&cpu4 0 2>;
> };
> /*
>  * When reaching cpu0_alert4, reduce CPU
>  * further, down to 600 MHz (12 steps for big,
>  * 7 steps for LITTLE).
>  */
> map5 {
> 	trip = <&cpu0_alert4>;
> 	cooling-device = <&cpu0 3 7>;
> };
> map6 {
> 	trip = <&cpu0_alert4>;
> 	cooling-device = <&cpu4 3 12>;
> };
>
> That would mean the second integer is something about how
> many steps to reduce.
>
> But the first is not explained.
>
> BTW: this also demonstrates how a single trip point can map to multiple
> cooling-device actions (something we likely do not need).
>
> >
> > For that device,
> > warm uses no limit first, then 4: cooling-device = <&cpu0 THERMAL_NO_LIMIT 4>
> > ...
> > very_hot uses a number, then no limit:
> > cooling-device = <&cpu0 8 THERMAL_NO_LIMIT>
> >
> > This makes me wonder if the min and max are switched or the index
> > values go backwards.
>
> It may depend on the specific cpu driver? Maybe omap, rk and exynos even
> have different interpretations in code?
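For what it's worth, here is how I currently read the generic thermal
binding. This is my understanding and not checked against each
platform's driver. The two cells after the phandle are the minimum and
maximum cooling state for that map, and a larger state means more
cooling. For a cpufreq-based cooling device, state 0 should be the
unthrottled (highest) frequency and each higher state is one step
further down. THERMAL_NO_LIMIT in either cell means "use the device's
own bound". The trip labels below are hypothetical stand-ins for the
rk3288 ones.

```dts
#include <dt-bindings/thermal/thermal.h>	/* THERMAL_NO_LIMIT */

cooling-maps {
	map0 {
		trip = <&cpu_warm>;	/* hypothetical label */
		/* min: device default, max: state 4 (at most 4 steps down) */
		cooling-device = <&cpu0 THERMAL_NO_LIMIT 4>;
	};
	map1 {
		trip = <&cpu_very_hot>;	/* hypothetical label */
		/* min: state 8 (at least 8 steps down), max: device default */
		cooling-device = <&cpu0 8 THERMAL_NO_LIMIT>;
	};
};
```

Read that way, the rk3288 maps are not reversed: the cooler trips cap
how far the CPU may be throttled, and the hotter trips set a floor on
how far it must be throttled. It also matches the Exynos comments,
where <&cpu0 0 2> is described as reducing by up to 2 steps.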
>
> >>
> >> Then we can use the data sheet limits of 90°C and 105°C in the trip point
> >> table (which should not be tweaked for sensor inaccuracy).
> >
> > I can see not compensating if it reads high, but if the temp reads
> > low, shouldn't we compensate so we don't over-temp the processor?
>
> I just mean that we must ensure that the TJ is <= 90° if the bandgap
> ever reports 90°. So it may report 10 or 20 or even 30 degrees more than the
> real temperature but never less (reaching the critical temperature too early
> but not too late).
>
> We can achieve that by adding bias or changing slope etc. in the bandgap sensor
> driver.
>
> If I find some time I am curious enough to look into the code and the data
> sheets to understand why it is said to be inaccurate... Maybe there is
> jitter from some LDO and it needs a median filter?
>
> And why it seems to add a bias of ca. 10° as soon as I read it more than
> once. And how well the reading correlates to ambient temperature
> (it definitely correlates to cpufreq-set -f).
>
> But we should not modify the trip temperatures by 10 or 20 or 30 degrees.
> IMHO they should have the values defined by the data sheet.
>
> BR,
> Nikolaus
>