Message-ID: <Yg2WSys4uxONzSSl@google.com>
Date:   Wed, 16 Feb 2022 16:26:51 -0800
From:   Matthias Kaehlcke <mka@...omium.org>
To:     Lukasz Luba <lukasz.luba@....com>
Cc:     Doug Anderson <dianders@...omium.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux PM <linux-pm@...r.kernel.org>,
        amit daniel kachhap <amit.kachhap@...il.com>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Amit Kucheria <amitk@...nel.org>,
        Zhang Rui <rui.zhang@...el.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Pierre.Gondois@....com, Stephen Boyd <swboyd@...omium.org>,
        Rajendra Nayak <rnayak@...eaurora.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>
Subject: Re: [PATCH 1/2] thermal: cooling: Check Energy Model type in
 cpufreq_cooling and devfreq_cooling

On Wed, Feb 16, 2022 at 10:43:34PM +0000, Lukasz Luba wrote:
> 
> 
> On 2/16/22 10:13 PM, Matthias Kaehlcke wrote:
> > On Wed, Feb 16, 2022 at 09:33:50AM -0800, Doug Anderson wrote:
> > > Hi,
> > > 
> > > On Wed, Feb 16, 2022 at 7:35 AM Lukasz Luba <lukasz.luba@....com> wrote:
> > > > 
> > > > Hi Matthias,
> > > > 
> > > > On 2/9/22 10:17 PM, Matthias Kaehlcke wrote:
> > > > > On Wed, Feb 09, 2022 at 11:16:36AM +0000, Lukasz Luba wrote:
> > > > > > 
> > > > > > 
> > > > > > On 2/8/22 5:25 PM, Matthias Kaehlcke wrote:
> > > > > > > On Tue, Feb 08, 2022 at 09:32:28AM +0000, Lukasz Luba wrote:
> > > > > > > > 
> > > > > > > > 
> > > > 
> > > > [snip]
> > > > 
> > > > > > > > Could you point me to those devices please?
> > > > > > > 
> > > > > > > arch/arm64/boot/dts/qcom/sc7180-trogdor-*
> > > > > > > 
> > > > > > > Though as per above they shouldn't be impacted by your change, since the
> > > > > > > CPUs always pretend to use milli-Watts.
> > > > > > > 
> > > > > > > [skipped some questions/answers since sc7180 isn't actually impacted by
> > > > > > >     the change]
> > > > > > 
> > > > > > Thank you Matthias. I will investigate your setup to get better
> > > > > > understanding.
> > > > > 
> > > > > Thanks!
> > > > > 
> > > > 
> > > > I've checked those DT files and related code.
> > > > As you already said, this patch is safe for them.
> > > > So we can apply it IMO.
> > > > 
> > > > 
> > > > -------------Off-topic------------------
> > > > Not in $subject comments:
> > > > 
> > > > AFAICS based on two files which define thermal zones:
> > > > sc7180-trogdor-homestar.dtsi
> > > > sc7180-trogdor-coachz.dtsi
> > > > 
> > > > only the 'big' cores are used as cooling devices in the
> > > > 'skin_temp_thermal' - the CPU6 and CPU7.
> > > > 
> > > > I assume you don't want to model at all the power usage
> > > > from the Little cluster (which is quite big: 6 CPUs), do you?
> > > > I can see that the Little CPUs have small dyn-power-coeff
> > > > ~30% of the big and lower max freq, but still might be worth
> > > > to add them to IPA. You might give them more 'weight', to
> > > > make sure they receive more power during power split.
> > 
> > In experiments we saw that including the little cores as cooling
> > devices for 'skin_temp_thermal' didn't have a significant impact on
> > thermals, so we left them out.
> > 
> > > > You also don't have GPU cooling device in that thermal zone.
> > > > Based on my experience if your GPU is a power hungry one,
> > > > e.g. 2-4Watts, you might get better results when you model
> > > > this 'hot' device (which impacts your temp sensor reported value).
> > > 
> > > I think the two boards you point at (homestar and coachz) are just the
> > > two that override the default defined in the SoC dtsi file. If you
> > > look in sc7180.dtsi you'll see 'gpuss1-thermal' which has a cooling
> > > map. You can also see the cooling maps for the littles.
> > 
> > Yep, plus thermal zones with cooling maps for the big cores.
> > 
> > > I guess we don't have a `dynamic-power-coefficient` for the GPU,
> > > though? Seems like we should, but I haven't dug through all the code
> > > here...
> > 
> > To my knowledge the SC7x80 GPU doesn't register an energy model, which is
> > one of the reasons the GPU wasn't included as cooling device for
> > 'skin_temp_thermal'.
> > 
> 
> You can give it a try by editing the DT and adding in the
> GPU node the 'dynamic-power-coefficient' + probably
> small modification in the driver code.
> 
> If the GPU driver registers the cooling device in the new way, you
> would also get EM registered thanks to the devfreq cooling new code
> (commit: 84e0d87c9944eb36ae6037a).
> 
> You can check an example from Panfrost GPU driver [1].

Ah, I missed that, thanks for the pointer!

> I can see some upstream MSM GPU driver, but I don't know if that is
> your GPU driver. It registers the 'old' way the devfreq cooling [2]
> but it would be easy to change to use the new function.
> The GPU driver would use the same dev_pm_opp_of_register_em() as
> your CPUs do, so EM would be in 'milli-Watts' (so should be fine).

Yep, that's what we are using.
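
For reference, a rough sketch of how a 'milli-Watts' Energy Model entry
could be derived from a 'dynamic-power-coefficient'. It assumes the usual
dynamic power relation P = C * f * V^2, with the coefficient in
uW/MHz/V^2 as the DT binding describes it. The function name and the
OPP values below are hypothetical, picked only for illustration; this is
not the kernel implementation.

```python
def em_power_mw(coeff_uw_per_mhz_v2: int, freq_mhz: int, volt_mv: int) -> int:
    """Estimate dynamic power in milli-Watts for one operating point.

    Assumption: P = C * f * V^2 with C in uW/MHz/V^2, f in MHz, V in mV.
    Squaring milli-Volts adds a factor of 1e6 and uW -> mW adds 1e3,
    hence the division by 1e9. Integer maths, rounding down.
    """
    return (coeff_uw_per_mhz_v2 * freq_mhz * volt_mv * volt_mv) // 1_000_000_000


# Hypothetical OPP: coefficient 100, 1000 MHz at 800 mV.
print(em_power_mw(100, 1000, 800))  # prints 64
```

With a per-OPP table built this way, the GPU would report power on the
same milli-Watts scale as the CPUs, which is what the EM type check in
this patch looks at.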
