Date:   Tue, 8 Feb 2022 09:32:28 +0000
From:   Lukasz Luba <lukasz.luba@....com>
To:     Matthias Kaehlcke <mka@...omium.org>
Cc:     linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        amit.kachhap@...il.com, daniel.lezcano@...aro.org,
        viresh.kumar@...aro.org, rafael@...nel.org, amitk@...nel.org,
        rui.zhang@...el.com, dietmar.eggemann@....com,
        Pierre.Gondois@....com, Douglas Anderson <dianders@...omium.org>,
        Stephen Boyd <swboyd@...omium.org>,
        Rajendra Nayak <rnayak@...eaurora.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>
Subject: Re: [PATCH 1/2] thermal: cooling: Check Energy Model type in
 cpufreq_cooling and devfreq_cooling



On 2/8/22 12:50 AM, Matthias Kaehlcke wrote:
> On Mon, Feb 07, 2022 at 07:30:35AM +0000, Lukasz Luba wrote:
>> The Energy Model supports power values either in Watts or in some abstract
>> scale. When the 2nd option is in use, the thermal governor IPA should not
>> be allowed to operate, since the relation between cooling devices is not
>> properly defined. Thus, it might be possible that big GPU has lower power
>> values in abstract scale than a Little CPU. To mitigate a misbehaviour
>> of the thermal control algorithm, simply not register a cooling device
>> capable of working with IPA.
> 
> Ugh, this would break thermal throttling for existing devices that are
> currently supported in the upstream kernel.

Could you point me to those devices? I cannot find them in the mainline
DT. No GPU devices register an Energy Model (EM) upstream, either via DT
(which would give power in mW) or by explicitly providing an EM
get_power() callback. An EM is needed for IPA to work.

Please clarify which existing devices are going to be broken with this
change.

> 
> Wasn't the conclusion that it is the responsibility of the device tree
> owners to ensure that cooling devices with different scales aren't used
> in the same thermal zone?

That is based on the assumption that someone has the DT and control over
it. There was also an implicit assumption that IPA would work properly on
such a platform, but it won't.

1. You cannot have 2 thermal zones, one with the CPUs and the other with
the GPU only, both working with two instances of IPA.

2. The abstract power scale doesn't guarantee anything about power values,
and IPA was simply designed with milli-Watts in mind. So even working
on CPUs only, using bogoWatts, is not something we could guarantee in IPA.
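
As a rough illustration of the scale problem (a toy Python sketch, not
kernel code: the function name, the device names and all numbers are made
up, and the plain proportional split is only a simplified model of how a
power budget gets divided between cooling devices):

```python
def split_budget(budget, requests):
    """Split a power budget proportionally to each device's requested power.

    Simplified model for illustration only; it ignores per-device caps and
    any redistribution of unused power.
    """
    total = sum(requests.values())
    return {name: budget * req / total for name, req in requests.items()}


# Hypothetical real power in milli-Watts: the GPU needs far more than the CPU.
real_mw = {"little_cpu": 500, "big_gpu": 3000}

# Same hypothetical devices, but the GPU's power values come from an
# unrelated abstract scale, so they end up lower than the little CPU's.
mixed_scale = {"little_cpu": 500, "big_gpu": 100}

budget = 2100  # arbitrary budget, in the units the algorithm assumes are mW

granted_real = split_budget(budget, real_mw)
granted_mixed = split_budget(budget, mixed_scale)

print("real mW:    ", granted_real)
print("mixed scale:", granted_mixed)
```

With consistent milli-Watt values the GPU gets most of the budget. With
the mixed scales the little CPU gets most of it, although nothing about
the hardware has changed.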

> 
> That's also what's currently specified in the power allocator
> documentation:
> 
>    Another important thing is the consistent scale of the power values
>    provided by the cooling devices. All of the cooling devices in a single
>    thermal zone should have power values reported either in milli-Watts
>    or scaled to the same 'abstract scale'.

This can change. We recently removed the userspace governor from the
kernel. The trend is to implement thermal policy in FW. Dealing with
some intermediate configurations complicates the design, and supporting
the algorithm logic also becomes more complex.

> 
> Which was actually added by yourself:
> 
> commit 5a64f775691647c242aa40d34f3512e7b179a921
> Author: Lukasz Luba <lukasz.luba@....com>
> Date:   Tue Nov 3 09:05:58 2020 +0000
> 
>      PM: EM: Clarify abstract scale usage for power values in Energy Model
> 
>      The Energy Model (EM) can store power values in milli-Watts or in abstract
>      scale. This might cause issues in the subsystems which use the EM for
>      estimating the device power, such as:
> 
>       - mixing of different scales in a subsystem which uses multiple
>         (cooling) devices (e.g. thermal Intelligent Power Allocation (IPA))
> 
>       - assuming that energy [milli-Joules] can be derived from the EM power
>         values which might not be possible since the power scale doesn't have
>         to be in milli-Watts
> 
>      To avoid misconfiguration add the requisite documentation to the EM and
>      related subsystems: EAS and IPA.
> 
>      Signed-off-by: Lukasz Luba <lukasz.luba@....com>
>      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> 
> 
> It's ugly to have the abstract scales in the first place, but that's
> unfortunately what we currently have for at least some cooling devices.

A few questions:
Do you use 'we' as Chrome engineers?
Could you point me to those devices please?
Are they new, or old platforms which just need maintenance?
How does IPA work for you in such a real platform configuration?
If possible, could you share some plots of temperature,
frequency and CPU and GPU utilization?
Do you maybe know how the real power was scaled for them?

It would help me understand and judge.

> 
> IMO it would be preferable to stick to catching incompliant configurations
> in reviews, rather than breaking thermal throttling for existing devices
> with configurations that comply with the current documentation.
> 

Without access to the source code of those devices, it's hard for me to
see if they are broken.

Regards,
Lukasz
