linux-kernel - Re: [PATCH V4] thermal/core/power_allocator: reset thermal governor when trip point is changed

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <86da9945-04d5-047a-cb2d-5fb63737839f@arm.com>
Date:   Fri, 23 Jun 2023 18:34:31 +0100
From:   Lukasz Luba <lukasz.luba@....com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Di Shen <di.shen@...soc.com>
Cc:     daniel.lezcano@...aro.org, rui.zhang@...el.com, amitk@...nel.org,
        linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
        xuewen.yan@...soc.com, jeson.gao@...soc.com, zhanglyra@...il.com,
        orsonzhai@...il.com
Subject: Re: [PATCH V4] thermal/core/power_allocator: reset thermal governor
 when trip point is changed



On 6/23/23 17:55, Rafael J. Wysocki wrote:
> On Fri, Jun 23, 2023 at 9:43 AM Lukasz Luba <lukasz.luba@....com> wrote:
>>
>>
>>

[snip]

>>
>> I agree, the patch header doesn't explain that properly. Here is the
>> explanation for this Intelligent Power Allocator (IPA):
>>
>> The IPA controls temperature using PID mechanism. It's a closed
>> feedback loop. That algorithm can 'learn' from the 'observed'
>> in the past reaction for it's control decisions and accumulates that
>> information in the part called 'error integral'. Those accumulated
>> 'error' gaps are the differences between the set target value and the
>> actually achieved value. In our case the target value is the target
>> temperature which is coming from the trip point. That part is then used
>> with the 'I' (of PID) component, so we can compensate for those
>> 'learned' mistakes.
>> Now, when you change the target temperature value - all your previous
>> learned errors won't help you. That's why Intelligent Power Allocator
>> should reset previously accumulated history.
> 
> Right.
> 
> And every other governor using information from the past for control
> will have an analogous problem, won't it?

Not necessarily, but to play safe I would go case-by-case and make
sure other governors are aligned to this new feature.

E.g. the bang-bang governor operates only on current temperature and
current trip value + trip hysteresis. The flow graph describes it [1].
The control (state of the fan: ON or OFF) of that governor could be
simply adjusted to the new reality -> new trip point temp. That would
just mean 'toggling' the fan if needed. There are only 2 'target'
states: 0 or 1 for the fan. You can images a situation when the
temperature doesn't change, but we manipulate the trip value for that
governor. The governor would react correctly always in such situation
w/o a need of a reset IMO.

> 
>>>
>>>>>
>>>>>> For the 2nd case IIUC the code, we pass the 'trip.temperature'
>>>>>> and should be ready for what you said (modification of that value).
>>>>>
>>>>> Generally speaking, it needs to be prepared for a simultaneous change
>>>>> of multiple trip points (including active), in which case it may not
>>>>> be useful to invoke the ->reset() callback for each of them
>>>>> individually.
>>>>
>>>> Although, that looks more cleaner IMO. Resetting one by one in
>>>> a temperature order would be easily maintainable, won't be?
>>>
>>> I wouldn't call it maintainable really.
>>>
>>> First of all, the trips may not be ordered.  There are no guarantees
>>> whatsoever that they will be ordered, so the caller may have to
>>> determine the temperature order in the first place.  This would be an
>>> extra requirement that currently is not there.
>>>
>>> Apart from this, I don't see any fundamental difference between the
>>> case when trip points are updated via sysfs and when they are updated
>>> by the driver.  The governor should reset itself in any of those cases
>>> and even if one trip point changes, the temperature order of all of
>>> them may change, so the governor reset mechanism should be able to
>>> handle the case when multiple trip points are updated at the same
>>> time.  While you may argue that this is theoretical, the ACPI spec
>>> clearly states that this is allowed to happen, for example.
>>>
>>> If you want a generic reset callback for governors, that's fine, but
>>> make it generic and not specific to a particular use case.
>>
>> I think we agree here, but probably having slightly different
>> implementation in mind. Based on you explanation I think you
>> want simply this API:
>> void (*reset)(struct thermal_zone_device *tz);
>>
>> 1. no return value
>> 2. no specific trip ID as argument
>>
>> Do you agree?
> 
> Yes, I do.

OK, thanks.

Di could you implement that 'reset()' API according to this description,
please?

> 
>> IMO such implementation and API would also work for this IPA
>> purpose. Would that work for the ACPI use case as well?
> 
> It would AFAICS.

Thanks Rafael for the comments and the progress that we made :)

Regards,
Lukasz

[1] 
https://elixir.bootlin.com/linux/v6.3/source/drivers/thermal/gov_bang_bang.c#L80