lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 20 Jun 2023 12:56:52 +0100
From:   Lukasz Luba <lukasz.luba@....com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     daniel.lezcano@...aro.org, rui.zhang@...el.com,
        Di Shen <di.shen@...soc.com>, amitk@...nel.org,
        linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
        xuewen.yan@...soc.com, jeson.gao@...soc.com, zhanglyra@...il.com,
        orsonzhai@...il.com
Subject: Re: [PATCH V4] thermal/core/power_allocator: reset thermal governor
 when trip point is changed



On 6/20/23 11:39, Rafael J. Wysocki wrote:
> On Tue, Jun 20, 2023 at 12:19 PM Lukasz Luba <lukasz.luba@....com> wrote:
>>
>> Hi Rafael,
>>
>>
>> On 6/20/23 11:07, Rafael J. Wysocki wrote:
>>> On Tue, Jun 20, 2023 at 11:46 AM Rafael J. Wysocki <rafael@...nel.org> wrote:
>>>>
>>>> On Mon, Jun 19, 2023 at 8:36 AM Di Shen <di.shen@...soc.com> wrote:
>>>>>
>>>>> When the thermal trip point is changed, the governor should
>>>>> be reset so that the policy algorithm be updated to adapt to the
>>>>> new trip point.
>>>>>
>>>>> This patch adds an ops for thermal the governor structure to reset
>>>>> the governor. The ops is called when the trip point is changed.
>>>>> For power allocator, the parameters of pid controller and the states
>>>>> of power cooling devices can be reset when the passive trip point
>>>>> is changed.
>>>>>
>>>>> Signed-off-by: Di Shen <di.shen@...soc.com>
>>>>>
>>>>> ---
>>>>> V4:
>>>>> - Compared to V3, handle it in thermal core instead of in governor.
>>>>>
>>>>> - Add an ops to the governor structure, and call it when a trip
>>>>>     point is changed.
>>>>>
>>>>> - Define reset ops for power allocator.
>>>>>
>>>>> V3:
>>>>> - Add fix tag.
>>>>>
>>>>> V2:
>>>>> - Compared to v1, do not revert.
>>>>>
>>>>> - Add a variable(last_switch_on_temp) in power_allocator_params
>>>>>     to record the last switch_on_temp value.
>>>>>
>>>>> - Adds a function to renew the update flag and update the
>>>>>     last_switch_on_temp when thermal trips are writable.
>>>>>
>>>>> V1:
>>>>> - Revert commit 0952177f2a1f.
>>>>> ---
>>>>> ---
>>>>>    drivers/thermal/gov_power_allocator.c | 21 +++++++++++++++++++++
>>>>>    drivers/thermal/thermal_trip.c        |  6 ++++++
>>>>>    include/linux/thermal.h               |  1 +
>>>>>    3 files changed, 28 insertions(+)
>>>>>
>>>>> diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
>>>>> index 8642f1096b91..41d155adc616 100644
>>>>> --- a/drivers/thermal/gov_power_allocator.c
>>>>> +++ b/drivers/thermal/gov_power_allocator.c
>>>>> @@ -729,10 +729,31 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip_id)
>>>>>           return allocate_power(tz, trip.temperature);
>>>>>    }
>>>>>
>>>>> +static int power_allocator_reset(struct thermal_zone_device *tz, int trip_id)
>>>>> +{
>>>>> +       int ret = 0;
>>>>> +       struct thermal_trip trip;
>>>>> +       struct power_allocator_params *params = tz->governor_data;
>>>>> +
>>>>> +       ret = __thermal_zone_get_trip(tz, trip_id, &trip);
>>>>> +       if (ret)
>>>>> +               return ret;
>>>>> +
>>>>> +       /* Only need reset for passive trips */
>>>>> +       if (trip.type != THERMAL_TRIP_PASSIVE)
>>>>> +               return -EINVAL;
>>>>> +
>>>>> +       reset_pid_controller(params);
>>>>> +       allow_maximum_power(tz, true);
>>>>> +
>>>>> +       return ret;
>>>>> +}
>>>>> +
>>>>>    static struct thermal_governor thermal_gov_power_allocator = {
>>>>>           .name           = "power_allocator",
>>>>>           .bind_to_tz     = power_allocator_bind,
>>>>>           .unbind_from_tz = power_allocator_unbind,
>>>>>           .throttle       = power_allocator_throttle,
>>>>> +       .reset          = power_allocator_reset,
>>>>>    };
>>>>>    THERMAL_GOVERNOR_DECLARE(thermal_gov_power_allocator);
>>>>> diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c
>>>>> index 907f3a4d7bc8..52eb768fada8 100644
>>>>> --- a/drivers/thermal/thermal_trip.c
>>>>> +++ b/drivers/thermal/thermal_trip.c
>>>>> @@ -173,6 +173,12 @@ int thermal_zone_set_trip(struct thermal_zone_device *tz, int trip_id,
>>>>>           if (tz->trips && (t.temperature != trip->temperature || t.hysteresis != trip->hysteresis))
>>>>>                   tz->trips[trip_id] = *trip;
>>>>>
>>>>> +       if (t.temperature != trip->temperature && tz->governor && tz->governor->reset) {
>>>>> +               ret = tz->governor->reset(tz, trip_id);
>>>>> +               if (ret)
>>>>> +                       pr_warn_once("Failed to reset thermal governor\n");
>>>>
>>>> I'm not really sure if it is useful to print this message here.
>>>>
>>>> First off, the governors may print more precise diagnostic messages if
>>>> they care.
>>>>
>>>> Second, what is the sysadmin supposed to do in response to this message?
>>>
>>> In addition to the above, trip point temperatures may be updated in
>>> other places too, for instance in response to notifications from
>>> platform firmware and IMV this new callback should be also used in
>>> those cases.  However, in those cases multiple trip points may change
>>> at a time and the critical/hot trip point temperatures may be updated
>>> too AFAICS.
>>
>> IIRC the critical/hot trip points are handled differently, not using the
>> governors. The governors' 'throttle' callback would be called only
>> after we pass the test of 'critical/hot' [1].
> 
> OK, but is it actually useful to return an error code from the
> ->reset() callback when passed a non-passive trip point?

It will depend on the governor code. In our case the setup code
w.r.t. trip types is quite confusing (to fit into many possible
configurations). The non-passive trip point would be only
possible to bind when there are not other passive trip points.
That's is a really corner case and probably never used on any
device. Therefore, IMO we can just bail out in such situation
when then someone tries to update such single non-passive
trip point (probably not aware what is doing with IPA?).

For the rest of the governors - it's up to them what they
report in case non-passive trip is updated...

> 
>> What Di is facing is in the issue under the bucket of
>> 'handle_non_critical_trips()' when the governor just tries to
>> work on stale data - old trip temp.
> 
> Well, fair enough, but what about the other governors?  Is this
> problem limited to power_allocator?

IIUC the core fwk code - non of the governors would be needed
to handle the critical/hot trips. For the rest of the trip types
I would say it's up to the governor. In our IPA case this stale
data is used for power budget estimation - quite fundamental
step. Therefore, the reset and start from scratch would make more
sense.

I think other governors don't try to 'estimate' such
abstract power headroom based on temperature - so probably
they don't have to even implement the 'reset()' callback
(I don't know their logic that well).

> 
>> For the 2nd case IIUC the code, we pass the 'trip.temperature'
>> and should be ready for what you said (modification of that value).
> 
> Generally speaking, it needs to be prepared for a simultaneous change
> of multiple trip points (including active), in which case it may not
> be useful to invoke the ->reset() callback for each of them
> individually.

Although, that looks more cleaner IMO. Resetting one by one in
a temperature order would be easily maintainable, won't be?

> 
> Moreover, Daniel wants trip points to be sorted by temperature
> eventually and in that case indices may change overall even on one
> trip point update.

It's more complicated I think that this $subject, but very useful.


> 
>> Furthermore, the critical/hot situation is handled w/o governor
>> assistance, so we should be safe:
>> tz->ops->critical(tz) or tz->ops->hot(tz) and not
>> tz->governor->throttle(tz, trip)
> 
> That's fine.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ