[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <2a04d77c-52b8-47f5-b8e7-c36936db99ac@arm.com>
Date: Tue, 2 Apr 2024 10:51:48 +0100
From: Lukasz Luba <lukasz.luba@....com>
To: Nikita Travkin <nikitos.tr@...il.com>
Cc: Zhang Rui <rui.zhang@...el.com>, "Rafael J. Wysocki" <rafael@...nel.org>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] thermal: gov_power_allocator: Allow binding without
cooling devices
On 3/29/24 13:54, Nikita Travkin wrote:
> Hi Lukasz, Rafael!
>
> First of all, it seems like outlook mail server that serves @arm.com
> refuses
> to communicate with mine so I can't send anything to it and it refuses to
> send anything to me either...
>
> I'm sorry for any inconvenience this may cause and I will deal with it
> somehow...
>
> 28.03.2024 14:50, Lukasz Luba пишет:
>>
>>
>> On 3/28/24 09:12, Lukasz Luba wrote:
>>> Hi Rafael,
>>>
>>> On 3/27/24 15:25, Rafael J. Wysocki wrote:
>>>> On Thu, Mar 21, 2024 at 3:44 PM Nikita Travkin <nikita@...n.ru> wrote:
>>>>>
>>>>> Commit e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip
>>>>> points earlier")
>>>>> added a check that would fail binding the governer if there is no
>>>>> cooling devices bound to the thermal zone. Unfortunately this causes
>>>>> issues in cases when the TZ is bound to the governer before the
>>>>> cooling
>>>>> devices are attached to it. (I.e. when the tz is registered using
>>>>> thermal_zone_device_register_with_trips().)
>>>>>
>>>>> Additionally, the documentation across gov_power_allocator suggests
>>>>> it's
>>>>> intended to allow it to be bound to thermal zones without cooling
>>>>> devices (and thus without passive/active trip points), however the
>>>>> same
>>>>> change added a check for the trip point to be present, causing
>>>>> those TZ
>>>>> to fail probing.
>>
>> This patch description is mixing trips and cooling devices and refers to
>> a commit which is only for guarding the trip points number to be
>> not less than 2. In IPA we require 2 trip points.
>
> I'm sorry, I probably worded this poorly. I meant that there may be
> a TZ that has no trip points because it's not meant to be used with
> any cooling device.
The 'empty' thermal zone w/o trip points and w/o cooling devices, but
w/ governor looks odd but still live.
>
>>>>>
>>>>> Those changes cause all thermal zones to fail on some devices (such as
>>>>> sc7180-acer-aspire1) and prevent the kernel from controlling the
>>>>> cpu/gpu
>>>>> frequency based on the temperature, as well as losing all the other
>>>>> "informational" thermal zones if power_allocator is set as default.
>>>>>
>>>>> This commit partially reverts the referenced one by dropping the trip
>>>>> point check and by allowing the TZ to probe even if no actor buffer
>>>>> was
>>>>> allocated to allow those TZ to probe again.
>>>>>
>>>>> Fixes: e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip
>>>>> points earlier")
>>
>> Not that commit.
>>>>> Signed-off-by: Nikita Travkin <nikita@...n.ru>
>>>>> ---
>>>>> I've noticed that all thermal zones fail probing with -EINVAL on my
>>>>> sc7180 based Acer Aspire 1 since 6.8. This commit allows me to bring
>>>>> them back.
>>>>
>>>> Łukasz, any comments?
>>>
>>> Let me check this today.
>>>>> ---
>>>>> drivers/thermal/gov_power_allocator.c | 14 +++++---------
>>>>> 1 file changed, 5 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/drivers/thermal/gov_power_allocator.c
>>>>> b/drivers/thermal/gov_power_allocator.c
>>>>> index 1b17dc4c219c..4f2d7f3b7508 100644
>>>>> --- a/drivers/thermal/gov_power_allocator.c
>>>>> +++ b/drivers/thermal/gov_power_allocator.c
>>>>> @@ -679,11 +679,6 @@ static int power_allocator_bind(struct
>>>>> thermal_zone_device *tz)
>>>>> return -ENOMEM;
>>>>>
>>>>> get_governor_trips(tz, params);
>>>>> - if (!params->trip_max) {
>>>>> - dev_warn(&tz->device, "power_allocator: missing
>>>>> trip_max\n");
>>
>> This if() guards the binding of TZ with less than 2 trip points,
>> not the cooling devices.
>>>>> - kfree(params);
>>>>> - return -EINVAL;
>>>>> - }
>>>>>
>>>>> ret = check_power_actors(tz, params);
>>>>> if (ret < 0) {
>>>>> @@ -693,7 +688,7 @@ static int power_allocator_bind(struct
>>>>> thermal_zone_device *tz)
>>>>> }
>>>>>
>>>>> ret = allocate_actors_buffer(params, ret);
>>>>> - if (ret) {
>>
>> This if() is from different commit.
>
> Hm you're right, I misread the commit I think and on a second thought
> this patch combines two different issues. I will split it up into two.
>
>>>>> + if (ret && ret != -EINVAL) {
>>
>> This is about 0 cooling devices in the thermal zone, but IPA won't work
>> so why to even fake and forward binding?
>>
>> Rafael should we support binding with 0 cooling devices?
>
> As I understand it, cooling devices can attach to existing thermal zones
> so it's possible that TZ is registered before any cooling devices has
> attached to it. This is exactly what happens in my case: first the
> IPA is probed for the TZ and after that the cooling device is attached.
That's fair enough. IPA should support it.
>
> check_power_actors() kerneldoc in IPA says:
>
>> * If all of the cooling devices currently attached to @tz implement
>> the power
>> * actor API, return the number of them (which may be 0, because some
>> cooling
>> * devices may be attached later). Otherwise, return -EINVAL.
>
> and afaiu this is exactly how it's used by
> thermal_zone_device_register_with_trips():
>
> result = thermal_set_governor(tz, governor);
> if (result) {
> mutex_unlock(&thermal_governor_lock);
> goto unregister;
> }
> (...)
> /* Bind cooling devices for this zone */
> bind_tz(tz);
>
>>>>> dev_warn(&tz->device, "power_allocator: allocation failed\n");
>>>>> kfree(params);
>>>>> return ret;
>>>>> @@ -714,9 +709,10 @@ static int power_allocator_bind(struct
>>>>> thermal_zone_device *tz)
>>>>> else
>>>>> params->sustainable_power =
>>>>> tz->tzp->sustainable_power;
>>>>>
>>>>> - estimate_pid_constants(tz, tz->tzp->sustainable_power,
>>>>> - params->trip_switch_on,
>>>>> - params->trip_max->temperature);
>>>>> + if (params->trip_max)
>>
>> This is not supported, we need those 2 trip points.
>
> Yes, I understand that IPA is useless without trip points and/or cooling
> devices. However from the snippet above the TZ won't be registered if
> the governor fails to probe. This means that if one has IPA set as the
> default, thermal core will try to probe it for TZ without trip points and,
> when it fails, will fail to probe the whole TZ.
>
> get_governor_trips() kerneldoc in IPA says:
>
> > * on. If there are no passive or active trip points, then the
> > * governor won't do anything. In fact, its throttle function
> > * won't be called at all.
>
> Which suggests to me that being bound to such TZ is expected.
>
> Unless you have any thoughts on how to solve this better, I will split
> this into two commits and resubmit in a few days:
Yes, please split into 2 patches for different commits.
>
> - Allow binding without cooling devices (allow -EINVAL after allocate)
It won't be the fix, it's a hack. If the 0 cooling devices is going to
be the normal state, then it has to be handled not as error exception
path.
Please change the allocate_actors_buffer():
if (!num_actors) {
ret = -EINVAL; <--- ret = 0 here
since it's normal state now.
> - Allow binding without trip points (Rest of this patch)
Yes, the rest of that patch looks good.
Powered by blists - more mailing lists