lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <02ed646e-b344-4802-a4ef-806a1e0cac67@linaro.org>
Date: Mon, 8 Jul 2024 16:32:41 +0200
From: Daniel Lezcano <daniel.lezcano@...aro.org>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>,
 Linux PM <linux-pm@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
 Lukasz Luba <lukasz.luba@....com>
Subject: Re: [PATCH v2 2/2] thermal: core: Add sanity check for polling_delay
 and passive_delay

On 08/07/2024 16:03, Rafael J. Wysocki wrote:
> On Mon, Jul 8, 2024 at 3:58 PM Daniel Lezcano <daniel.lezcano@...aro.org> wrote:
>>
>> On 08/07/2024 15:38, Rafael J. Wysocki wrote:
>>> On Mon, Jul 8, 2024 at 2:12 PM Daniel Lezcano <daniel.lezcano@...aro.org> wrote:
>>>>
>>>> On 05/07/2024 21:46, Rafael J. Wysocki wrote:
>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>>>>>
>>>>> If polling_delay is nonzero and passive_delay is 0, the thermal zone
>>>>> will use polling except when tz->passive is nonzero, which does not make
>>>>> sense.
>>>>>
>>>>> Also if polling_delay is nonzero and passive_delay is greater than
>>>>> polling_delay, the thermal zone temperature will be updated less often
>>>>> when tz->passive is nonzero.  This does not make sense either.
>>>>>
>>>>> Ensure that none of the above will happen.
>>>>>
>>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>>>>> ---
>>>>>
>>>>> v1 -> v2: The patch actually matches the changelog
>>>>>
>>>>> ---
>>>>>     drivers/thermal/thermal_core.c |    3 +++
>>>>>     1 file changed, 3 insertions(+)
>>>>>
>>>>> Index: linux-pm/drivers/thermal/thermal_core.c
>>>>> ===================================================================
>>>>> --- linux-pm.orig/drivers/thermal/thermal_core.c
>>>>> +++ linux-pm/drivers/thermal/thermal_core.c
>>>>> @@ -1440,6 +1440,9 @@ thermal_zone_device_register_with_trips(
>>>>>                 td->threshold = INT_MAX;
>>>>>         }
>>>>>
>>>>> +     if (polling_delay && (passive_delay > polling_delay || !passive_delay))
>>>>> +             passive_delay = polling_delay;
>>>>
>>>> Given this is a system misconfiguration, it would make more sense to
>>>> bail out with -EINVAL. Assigning a default value in the back of the
>>>> caller will never raise its attention and can make a bad configuration
>>>> staying for a long time.
>>>
>>> This works except for the case mentioned below.
>>>
>>> I think that passive_delay > polling_delay can trigger a -EINVAL, but
>>> (polling_delay && !passive_delay) cannot do it because it is regarded
>>> as a valid case as per the below.
>>
>> Right I can see ATM only this as an illogic combination:
>>
>>          polling_delay && passive_delay &&
>>          (polling_delay < passive_delay)
>>
>>>> That said, there are configurations with a passive delay set to zero but
>>>> with a non zero polling delay. For instance, a thermal zone mitigated
>>>> with a fan, so active trip points are set. Another example is when there
>>>> is only critical trip points for a thermal zone.
>>>>
>>>> Actually there are multiple combinations with delays value which may
>>>> look invalid but which are actually valid.
>>>>
>>>> For example, a setup with polling_delay > 0, passive_delay = 0, active
>>>> trip points, cooling map to this active trips, passive trip points
>>>> without cooling map.
>>>>
>>>> IMHO, it is better to do the configuration the system is asking for,
>>>> even if it sounds weird
>>>
>>> Except that it doesn't work as expected because if passive_delay = 0,
>>> polling is paused when tz->passive is set.
>>
>> Yes, but as there is no cooling map, there is no governor action, thus
>> tz->passive is never set.
> 
> In current linux-next, it is set when a passive trip is crossed on the way up.

Ah, I see. AFAIR that was the gov_step_wise which was changing this 
value but based on the thermal instance.

>> So we can have a passive polling equal to zero
>> without being illegal as no passive mitigation will happen.
>>
>> The passive delay is really there only if there is a passive cooling
>> device mapped to a passive trip point.
> 
> Well, shouldn't user space get notified more often when passive
> cooling is under way?

(Assuming you meant "user space get notified when a passive trip point 
is crossed")

Mmh, yes. I see the point.


>> The polling delay is in charge of mitigating the active cooling device
>> like a fan. So it is possible to mix an active trip point to mitigate
>> with a fan and then put at a higher temperature a passive trip point
>> with a higher sampling resolution.
> 
> But it is not correct to pause polling when tz->passive is set.

I'm not sure to get the comment.

Just to clarify:

trip A is active with a multi speed fan, polling every 1s

trip B is passive with a cpufreq cooling device, polling every 100ms

temp(tripA) < temp(tripB)

When the trip A is crossed, the mitigation happens at <polling> rate. 
Assuming it fails to cool down, the fan continues to increase its speed 
until it reaches its max state.

The temperature continues to increase and crosses the passive trip 
point. The fan speed stays at its maximum and the polling switches to 
the passive polling delay.


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ