Message-ID: <9fb8fb88-73d7-771e-1309-4363907f7c01@quicinc.com>
Date:   Thu, 20 Jan 2022 00:35:37 +0530
From:   Manaf Meethalavalappu Pallikunhi <quic_manafm@...cinc.com>
To:     Thara Gopinath <thara.gopinath@...aro.org>,
        "Rafael J . Wysocki" <rafael@...nel.org>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        "Amit Kucheria" <amitk@...nel.org>,
        Zhang Rui <rui.zhang@...el.com>,
        "Matthias Kaehlcke" <mka@...omium.org>, <thara.gopinath@...il.com>
CC:     <linux-pm@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3] thermal/core: Clear all mitigation when thermal zone
 is disabled

Hi Rafael/Daniel,

Could you please check and comment?

Thanks,

Manaf

On 1/11/2022 2:15 AM, Manaf Meethalavalappu Pallikunhi wrote:
> Hi Thara,
>
> On 1/10/2022 11:25 PM, Thara Gopinath wrote:
>> Hi Manaf,
>>
>> On 1/7/22 1:56 PM, Manaf Meethalavalappu Pallikunhi wrote:
>>> When a thermal zone has violated a trip point, its mode can still be
>>> disabled, either via the thermal core API or via the thermal zone
>>> sysfs. Once the zone is disabled, the framework bails out of any
>>> re-evaluation of the thermal zone. As a result, a zone that is
>>> already in a mitigation state stays in that state until it is
>>> re-enabled.
>>>
>>> To avoid the above issue, on a thermal zone disable request, reset
>>> the thermal zone and clear the mitigation for each trip explicitly.
>>>
>>> Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@...cinc.com>
>>> ---
>>>   drivers/thermal/thermal_core.c | 12 ++++++++++--
>>>   1 file changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
>>> index 51374f4..e288c82 100644
>>> --- a/drivers/thermal/thermal_core.c
>>> +++ b/drivers/thermal/thermal_core.c
>>> @@ -447,10 +447,18 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
>>>         thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>>> -    if (mode == THERMAL_DEVICE_ENABLED)
>>> +    if (mode == THERMAL_DEVICE_ENABLED) {
>>>           thermal_notify_tz_enable(tz->id);
>>> -    else
>>> +    } else {
>>> +        int trip;
>>> +
>>> +        /* make sure all previous throttlings are cleared */
>>> +        thermal_zone_device_init(tz);
>>
>> It looks weird to do an init when you are actually disabling the 
>> thermal zone.
>>
>>
>>> +        for (trip = 0; trip < tz->trips; trip++)
>>> +            handle_thermal_trip(tz, trip);
>>
>> So this is exactly what thermal_zone_device_update does except that 
>> thermal_zone_device_update checks for the mode and bails out if the 
>> zone is disabled.
>> This will work because as you explained in v2, the temperature is 
>> reset in thermal_zone_device_init and handle_thermal_trip will remove 
>> the mitigation if any.
>>
>> My two cents here (Rafael and Daniel can comment more on this).
>>
>> I think it will be cleaner if we have a third mode, 
>> THERMAL_DEVICE_DISABLING, and have thermal_zone_device_update handle 
>> clearing the mitigation. So this will look like:
>> if (mode == THERMAL_DEVICE_DISABLED)
>>     tz->mode = THERMAL_DEVICE_DISABLING;
>> else
>>     tz->mode = mode;
>>
>> thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>>
>> if (mode == THERMAL_DEVICE_DISABLED)
>>     tz->mode = mode;
>>
>> You will have to update update_temperature to set tz->temperature = 
>> THERMAL_TEMP_INVALID and thermal_zone_set_trips to set 
>> tz->prev_low_trip = -INT_MAX and tz->prev_high_trip = INT_MAX for
>> THERMAL_DEVICE_DISABLING mode.
>
> I think just updating the above fields doesn't guarantee complete 
> clearing of mitigation for all governors. For the step_wise governor, 
> to make sure mitigation is removed completely, we have to set each 
> thermal_instance->initialized = false as well.
>
> If we add that to the above list of variables in update_temperature() 
> under if (mode == THERMAL_DEVICE_DISABLING), it is the same as what 
> thermal_zone_device_init() does in the current patch. We are just 
> resetting the same fields in a different place under a new mode, right?
>
> Thanks,
>
> Manaf
>
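To make the transitional-mode idea quoted above easier to follow, here is a minimal user-space sketch of it. It is only a model, not kernel code: every model_* name, MODEL_TEMP_INVALID, and the simple "throttle when temperature >= trip" rule are assumptions made for illustration. It only shows the ordering discussed in the thread: switch to a temporary DISABLING mode, run one update that invalidates the temperature and re-evaluates every trip, then settle on the final mode.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
#define MODEL_TEMP_INVALID INT_MIN
#define MODEL_MAX_TRIPS 4

enum model_mode { MODEL_DISABLED, MODEL_ENABLED, MODEL_DISABLING };

struct model_zone {
	enum model_mode mode;
	int temperature;                 /* last reading or MODEL_TEMP_INVALID */
	int sensor_temp;                 /* value the sensor would report */
	int num_trips;
	int trip_temp[MODEL_MAX_TRIPS];
	bool throttled[MODEL_MAX_TRIPS]; /* mitigation active per trip */
};

/* Assumed rule: a trip throttles only for a valid temperature at or above it. */
static void model_handle_trip(struct model_zone *tz, int trip)
{
	tz->throttled[trip] = tz->temperature != MODEL_TEMP_INVALID &&
			      tz->temperature >= tz->trip_temp[trip];
}

/* Bails out when disabled; in DISABLING mode it invalidates the reading. */
static void model_update(struct model_zone *tz)
{
	int trip;

	if (tz->mode == MODEL_DISABLED)
		return;

	tz->temperature = (tz->mode == MODEL_DISABLING) ?
			  MODEL_TEMP_INVALID : tz->sensor_temp;

	for (trip = 0; trip < tz->num_trips; trip++)
		model_handle_trip(tz, trip);
}

/* Callers request ENABLED or DISABLED; DISABLING is internal only. */
static void model_set_mode(struct model_zone *tz, enum model_mode mode)
{
	assert(mode != MODEL_DISABLING);

	tz->mode = (mode == MODEL_DISABLED) ? MODEL_DISABLING : mode;
	model_update(tz);
	tz->mode = mode;
}

/* Number of trips that currently have mitigation applied. */
static int model_count_throttled(const struct model_zone *tz)
{
	int trip, count = 0;

	for (trip = 0; trip < tz->num_trips; trip++)
		count += tz->throttled[trip] ? 1 : 0;
	return count;
}
```

In this model, enabling a zone whose sensor reads above both trips throttles both of them, and a later disable request leaves no trip throttled. The sketch does not model per-instance governor state such as the initialized flag mentioned above, so it does not settle whether the extra mode is sufficient for the step_wise governor.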
