Message-ID: <7c29c833-b558-f0ab-83ab-08371785ffd1@quicinc.com>
Date:   Tue, 11 Jan 2022 02:15:51 +0530
From:   Manaf Meethalavalappu Pallikunhi <quic_manafm@...cinc.com>
To:     Thara Gopinath <thara.gopinath@...aro.org>,
        "Rafael J . Wysocki" <rafael@...nel.org>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        "Amit Kucheria" <amitk@...nel.org>,
        Zhang Rui <rui.zhang@...el.com>,
        "Matthias Kaehlcke" <mka@...omium.org>, <thara.gopinath@...il.com>
CC:     <linux-pm@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3] thermal/core: Clear all mitigation when thermal zone
 is disabled

Hi Thara,

On 1/10/2022 11:25 PM, Thara Gopinath wrote:
> Hi Manaf,
>
> On 1/7/22 1:56 PM, Manaf Meethalavalappu Pallikunhi wrote:
>> Whenever a thermal zone is in trip violated state, there is a chance
>> that the same thermal zone mode can be disabled either via thermal
>> core API or via thermal zone sysfs. Once it is disabled, the framework
>> bails out any re-evaluation of thermal zone. It leads to a case where
>> if it is already in mitigation state, it will stay the same state
>> until it is re-enabled.
>>
>> To avoid above mentioned issue, on thermal zone disable request
>> reset thermal zone and clear mitigation for each trip explicitly.
>>
>> Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@...cinc.com>
>> ---
>>   drivers/thermal/thermal_core.c | 12 ++++++++++--
>>   1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
>> index 51374f4..e288c82 100644
>> --- a/drivers/thermal/thermal_core.c
>> +++ b/drivers/thermal/thermal_core.c
>> @@ -447,10 +447,18 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
>>         thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>>
>> -    if (mode == THERMAL_DEVICE_ENABLED)
>> +    if (mode == THERMAL_DEVICE_ENABLED) {
>>           thermal_notify_tz_enable(tz->id);
>> -    else
>> +    } else {
>> +        int trip;
>> +
>> +        /* make sure all previous throttlings are cleared */
>> +        thermal_zone_device_init(tz);
>
> It looks weird to do an init when you are actually disabling the
> thermal zone.
>
>
>> +        for (trip = 0; trip < tz->trips; trip++)
>> +            handle_thermal_trip(tz, trip);
>
> So this is exactly what thermal_zone_device_update does except that 
> thermal_zone_device_update checks for the mode and bails out if the 
> zone is disabled.
> This will work because as you explained in v2, the temperature is 
> reset in thermal_zone_device_init and handle_thermal_trip will remove 
> the mitigation if any.
>
> My two cents here (Rafael and Daniel can comment more on this).
>
> I think it will be cleaner if we can have a third mode 
> THERMAL_DEVICE_DISABLING and have thermal_zone_device_update handle 
> clearing the mitigation. So this will look like
> if (mode == THERMAL_DEVICE_DISABLED)
>     tz->mode = THERMAL_DEVICE_DISABLING;
> else
>     tz->mode = mode;
>
> thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>
> if (mode == THERMAL_DEVICE_DISABLED)
>     tz->mode = mode;
>
> You will have to update update_temperature to set tz->temperature = 
> THERMAL_TEMP_INVALID and thermal_zone_set_trips to set 
> tz->prev_low_trip = -INT_MAX and tz->prev_high_trip = INT_MAX for
> THERMAL_DEVICE_DISABLING mode.

I think updating just the fields above doesn't guarantee that mitigation
is completely cleared for all governors. For the step_wise governor, to
make sure mitigation is removed completely, we also have to set
initialized = false for each thermal instance.

If we add that to the above list of variables in update_temperature()
under if (mode == THERMAL_DEVICE_DISABLING), it is the same as what the
thermal_zone_device_init() function does in the current patch. We would
just be resetting the same fields in a different place under a new mode,
right?
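
To make the comparison concrete, here is a minimal user-space sketch of
the state that has to be reset, whichever place we reset it from. It is
only an illustration, not kernel code: struct model_zone,
struct model_instance, model_zone_reset() and MODEL_TEMP_INVALID are
hypothetical stand-ins for the thermal zone, the thermal instance,
the reset step and THERMAL_TEMP_INVALID. Only the field names are taken
from the discussion above.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical placeholder for THERMAL_TEMP_INVALID; the value is arbitrary. */
#define MODEL_TEMP_INVALID INT_MIN

/* Simplified stand-in for a thermal instance (zone <-> cooling device). */
struct model_instance {
	bool initialized;
};

/* Simplified stand-in for a thermal zone; only the fields discussed here. */
struct model_zone {
	int temperature;
	int prev_low_trip;
	int prev_high_trip;
	struct model_instance *instances;
	size_t n_instances;
};

/*
 * Reset every piece of state named in this thread. Skipping the loop
 * over the instances is the gap I am pointing out for step_wise.
 */
static void model_zone_reset(struct model_zone *tz)
{
	assert(tz != NULL);

	tz->temperature = MODEL_TEMP_INVALID;
	tz->prev_low_trip = -INT_MAX;
	tz->prev_high_trip = INT_MAX;

	for (size_t i = 0; i < tz->n_instances; i++)
		tz->instances[i].initialized = false;
}
```

Whether this runs from the disable path directly or from
update_temperature() under a new mode, the set of fields touched is the
same.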

Thanks,

Manaf
