lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4ac3b6b0-0e3f-3971-78e6-362c367f8c30@arm.com>
Date:   Thu, 22 Oct 2020 12:45:41 +0100
From:   Lukasz Luba <lukasz.luba@....com>
To:     Daniel Lezcano <daniel.lezcano@...aro.org>,
        linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        dri-devel@...ts.freedesktop.org
Cc:     rui.zhang@...el.com, amit.kucheria@...durent.com,
        orjan.eide@....com, robh@...nel.org,
        alyssa.rosenzweig@...labora.com, steven.price@....com,
        airlied@...ux.ie, daniel@...ll.ch
Subject: Re: [PATCH 2/5] thermal: devfreq_cooling: get a copy of device status

Hi Daniel,

On 10/14/20 3:34 PM, Daniel Lezcano wrote:
> On 21/09/2020 14:20, Lukasz Luba wrote:
>> Devfreq cooling needs to now the correct status of the device in order
>> to operate. Do not rely on Devfreq last_status which might be a stale data
>> and get more up-to-date values of the load.
>>
>> Devfreq framework can change the device status in the background. To
>> mitigate this situation make a copy of the status structure and use it
>> for internal calculations.
>>
>> In addition this patch adds normalization function, which also makes sure
>> that whatever data comes from the device, it is in a sane range.
>>
>> Signed-off-by: Lukasz Luba <lukasz.luba@....com>
>> ---
>>   drivers/thermal/devfreq_cooling.c | 52 +++++++++++++++++++++++++------
>>   1 file changed, 43 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
>> index 7063ccb7b86d..cf045bd4d16b 100644
>> --- a/drivers/thermal/devfreq_cooling.c
>> +++ b/drivers/thermal/devfreq_cooling.c
>> @@ -227,6 +227,24 @@ static inline unsigned long get_total_power(struct devfreq_cooling_device *dfc,
>>   							       voltage);
>>   }
>>   
>> +static void _normalize_load(struct devfreq_dev_status *status)
>> +{
>> +	/* Make some space if needed */
>> +	if (status->busy_time > 0xffff) {
>> +		status->busy_time >>= 10;
>> +		status->total_time >>= 10;
>> +	}
>> +
>> +	if (status->busy_time > status->total_time)
>> +		status->busy_time = status->total_time;
>> +
>> +	status->busy_time *= 100;
>> +	status->busy_time /= status->total_time ? : 1;
>> +
>> +	/* Avoid division by 0 */
>> +	status->busy_time = status->busy_time ? : 1;
>> +	status->total_time = 100;
>> +}
> 
> Not sure that works if the devfreq governor is not on-demand.
> 
> Is it possible to use the energy model directly in devfreq to compute
> the energy consumption given the state transitions since the last reading ?

This change is actually trying to create a safety net for what we do.

In the original code we take the last_state directly:
-	struct devfreq_dev_status *status = &df->last_status;

Then we simply multiply by 'busy_time' and div by 'total_time',
without checks... The values might be huge or zero, etc.
The _normalize_load() introduces this safety.

Apart from that, just simply taking a pointer to &df->last_status does
not protect us from:
- working on a struct which might be modified at the same time in
   background - not safe
- that struct might not be updated by long time, because devfreq
   didn't check it for a long (there are two polling modes in devfreq)

So taking a mutex and then a trigger the device status check and
make a copy of newest data. It is more safe.

I think this can be treated as a fix, not a feature.

> 
> The power will be read directly from devfreq which will return:
> 
> nrj + (current_power * (jiffies - last_update)) / (jiffies - last_update)
> 
> The devfreq cooling device driver would result in a much simpler code, no?

This is something that I would like to address after the EM changes are
merged. It would be the next step, how to estimate the power by taking
into consideration more information. This patch series just tries to
make it possible to use EM. The model improvements would be next.

Thank you Daniel for your review.

Regards,
Lukasz

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ