Message-ID: <1f0710d5-cd78-dfff-1ce2-bba5f6e469b7@arm.com>
Date:   Tue, 6 Apr 2021 09:44:02 +0100
From:   Lukasz Luba <lukasz.luba@....com>
To:     Daniel Lezcano <daniel.lezcano@...aro.org>
Cc:     linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        amitk@...nel.org, rui.zhang@...el.com
Subject: Re: [PATCH 1/2] thermal: power_allocator: maintain the device
 statistics from going stale



On 4/2/21 4:54 PM, Daniel Lezcano wrote:
> On 31/03/2021 18:33, Lukasz Luba wrote:
>> When the temperature is below the first activation trip point the cooling
>> devices are not checked, so they cannot maintain fresh statistics. It
>> leads into the situation, when temperature crosses first trip point, the
>> statistics are stale and show state for very long period.
> 
> Can you elaborate the statistics you are referring to ?
> 
> I can understand the pid controller needs temperature but I don't
> understand the statistics with the cooling device.
> 
> 

allocate_power() calls cooling_ops->get_requested_power(), which for
CPUs is the cpufreq_get_requested_power() function. The !SMP path of
that cpufreq implementation still has the stale-statistics issue.
When Viresh introduced the usage of sched_cpu_util(), he fixed that
'long non-meaningful period' of broken statistics for SMP, and the fix
has been available since v5.12-rc1.

The bug is still there for !SMP. Look at how the idle time is
calculated in get_load() [1]. It relies on 'idle_time->timestamp' to
calculate the period. But when this function has not been called for a
while, that timestamp can be far in the past, e.g. a few seconds back,
at the time allocate_power() was last called.
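
To illustrate the effect, here is a minimal user-space sketch loosely
modelled on that calculation. It is not the kernel code: the struct
idle_sample type, the sample_load() name and the microsecond units are
assumptions made only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU bookkeeping; times in microseconds. */
struct idle_sample {
	uint64_t idle;      /* cumulative idle time at the previous sample */
	uint64_t timestamp; /* wall-clock time at the previous sample */
};

/*
 * Return the load in percent over the window since the previous call,
 * then refresh the stored sample. The window length is whatever time
 * has passed since the last call, however long ago that was.
 */
static uint32_t sample_load(struct idle_sample *s, uint64_t now,
			    uint64_t now_idle)
{
	uint64_t delta_time, delta_idle;
	uint32_t load;

	/* Both counters are expected to be monotonic. */
	assert(now >= s->timestamp && now_idle >= s->idle);

	delta_time = now - s->timestamp;
	delta_idle = now_idle - s->idle;

	if (delta_time <= delta_idle)
		load = 0;
	else
		load = (uint32_t)(100 * (delta_time - delta_idle) / delta_time);

	s->idle = now_idle;
	s->timestamp = now;

	return load;
}
```

Take a CPU that idles for 9.9 s and is then fully busy for 100 ms. If
the last sample was taken at t = 0, sample_load(&s, 10000000, 9900000)
reports a load of 1%, because the busy 100 ms is averaged over the
whole 10 s window. If a sample had also been taken at t = 9.9 s, the
same call would report 100%.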

On older kernels, which may be used in Android or ChromeOS, the bug is
there for both SMP and !SMP [2]. I've been considering applying this
simple IPA fix to some of those other kernels as well, because Viresh's
change is more of a 'feature' and does not cover both SMP and !SMP.

Regards,
Lukasz

[1] https://elixir.bootlin.com/linux/v5.12-rc5/source/drivers/thermal/cpufreq_cooling.c#L156
[2] https://elixir.bootlin.com/linux/v5.11.11/source/drivers/thermal/cpufreq_cooling.c#L143
