linux-kernel - Re: [PATCH v3 6/7] thermal/drivers/cpu_cooling: Introduce the cpu idle cooling driver


Message-ID: <421c43a9-c721-05eb-1860-dfb5c042bc95@linaro.org>
Date:   Mon, 5 Aug 2019 09:39:30 +0200
From:   Daniel Lezcano <daniel.lezcano@...aro.org>
To:     Martin Kepplinger <martin.kepplinger@...i.sm>,
        viresh.kumar@...aro.org, kevin.wangtao@...aro.org,
        leo.yan@...aro.org, edubezval@...il.com,
        vincent.guittot@...aro.org, javi.merino@...nel.org,
        rui.zhang@...el.com, daniel.thompson@...aro.org
Cc:     linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 6/7] thermal/drivers/cpu_cooling: Introduce the cpu
 idle cooling driver

On 05/08/2019 08:53, Martin Kepplinger wrote:

[ ... ]

>>> +static s64 cpuidle_cooling_runtime(struct cpuidle_cooling_device *idle_cdev)
>>> +{
>>> +	s64 next_wakeup;
>>> +	unsigned long state = idle_cdev->state;
>>> +
>>> +	/*
>>> +	 * The function should not be called when there is no
>>> +	 * mitigation because:
>>> +	 * - that does not make sense
>>> +	 * - we end up with a division by zero
>>> +	 */
>>> +	if (!state)
>>> +		return 0;
>>> +
>>> +	next_wakeup = (s64)((idle_cdev->idle_cycle * 100) / state) -
>>> +		idle_cdev->idle_cycle;
>>> +
>>> +	return next_wakeup * NSEC_PER_USEC;
>>> +}
>>> +
>>
>> There is a bug in your calculation formula here when "state" reaches 100.
>> The function then returns 0, which is the same result as "rate" being 0,
>> and that is dangerous: you stop cooling when it's most necessary :)
>>
>> I'm not sure how much sense being 100% idle really makes, so when testing
>> this I just clamp it: if (state == 100) { state = 99 }. Anyway, just don't
>> return 0.
>>
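
To make the arithmetic in the quoted formula concrete, here is a minimal
user-space sketch of the same computation. It is not the driver code: the
function name cooling_runtime_ns, the plain integer parameters, and the
10000 us idle cycle used in the comments are assumptions for illustration
only. It reproduces the expression from the patch and shows that a state of
100 yields the same 0 as a state of 0.

```c
#include <stdint.h>

#define NSEC_PER_USEC 1000LL

/*
 * Hypothetical stand-in for cpuidle_cooling_runtime(): takes the idle
 * cycle (in microseconds) and the cooling state (0..100) directly
 * instead of a struct cpuidle_cooling_device pointer.
 */
static int64_t cooling_runtime_ns(uint64_t idle_cycle_us, unsigned long state)
{
	int64_t next_wakeup;

	/* No mitigation: bail out early, this also avoids a division by zero. */
	if (!state)
		return 0;

	/* Same expression as in the quoted patch, using integer division. */
	next_wakeup = (int64_t)((idle_cycle_us * 100) / state) -
		(int64_t)idle_cycle_us;

	/*
	 * With idle_cycle_us = 10000 (an assumed value):
	 *   state =  50 -> 20000 - 10000 = 10000 us
	 *   state =  99 -> 10101 - 10000 =   101 us
	 *   state = 100 -> 10000 - 10000 =     0 us, same as state = 0
	 */
	return next_wakeup * NSEC_PER_USEC;
}
```

With these assumed inputs, the result at state 100 cannot be told apart from
the "no mitigation" case, which is the situation described in the report.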
> 
> oh and also, this breaks S3 suspend:

What breaks S3 suspend: the idle cooling device or the bug above?

> Aug  5 06:09:20 pureos kernel: [  807.487887] PM: suspend entry (deep)
> Aug  5 06:09:40 pureos kernel: [  807.501148] Filesystems sync: 0.013 seconds
> Aug  5 06:09:40 pureos kernel: [  807.501591] Freezing user space processes ... (elapsed 0.003 seconds) done.
> Aug  5 06:09:40 pureos kernel: [  807.504741] OOM killer disabled.
> Aug  5 06:09:40 pureos kernel: [  807.504744] Freezing remaining freezable tasks ...
> Aug  5 06:09:40 pureos kernel: [  827.517712] Freezing of tasks failed after 20.002 seconds (4 tasks refusing to freeze, wq_busy=0):
> Aug  5 06:09:40 pureos kernel: [  827.527122] thermal-idle/0  S    0 161      2 0x00000028
> Aug  5 06:09:40 pureos kernel: [  827.527131] Call trace:
> Aug  5 06:09:40 pureos kernel: [  827.527148]  __switch_to+0xb4/0x200
> Aug  5 06:09:40 pureos kernel: [  827.527156]  __schedule+0x1e0/0x488
> Aug  5 06:09:40 pureos kernel: [  827.527162]  schedule+0x38/0xc8
> Aug  5 06:09:40 pureos kernel: [  827.527169]  smpboot_thread_fn+0x250/0x2a8
> Aug  5 06:09:40 pureos kernel: [  827.527176]  kthread+0xf4/0x120
> Aug  5 06:09:40 pureos kernel: [  827.527182]  ret_from_fork+0x10/0x18
> Aug  5 06:09:40 pureos kernel: [  827.527186] thermal-idle/1  S    0 162      2 0x00000028
> Aug  5 06:09:40 pureos kernel: [  827.527192] Call trace:
> Aug  5 06:09:40 pureos kernel: [  827.527197]  __switch_to+0x188/0x200
> Aug  5 06:09:40 pureos kernel: [  827.527203]  __schedule+0x1e0/0x488
> Aug  5 06:09:40 pureos kernel: [  827.527208]  schedule+0x38/0xc8
> Aug  5 06:09:40 pureos kernel: [  827.527213]  smpboot_thread_fn+0x250/0x2a8
> Aug  5 06:09:40 pureos kernel: [  827.527218]  kthread+0xf4/0x120
> Aug  5 06:09:40 pureos kernel: [  827.527222]  ret_from_fork+0x10/0x18
> Aug  5 06:09:40 pureos kernel: [  827.527226] thermal-idle/2  S    0 163      2 0x00000028
> Aug  5 06:09:40 pureos kernel: [  827.527231] Call trace:
> Aug  5 06:09:40 pureos kernel: [  827.527237]  __switch_to+0xb4/0x200
> Aug  5 06:09:40 pureos kernel: [  827.527242]  __schedule+0x1e0/0x488
> Aug  5 06:09:40 pureos kernel: [  827.527247]  schedule+0x38/0xc8
> Aug  5 06:09:40 pureos kernel: [  827.527259]  smpboot_thread_fn+0x250/0x2a8
> Aug  5 06:09:40 pureos kernel: [  827.527264]  kthread+0xf4/0x120
> Aug  5 06:09:40 pureos kernel: [  827.527268]  ret_from_fork+0x10/0x18
> Aug  5 06:09:40 pureos kernel: [  827.527272] thermal-idle/3  S    0 164      2 0x00000028
> Aug  5 06:09:40 pureos kernel: [  827.527278] Call trace:
> Aug  5 06:09:40 pureos kernel: [  827.527283]  __switch_to+0xb4/0x200
> Aug  5 06:09:40 pureos kernel: [  827.527288]  __schedule+0x1e0/0x488
> Aug  5 06:09:40 pureos kernel: [  827.527293]  schedule+0x38/0xc8
> Aug  5 06:09:40 pureos kernel: [  827.527298]  smpboot_thread_fn+0x250/0x2a8
> Aug  5 06:09:40 pureos kernel: [  827.527303]  kthread+0xf4/0x120
> Aug  5 06:09:40 pureos kernel: [  827.527308]  ret_from_fork+0x10/0x18
> Aug  5 06:09:40 pureos kernel: [  827.527375] Restarting kernel threads ... done.
> Aug  5 06:09:40 pureos kernel: [  827.527771] OOM killer enabled.
> Aug  5 06:09:40 pureos kernel: [  827.527772] Restarting tasks ... done.
> Aug  5 06:09:40 pureos kernel: [  827.528926] PM: suspend exit
> 
> 
> do you know where things might go wrong here?
> 
> thanks,
> 
>                             martin
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog