lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 17 Jul 2023 14:47:50 +0100
From:   Lukasz Luba <lukasz.luba@....com>
To:     Qais Yousef <qyousef@...alina.io>
Cc:     rafael@...nel.org, daniel.lezcano@...aro.org,
        Kajetan Puchalski <kajetan.puchalski@....com>,
        Dietmar.Eggemann@....com, dsmythies@...us.net,
        yu.chen.surf@...il.com, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 2/2] cpuidle: teo: Introduce util-awareness

Hi Qais,

The rule is 'one size doesn't fit all', please see below.

On 7/11/23 18:58, Qais Yousef wrote:
> Hi Kajetan
> 
> On 01/05/23 14:51, Kajetan Puchalski wrote:
> 
> [...]
> 
>> @@ -510,9 +598,11 @@ static int teo_enable_device(struct cpuidle_driver *drv,
>>   			     struct cpuidle_device *dev)
>>   {
>>   	struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu);
>> +	unsigned long max_capacity = arch_scale_cpu_capacity(dev->cpu);
>>   	int i;
>>   
>>   	memset(cpu_data, 0, sizeof(*cpu_data));
>> +	cpu_data->util_threshold = max_capacity >> UTIL_THRESHOLD_SHIFT;
> 
> Given that utilization is invariant, why do we set the threshold based on
> cpu capacity?


To treat CPUs differently, not with the same policy.


> 
> I'm not sure if this is a problem, but on little cores this threshold would be
> too low. Given that util is invariant - I wondered if we need to have a single
> threshold for all type of CPUs instead. Have you tried something like that

A single threshold for all CPUs might be biased towards some CPUs. Let's
pick the value 15 - which was tested to work really good in benchmarks
for the big CPUs. On the other hand when you set that value to little
CPUs, with max_capacity = 124, than you have 15/124 ~= 13% threshold.
That means you prefer to enter deeper idle state ~9x times (at max
freq). What if the Little's freq is set to e.g. < ~20% fmax, which
corresponds to capacity < ~25? Let's try to simulate such scenario.

In a situation we could have utilization 14 on Little CPU, than CPU 
capacity (effectively frequency) voting based on utilization would be
1.2 * 14 = ~17 so let's pick OPP corresponding to 17 capacity.
In such condition the little CPU would run the 14-util-periodic-task for
14/17= ~82% of wall-clock time. That's a lot, and not suited for
entering deeper idle state on that CPU, isn't it?

Apart from that, the little CPUs are tiny in terms of silicon area
and are less leaky in WFI than big cores. Therefore, they don't need
aggressive entries into deeper idle state. At the same time, they
are often used for serving interrupts, where the latency is important
factor.

> while developing the patch?

We have tried different threshold values in terms of %, but for all CPUs
(at the same time) not per-cluster. The reason was to treat those CPUs
differently as described above.

Regards,
Lukasz

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ