Message-ID: <CAKfTPtB8f0RH4qToLrWS+HSZhm8pyUe42DijiXZqo+mQQPWetQ@mail.gmail.com>
Date:   Thu, 8 Dec 2022 09:37:47 +0100
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Lukasz Luba <lukasz.luba@....com>
Cc:     linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        rafael@...nel.org, viresh.kumar@...aro.org,
        dietmar.eggemann@....com, saravanak@...gle.com,
        wusamuel@...gle.com, isaacmanjarres@...gle.com,
        kernel-team@...roid.com, juri.lelli@...hat.com,
        peterz@...radead.org, mingo@...hat.com, rostedt@...dmis.org,
        bsegall@...gle.com, mgorman@...e.de
Subject: Re: [PATCH v2 2/2] cpufreq: schedutil: Optimize operations with
 single max CPU capacity

On Wed, 7 Dec 2022 at 11:17, Lukasz Luba <lukasz.luba@....com> wrote:
>
> The max CPU capacity is the same for all CPUs sharing a frequency domain
> and thus a 'policy' object. There is a way to avoid heavy operations
> in a loop for each CPU by leveraging this knowledge. Thus, simplify
> the looping code in the sugov_next_freq_shared() and drop heavy
> multiplications. Instead, use simple max() to get the highest utilization
> from these CPUs. This is useful for platforms with many (4 or 6) little
> CPUs.
>
> The max CPU capacity must be fetched every time we are called, due to
> difficulties during the policy setup, where we are not able to get the
> normalized CPU capacity at the right time.
>
> The stored value in sugov_policy::max is also then used in
> sugov_iowait_apply() to calculate the right boost. Thus, that field is
> useful to have in that sugov_policy struct.
>
> Signed-off-by: Lukasz Luba <lukasz.luba@....com>
> ---
>  kernel/sched/cpufreq_schedutil.c | 22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index c19d6de67b7a..f9881f3d9488 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -158,10 +158,8 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
>
>  static void sugov_get_util(struct sugov_cpu *sg_cpu)
>  {
> -       struct sugov_policy *sg_policy = sg_cpu->sg_policy;
>         struct rq *rq = cpu_rq(sg_cpu->cpu);
>
> -       sg_policy->max = arch_scale_cpu_capacity(sg_cpu->cpu);
>         sg_cpu->bw_dl = cpu_bw_dl(rq);
>         sg_cpu->util = effective_cpu_util(sg_cpu->cpu, cpu_util_cfs(sg_cpu->cpu),
>                                           FREQUENCY_UTIL, NULL);
> @@ -317,6 +315,8 @@ static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu)
>  static inline bool sugov_update_single_common(struct sugov_cpu *sg_cpu,
>                                               u64 time, unsigned int flags)
>  {
> +       struct sugov_policy *sg_policy = sg_cpu->sg_policy;
> +
>         sugov_iowait_boost(sg_cpu, time, flags);
>         sg_cpu->last_update = time;
>
> @@ -325,6 +325,9 @@ static inline bool sugov_update_single_common(struct sugov_cpu *sg_cpu,
>         if (!sugov_should_update_freq(sg_cpu->sg_policy, time))
>                 return false;
>
> +       /* Fetch the latest CPU capacity to avoid stale data */
> +       sg_policy->max = arch_scale_cpu_capacity(sg_cpu->cpu);
> +
>         sugov_get_util(sg_cpu);
>         sugov_iowait_apply(sg_cpu, time);
>
> @@ -414,25 +417,22 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 time)
>  {
>         struct sugov_policy *sg_policy = sg_cpu->sg_policy;
>         struct cpufreq_policy *policy = sg_policy->policy;
> -       unsigned long util = 0, max = 1;
> +       unsigned long util = 0;
>         unsigned int j;
>
> +       /* Fetch the latest CPU capacity to avoid stale data */
> +       sg_policy->max = arch_scale_cpu_capacity(sg_cpu->cpu);
> +
>         for_each_cpu(j, policy->cpus) {
>                 struct sugov_cpu *j_sg_cpu = &per_cpu(sugov_cpu, j);
> -               unsigned long j_util, j_max;
>
>                 sugov_get_util(j_sg_cpu);
>                 sugov_iowait_apply(j_sg_cpu, time);
> -               j_util = j_sg_cpu->util;
> -               j_max = j_sg_cpu->max;
>
> -               if (j_util * max > j_max * util) {
> -                       util = j_util;
> -                       max = j_max;
> -               }
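
For reference, the removed comparison selects the CPU with the highest
util/max ratio without dividing, by cross-multiplying. When every CPU in
the policy reports the same max, that selection reduces to a plain max()
on util. The following is a minimal user-space sketch of that
equivalence, not kernel code: struct cpu_sample, pick_by_ratio() and
pick_by_max() are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU (util, max) pair. */
struct cpu_sample {
	unsigned long util;
	unsigned long max;
};

/*
 * Models the removed loop: keep the sample with the highest util/max
 * ratio, compared by cross-multiplication to avoid a division.
 */
static struct cpu_sample pick_by_ratio(const struct cpu_sample *s, size_t n)
{
	struct cpu_sample best = { .util = 0, .max = 1 };
	size_t i;

	for (i = 0; i < n; i++) {
		if (s[i].util * best.max > s[i].max * best.util)
			best = s[i];
	}
	return best;
}

/* Models the new loop: with a single shared max, only util matters. */
static unsigned long pick_by_max(const struct cpu_sample *s, size_t n)
{
	unsigned long util = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (s[i].util > util)
			util = s[i].util;
	}
	return util;
}
```

With four samples that all carry max = 446 and util values 300, 120,
410 and 50, both helpers select util = 410.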

With the code removed above, max is only used in 2 places:
- sugov_iowait_apply
- map_util_freq

I wonder if it would be better to just call arch_scale_cpu_capacity()
in these 2 places instead of saving a copy in sg_policy and then
reading it twice.

arch_scale_cpu_capacity() already reads a per-CPU variable, so calling it
should be pretty cheap.

Thoughts?
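
To make the suggestion concrete, here is a rough user-space sketch of
looking the capacity up at the point of use instead of caching it in the
policy. It is only a model under stated assumptions: cpu_scale_model[],
capacity_of_model(), map_util_freq_model() and next_freq_model() are
hypothetical names, the capacity values are made up, and the 25%
headroom and the freq * util / cap mapping are written from memory of
what get_next_freq() does.

```c
#define NR_CPUS_MODEL 8

/* Made-up capacities: four little CPUs and four big CPUs. */
static const unsigned long cpu_scale_model[NR_CPUS_MODEL] = {
	446, 446, 446, 446, 1024, 1024, 1024, 1024,
};

/* Stand-in for a cheap per-CPU capacity read. */
static unsigned long capacity_of_model(int cpu)
{
	return cpu_scale_model[cpu];
}

/* Assumed mapping from utilization to frequency: freq * util / cap. */
static unsigned long map_util_freq_model(unsigned long util,
					 unsigned long freq,
					 unsigned long cap)
{
	return freq * util / cap;
}

/*
 * The capacity is fetched where it is consumed, so no copy has to be
 * stored in the policy structure and refreshed on every update.
 */
static unsigned long next_freq_model(int cpu, unsigned long util,
				     unsigned long max_freq)
{
	unsigned long cap = capacity_of_model(cpu);

	util += util >> 2;	/* assumed 25% headroom */
	return map_util_freq_model(util, max_freq, cap);
}
```

The same lookup would be repeated in the iowait boost path, which is the
second user of max.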

> +               util = max(j_sg_cpu->util, util);
>         }
>
> -       return get_next_freq(sg_policy, util, max);
> +       return get_next_freq(sg_policy, util, sg_policy->max);
>  }
>
>  static void
> --
> 2.17.1
>
