Message-ID: <CAAYoRsV2_gmbd84GCfAZk2ueRDPXczNgVAaqX7QbLf2Ljp=fBg@mail.gmail.com>
Date:   Sat, 17 Sep 2022 15:58:47 -0700
From:   Doug Smythies <dsmythies@...us.net>
To:     Kajetan Puchalski <kajetan.puchalski@....com>
Cc:     rafael@...nel.org, daniel.lezcano@...aro.org, lukasz.luba@....com,
        Dietmar.Eggemann@....com, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Doug Smythies <dsmythies@...us.net>
Subject: Re: [RFC PATCH 1/1] cpuidle: teo: Add optional util-awareness

On Thu, Sep 15, 2022 at 9:45 AM Kajetan Puchalski
<kajetan.puchalski@....com> wrote:
>
> Modern interactive systems, such as recent Android phones, tend to have
> power efficient shallow idle states. Selecting deeper idle states on a
> device while a latency-sensitive workload is running can adversely impact
> performance due to increased latency. Additionally, if the CPU wakes up
> from a deeper sleep before its target residency as is often the case, it
> results in a waste of energy on top of that.
>
> This patch extends the TEO governor with an optional mechanism adding
> util-awareness, effectively providing a way for the governor to switch
> between only selecting the shallowest idle state when the cpu is being
> utilized over a certain threshold and trying to select the deepest possible
> state using TEO's metrics when the cpu is not being utilized. This is now
> possible since the CPU utilization is exported from the scheduler with the
> sched_cpu_util function and already used e.g. in the thermal governor IPA.
>
> This can provide drastically decreased latency and performance benefits in
> certain types of mobile workloads that are sensitive to latency,
> such as Geekbench 5.
>
> Signed-off-by: Kajetan Puchalski <kajetan.puchalski@....com>
> ---
>  drivers/cpuidle/Kconfig         | 12 +++++
>  drivers/cpuidle/governors/teo.c | 86 +++++++++++++++++++++++++++++++++
>  2 files changed, 98 insertions(+)
>
> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> index ff71dd662880..6b66ee88a2b2 100644
> --- a/drivers/cpuidle/Kconfig
> +++ b/drivers/cpuidle/Kconfig
> @@ -33,6 +33,18 @@ config CPU_IDLE_GOV_TEO
>           Some workloads benefit from using it and it generally should be safe
>           to use.  Say Y here if you are not happy with the alternatives.
>
> +config CPU_IDLE_GOV_TEO_UTIL_AWARE
> +       bool "Util-awareness mechanism for TEO"
> +       depends on CPU_IDLE_GOV_TEO
> +       help
> +         Util-awareness mechanism for the TEO governor. With this enabled,
> +         the governor will choose the shallowest available state when the
> +         CPU's average util is above a certain threshold and default to
> +         using the metrics-based approach when it's not.
> +
> +         Some latency-sensitive workloads on interactive devices can benefit
> +         from using it.
> +
>  config CPU_IDLE_GOV_HALTPOLL
>         bool "Haltpoll governor (for virtualized systems)"
>         depends on KVM_GUEST
> diff --git a/drivers/cpuidle/governors/teo.c b/drivers/cpuidle/governors/teo.c
> index d9262db79cae..fd5b2eb750be 100644
> --- a/drivers/cpuidle/governors/teo.c
> +++ b/drivers/cpuidle/governors/teo.c
> @@ -2,8 +2,13 @@
>  /*
>   * Timer events oriented CPU idle governor
>   *
> + * TEO governor:
>   * Copyright (C) 2018 - 2021 Intel Corporation
>   * Author: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> + *
> + * Util-awareness mechanism:
> + * Copyright (C) 2022 Arm Ltd.
> + * Author: Kajetan Puchalski <kajetan.puchalski@....com>
>   */
>
>  /**
> @@ -99,14 +104,48 @@
>   *      select the given idle state instead of the candidate one.
>   *
>   * 3. By default, select the candidate state.
> + *
> + * Util-awareness mechanism:
> + *
> + * The idea behind the util-awareness extension is that there are two distinct
> + * scenarios for the CPU which should result in two different approaches to idle
> + * state selection - utilized and not utilized.
> + *
> + * In this case, 'utilized' means that the average runqueue util of the CPU is
> + * above a certain threshold.
> + *
> + * When the CPU is utilized while going into idle, more likely than not it will
> + * be woken up to do more work soon and so the shallowest idle state should be
> + * selected to minimise latency and maximise performance. When the CPU is not
> + * being utilized, the usual metrics-based approach to selecting the deepest
> + * available idle state should be preferred to take advantage of the power
> + * saving.
> + *
> + * In order to achieve this, the governor uses a utilization threshold.
> + * The threshold is computed per-cpu as a percentage of the CPU's capacity
> + * by bit shifting the capacity value. Based on testing, the shift of 6 (~1.56%)
> + * seems to be getting the best results.
> + *
> + * Before selecting the next idle state, the governor compares the current CPU
> + * util to the precomputed util threhsold. If it's below, it defaults to the

s/threhsold/threshold/

> + * TEO metrics mechanism. If it's above, it simply selects the shallowest
> + * enabled idle state.
>   */
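
To make the threshold arithmetic in the comment above concrete, here is a
minimal user-space sketch. It is not kernel code: util_threshold() and
cpu_is_utilized() are hypothetical helper names, and only
UTIL_THRESHOLD_SHIFT and the strict "util > threshold" comparison are taken
from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors UTIL_THRESHOLD_SHIFT from the quoted patch. */
#define UTIL_THRESHOLD_SHIFT 6

/*
 * Hypothetical helper: the threshold is capacity / 64, i.e. about 1.56%
 * of the CPU's capacity, rounded down by the shift.
 */
static unsigned long util_threshold(unsigned long capacity)
{
	return capacity >> UTIL_THRESHOLD_SHIFT;
}

/*
 * Hypothetical helper: strict comparison, as in teo_get_util() in the
 * quoted patch.
 */
static bool cpu_is_utilized(unsigned long util, unsigned long capacity)
{
	/* Assumption of this sketch: util never exceeds capacity. */
	assert(util <= capacity);
	return util > util_threshold(capacity);
}
```

With a capacity of 1024 the threshold works out to 16, so a util of 17
counts as utilized while a util of 16 does not.
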
>
>  #include <linux/cpuidle.h>
>  #include <linux/jiffies.h>
>  #include <linux/kernel.h>
> +#include <linux/sched.h>

I think the patch also needs this include:

+#include <linux/sched/topology.h>

At least for me, it did not compile without it, presumably because
arch_scale_cpu_capacity() is declared there.

>  #include <linux/sched/clock.h>
>  #include <linux/tick.h>
>
> +/*
> + * The number of bits to shift the cpu's capacity by in order to determine
> + * the utilized threshold
> + */
> +#define UTIL_THRESHOLD_SHIFT 6
> +
> +
>  /*
>   * The PULSE value is added to metrics when they grow and the DECAY_SHIFT value
>   * is used for decreasing metrics on a regular basis.
> @@ -140,6 +179,8 @@ struct teo_bin {
>   * @total: Grand total of the "intercepts" and "hits" mertics for all bins.

s/mertics/metrics/

>   * @next_recent_idx: Index of the next @recent_idx entry to update.
>   * @recent_idx: Indices of bins corresponding to recent "intercepts".
> + * @util_threshold: Threshold above which the CPU is considered utilized
> + * @utilized: Whether the last sleep on the CPU happened while utilized
>   */
>  struct teo_cpu {
>         s64 time_span_ns;
> @@ -148,10 +189,28 @@ struct teo_cpu {
>         unsigned int total;
>         int next_recent_idx;
>         int recent_idx[NR_RECENT];
> +#ifdef CONFIG_CPU_IDLE_GOV_TEO_UTIL_AWARE
> +       unsigned long util_threshold;
> +       bool utilized;
> +#endif
>  };
>
>  static DEFINE_PER_CPU(struct teo_cpu, teo_cpus);
>
> +#ifdef CONFIG_CPU_IDLE_GOV_TEO_UTIL_AWARE
> +/**
> + * teo_get_util - Update the CPU utilized status
> + * @dev: Target CPU
> + * @cpu_data: Governor CPU data for the target CPU
> + */
> +static void teo_get_util(struct cpuidle_device *dev, struct teo_cpu *cpu_data)
> +{
> +       unsigned long util = sched_cpu_util(dev->cpu);
> +
> +       cpu_data->utilized = util > cpu_data->util_threshold;
> +}
> +#endif
> +
>  /**
>   * teo_update - Update CPU metrics after wakeup.
>   * @drv: cpuidle driver containing state data.
> @@ -301,7 +360,13 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>         int i;
>
>         if (dev->last_state_idx >= 0) {
> +#ifdef CONFIG_CPU_IDLE_GOV_TEO_UTIL_AWARE
> +               /* don't update metrics if the cpu was utilized during the last sleep */
> +               if (!cpu_data->utilized)
> +                       teo_update(drv, dev);
> +#else
>                 teo_update(drv, dev);
> +#endif
>                 dev->last_state_idx = -1;
>         }
>
> @@ -321,6 +386,21 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>                         goto end;
>         }
>
> +#ifdef CONFIG_CPU_IDLE_GOV_TEO_UTIL_AWARE
> +       teo_get_util(dev, cpu_data);
> +       /* if the cpu is being utilized, choose the shallowest state and exit */
> +       if (cpu_data->utilized) {
> +               for (i = 0; i < drv->state_count; ++i) {
> +                       if (dev->states_usage[i].disable)
> +                               continue;
> +                       break;
> +               }
> +
> +               idx = i;
> +               goto end;
> +       }
> +#endif
> +
>         /*
>          * Find the deepest idle state whose target residency does not exceed
>          * the current sleep length and the deepest idle state not deeper than
> @@ -508,9 +588,15 @@ static int teo_enable_device(struct cpuidle_driver *drv,
>                              struct cpuidle_device *dev)
>  {
>         struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu);
> +#ifdef CONFIG_CPU_IDLE_GOV_TEO_UTIL_AWARE
> +       unsigned long max_capacity = arch_scale_cpu_capacity(dev->cpu);
> +#endif
>         int i;
>
>         memset(cpu_data, 0, sizeof(*cpu_data));
> +#ifdef CONFIG_CPU_IDLE_GOV_TEO_UTIL_AWARE
> +       cpu_data->util_threshold = max_capacity >> UTIL_THRESHOLD_SHIFT;
> +#endif
>
>         for (i = 0; i < NR_RECENT; i++)
>                 cpu_data->recent_idx[i] = -1;
> --
> 2.37.1
>
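
For reference, the "choose the shallowest enabled state" loop from the
teo_select() hunk can be sketched in user space as below. This is only an
illustration under stated assumptions: shallowest_enabled_state() is a
hypothetical name, and a plain bool array stands in for
dev->states_usage[i].disable. The fallback to index 0 when every state is
disabled is an assumption of this sketch; the quoted loop would leave i
equal to drv->state_count in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: return the index of the
 * first (shallowest) state that is not disabled.
 */
static int shallowest_enabled_state(const bool *disabled, int state_count)
{
	int i;

	assert(disabled != NULL && state_count > 0);

	for (i = 0; i < state_count; i++) {
		if (!disabled[i])
			return i;
	}

	/* Assumption of this sketch: use state 0 if everything is disabled. */
	return 0;
}
```

If state 0 is disabled and state 1 is not, the helper returns 1, which is
the index the utilized path in the patch would pick as idx.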
