lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 6 Jul 2017 21:43:26 -0700
From:   Joel Fernandes <joelaf@...gle.com>
To:     Patrick Bellasi <patrick.bellasi@....com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Linux PM <linux-pm@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Juri Lelli <juri.lelli@....com>,
        Andres Oportus <andresoportus@...gle.com>,
        Todd Kjos <tkjos@...roid.com>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>
Subject: Re: [PATCH v2 2/6] cpufreq: schedutil: reset sg_cpus's flags at IDLE enter

On Tue, Jul 4, 2017 at 10:34 AM, Patrick Bellasi
<patrick.bellasi@....com> wrote:
> Currently, sg_cpu's flags are set to the value defined by the last call of
> the cpufreq_update_util()/cpufreq_update_this_cpu(); for RT/DL classes
> this corresponds to the SCHED_CPUFREQ_{RT/DL} flags always being set.
>
> When multiple CPU shares the same frequency domain it might happen that a
> CPU which executed a RT task, right before entering IDLE, has one of the
> SCHED_CPUFREQ_RT_DL flags set, permanently, until it exits IDLE.
>
> Although such an idle CPU is _going to be_ ignored by the
> sugov_next_freq_shared():
>   1. this kind of "useless RT requests" are ignored only if more then
>      TICK_NSEC have elapsed since the last update
>   2. we can still potentially trigger an already too late switch to
>      MAX, which starts also a new throttling interval
>   3. the internal state machine is not consistent with what the
>      scheduler knows, i.e. the CPU is now actually idle
>
> Thus, in sugov_next_freq_shared(), where utilisation and flags are
> aggregated across all the CPUs of a frequency domain, it can turn out
> that all the CPUs of that domain can run unnecessary at the maximum OPP
> until another event happens in the idle CPU, which eventually clear the
> SCHED_CPUFREQ_{RT/DL} flag, or the IDLE CPUs gets ignored after
> TICK_NSEC since the CPU entering IDLE.
>
> Such a behaviour can harm the energy efficiency of systems where RT
> workloads are not so frequent and other CPUs in the same frequency
> domain are running small utilisation workloads, which is a quite common
> scenario in mobile embedded systems.
>
> This patch proposes a solution which is aligned with the current principle
> to update the flags each time a scheduling event happens. The scheduling
> of the idle_task on a CPU is considered one of such meaningful events.
> That's why when the idle_task is selected for execution we poke the
> schedutil policy to reset the flags for that CPU.
>
> No frequency transitions are activated at that point, which is fair in
> case the RT workload should come back in the future. However, this still
> allows other CPUs in the same frequency domain to scale down the
> frequency in case that should be possible.
>
> Signed-off-by: Patrick Bellasi <patrick.bellasi@....com>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> Cc: Viresh Kumar <viresh.kumar@...aro.org>
> Cc: linux-kernel@...r.kernel.org
> Cc: linux-pm@...r.kernel.org
>
> ---
> Changes from v1:
> - added "unlikely()" around the statement (SteveR)
> ---
>  include/linux/sched/cpufreq.h    | 1 +
>  kernel/sched/cpufreq_schedutil.c | 7 +++++++
>  kernel/sched/idle_task.c         | 4 ++++
>  3 files changed, 12 insertions(+)
>
> diff --git a/include/linux/sched/cpufreq.h b/include/linux/sched/cpufreq.h
> index d2be2cc..36ac8d2 100644
> --- a/include/linux/sched/cpufreq.h
> +++ b/include/linux/sched/cpufreq.h
> @@ -10,6 +10,7 @@
>  #define SCHED_CPUFREQ_RT       (1U << 0)
>  #define SCHED_CPUFREQ_DL       (1U << 1)
>  #define SCHED_CPUFREQ_IOWAIT   (1U << 2)
> +#define SCHED_CPUFREQ_IDLE     (1U << 3)
>
>  #define SCHED_CPUFREQ_RT_DL    (SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index eaba6d6..004ae18 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -304,6 +304,12 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
>
>         sg_cpu->util = util;
>         sg_cpu->max = max;
> +
> +       /* CPU is entering IDLE, reset flags without triggering an update */
> +       if (unlikely(flags & SCHED_CPUFREQ_IDLE)) {
> +               sg_cpu->flags = 0;
> +               goto done;
> +       }

Instead of defining a new flag for idle, wouldn't another way be to
just clear the flag from the RT scheduling class with an extra call to
cpufreq_update_util with flags = 0 during dequeue_rt_entity? That
seems to me to be also the right place to clear the flag since the
flag is set in the corresponding class to begin with.

thanks,

-Joel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ