Message-Id: <20210610150324.22919-1-lukasz.luba@arm.com>
Date:   Thu, 10 Jun 2021 16:03:21 +0100
From:   Lukasz Luba <lukasz.luba@....com>
To:     linux-kernel@...r.kernel.org
Cc:     linux-pm@...r.kernel.org, peterz@...radead.org, rjw@...ysocki.net,
        viresh.kumar@...aro.org, vincent.guittot@...aro.org,
        qperret@...gle.com, dietmar.eggemann@....com,
        vincent.donnefort@....com, lukasz.luba@....com,
        Beata.Michalska@....com, mingo@...hat.com, juri.lelli@...hat.com,
        rostedt@...dmis.org, segall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, thara.gopinath@...aro.org,
        amit.kachhap@...il.com, amitk@...nel.org, rui.zhang@...el.com,
        daniel.lezcano@...aro.org
Subject: [PATCH v3 0/3] Add allowed CPU capacity knowledge to EAS

Hi all,

This is v3 of the patch set that adds knowledge about reduced CPU capacity
to the Energy Model (EM) and the Energy Aware Scheduler (EAS). The issue
today is that the SchedUtil CPU frequency and the EM frequency are not
aligned when a CPU is thermally capped, which causes an energy estimation
error. This patch set passes the allowed CPU capacity (derived from the
thermal pressure information) to the EM, which improves the energy
estimation. More information about the mechanism can be found in the
comments in the patches.
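
As a rough illustration of the mismatch (a simplified sketch, not code
from the patches), the snippet below compares the frequency an estimator
would assume with and without thermal capping. All helper names are
hypothetical, the utilization-to-frequency mapping is a plain linear one
that ignores any headroom margin, and the numbers in the note after it
are made up:

```c
/*
 * Hypothetical helper: capacity still available to a CPU once the
 * thermal pressure (capacity lost to capping) has been subtracted.
 */
static unsigned long allowed_capacity(unsigned long max_cap,
				      unsigned long thermal_pressure)
{
	return thermal_pressure >= max_cap ? 0 : max_cap - thermal_pressure;
}

/* Hypothetical helper: utilization cannot exceed the allowed capacity. */
static unsigned long clamp_util(unsigned long util, unsigned long allowed_cap)
{
	return util < allowed_cap ? util : allowed_cap;
}

/*
 * Hypothetical helper: linear mapping from utilization to frequency,
 * assuming max_cap corresponds to max_freq.
 */
static unsigned long freq_for_util(unsigned long util, unsigned long max_freq,
				   unsigned long max_cap)
{
	return max_freq * util / max_cap;
}
```

With max_cap = 1024, max_freq = 2000000 kHz, util = 900 and a thermal
pressure of 256, the uncapped estimate is freq_for_util(900, ...) =
1757812 kHz, while the capped CPU can only reach
freq_for_util(clamp_util(900, 768), ...) = 1500000 kHz. An energy
estimate based on the first value does not match what the capped CPU
actually does.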

There is a new patch 1/3 in this v3, addressing an issue triggered by
hotplugged-out CPUs. Offline CPUs don't have a proper value stored by the
thermal framework in their per-cpu thermal_pressure. Thus, the thermal
pressure geometric series machinery reads a 'stale' value when the CPU
comes back online. The patch fixes this, so all mechanisms such as
load balance, not only EAS, have more accurate CPU capacity
information for those 'returning online' CPUs. I've also added the related
cpu cooling maintainers to the CC of this patch set.
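
The stale-value problem can be pictured with a toy model (an assumption
for illustration only, not the cpufreq_cooling code): a plain array
stands in for the per-cpu thermal_pressure, and the two update functions
are hypothetical.

```c
#include <stdbool.h>

#define NR_CPUS 4

/* Stand-in for the per-cpu thermal_pressure variable. */
static unsigned long thermal_pressure[NR_CPUS];

/* Updating only online CPUs leaves the old value on offline ones. */
static void update_online_only(const bool online[NR_CPUS],
			       unsigned long pressure)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (online[cpu])
			thermal_pressure[cpu] = pressure;
	}
}

/* Updating every CPU keeps the value valid for CPUs that return later. */
static void update_all(unsigned long pressure)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		thermal_pressure[cpu] = pressure;
}
```

If all CPUs start with a pressure of 300 and capping is then lifted while
CPU2 is offline, update_online_only() leaves thermal_pressure[2] at 300,
so CPU2 would still look capped when it comes back online; update_all()
clears it.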

Changelog:
v3:
- switched to 'raw' per-cpu thermal pressure instead of the thermal pressure
  geometric series signal, since it is better suited to this use case:
  predicting SchedUtil frequency (Vincent, Dietmar)
- added more comments in the patch 2/3 header for the use case where thermal
  capping might be applied even if the CPUs are not over-utilized
  (Dietmar)
- added ACK tag from Rafael for SchedUtil part
- added a fix patch for the missing per-cpu thermal_pressure update for
  offline CPUs in cpufreq_cooling
v2 [2]:
- clamp the returned value from effective_cpu_util() and avoid irq
  util scaling issues (Quentin)
v1 is available at [1]

Regards,
Lukasz

[1] https://lore.kernel.org/linux-pm/20210602135609.10867-1-lukasz.luba@arm.com/
[2] https://lore.kernel.org/lkml/20210604080954.13915-1-lukasz.luba@arm.com/

Lukasz Luba (3):
  thermal: cpufreq_cooling: Update also offline CPUs per-cpu
    thermal_pressure
  sched/fair: Take thermal pressure into account while estimating energy
  sched/cpufreq: Consider reduced CPU capacity in energy calculation

 drivers/thermal/cpufreq_cooling.c |  2 +-
 include/linux/energy_model.h      | 16 +++++++++++++---
 include/linux/sched/cpufreq.h     |  2 +-
 kernel/sched/cpufreq_schedutil.c  |  1 +
 kernel/sched/fair.c               | 14 ++++++++++----
 5 files changed, 26 insertions(+), 9 deletions(-)

-- 
2.17.1
