linux-kernel - [PATCH v3 0/2] sched: Consider CPU contention in frequency, EAS max util & load-balance busiest CPU selection


Message-Id: <20230515115735.296329-1-dietmar.eggemann@arm.com>
Date:   Mon, 15 May 2023 13:57:33 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Qais Yousef <qyousef@...alina.io>,
        Kajetan Puchalski <kajetan.puchalski@....com>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Vincent Donnefort <vdonnefort@...gle.com>,
        Quentin Perret <qperret@...gle.com>,
        Abhijeet Dharmapurikar <adharmap@...cinc.com>,
        linux-kernel@...r.kernel.org
Subject: [PATCH v3 0/2] sched: Consider CPU contention in frequency, EAS max util & load-balance busiest CPU selection

This series implements the idea of factoring the CPU runnable_avg into
the CPU utilization getter functions (so-called 'runnable boosting') as
a way to account for CPU contention in:

  (a) CPU frequency
  (b) EAS' max util and
  (c) 'migrate_util' type load-balance busiest CPU selection.
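
The idea above can be illustrated with a minimal userspace sketch. It is
not the kernel code: struct cfs_signals, cpu_util_sketch() and the
clamping to a caller-supplied capacity are assumptions made for
illustration only. The only part taken from this cover letter is
'util_avg = max(util_avg, runnable_avg)' applied when boosting is
requested.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-CPU CFS PELT signals. */
struct cfs_signals {
	unsigned long util_avg;     /* time spent running */
	unsigned long runnable_avg; /* time spent running or waiting to run */
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Sketch of a utilization getter with optional 'runnable boosting'.
 * With boost set, contention (runnable_avg above util_avg) raises the
 * returned value. Clamping to 'capacity' is an assumption of this
 * sketch.
 */
unsigned long cpu_util_sketch(const struct cfs_signals *s,
			      unsigned long capacity, bool boost)
{
	unsigned long util = s->util_avg;

	assert(capacity > 0);

	if (boost)
		util = max_ul(util, s->runnable_avg);

	return min_ul(util, capacity);
}
```

On a CPU without contention runnable_avg stays close to util_avg, so the
boosted and the non-boosted value are roughly the same in this sketch.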

Tests:

for (a) and (b):

The testcase is Jankbench (all subtests, 10 iterations) on a Pixel6
(Android 12) with a mainline v5.18 kernel and forward-ported task
scheduler patches.

Uclamp has been deactivated so that the Android Dynamic Performance
Framework (ADPF) 'CPU performance hints' feature (userspace task
boosting via uclamp_min) does not interfere.

Max_frame_duration:
+-----------------+------------+
|     kernel      | value [ms] |
+-----------------+------------+
|      base       | 163.061513 |
|    runnable     | 161.991705 |
+-----------------+------------+

Mean_frame_duration:
+-----------------+------------+----------+
|     kernel      | value [ms] | diff [%] |
+-----------------+------------+----------+
|      base       |    18.0    |    0.0   |
|    runnable     |    12.7    |  -29.43  |
+-----------------+------------+----------+

Jank percentage (Jank deadline 16ms):
+-----------------+------------+----------+
|     kernel      | value [%]  | diff [%] |
+-----------------+------------+----------+
|      base       |     3.6    |    0.0   |
|    runnable     |     1.0    |  -68.86  |
+-----------------+------------+----------+

Power usage [mW] (total - all CPUs):
+-----------------+------------+----------+
|     kernel      | value [mW] | diff [%] |
+-----------------+------------+----------+
|      base       |    129.5   |    0.0   |
|    runnable     |    134.3   |   3.71*  |
+-----------------+------------+----------+

* Power usage went up from 129.3 (-0.15%) in v1 to 134.3 (3.71%),
whereas all the other benchmark numbers stayed roughly the same. This is
probably because 'runnable boosting' is now used for EAS max util as
well, so tasks more often end up running on non-little CPUs.
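
For reference, the 'diff [%]' columns read as the relative change
against the base kernel. The helper below is a hypothetical sketch
(pct_diff() is not part of the series or of any benchmark tool) that
reproduces the power numbers from the rounded table values. For the
other tables the rounded inputs give slightly different results than
the printed diffs, which were presumably computed from unrounded data.

```c
/*
 * Relative change of 'value' against 'base' in percent.
 * Hypothetical helper for illustration only.
 */
double pct_diff(double base, double value)
{
	return (value - base) / base * 100.0;
}

/* Power usage [mW], taken from the table and the footnote above. */
double power_diff_v3(void)
{
	return pct_diff(129.5, 134.3); /* runnable vs. base in this version */
}

double power_diff_v1(void)
{
	return pct_diff(129.5, 129.3); /* runnable vs. base in v1 */
}
```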

for (c):

The testcase is 'perf bench sched messaging' on an Arm64 Ampere Altra
with 160 CPUs (sched domains = {MC, DIE, NUMA}), which shows a small
improvement:

perf stat --null --repeat 10 -- perf bench sched messaging -t -g 1 -l 2000

0.4869 +- 0.0173 seconds time elapsed (+- 3.55%) ->
0.4377 +- 0.0147 seconds time elapsed (+- 3.36%)

Chen Yu tested v1** with schbench, hackbench, netperf and tbench on an
Intel Sapphire Rapids with 2x56C/112T = 224 CPUs which showed no obvious
difference and some small improvements on tbench:

https://lkml.kernel.org/r/ZFSr4Adtx1ZI8hoc@chenyu5-mobl1

** The implementation for (c) hasn't changed in v2.

v1 -> v2:

(1) Refactor CPU utilization getter functions, let cpu_util_cfs() call
    cpu_util_next() (now cpu_util()).

(2) Consider CPU contention in EAS (find_energy_efficient_cpu() ->
    eenv_pd_max_util()) next to schedutil (sugov_get_util()) as well so
    that EAS' and schedutil's views on CPU frequency selection are in
    sync.

(3) Move 'util_avg = max(util_avg, runnable_avg)' from
    cpu_boosted_util_cfs() to cpu_util_next() (now cpu_util()) so that
    EAS can use it too.

(4) Rework patch header.

(5) Add test results (JankbenchX on Pixel6 to test changes in schedutil
    and EAS) and 'perf bench sched messaging' on Arm64 Ampere Altra for
    CFS load-balance (find_busiest_queue()).

v2 -> v3:

(1) Move function header from cpu_util_cfs() to cpu_util() and add a
    paragraph about 'runnable boosting'.

(2) Create cpu_util_cfs_boost() and call it for sites which want to use
    'runnable boosting'.

(3) Use regular 'if (boost)' in cpu_util().
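
The v3 split between a regular and a boosting getter can be sketched as
two thin wrappers around one common function. This is a simplified
userspace illustration, not the patch itself: the cpu_signals table, the
two-argument cpu_util() signature and the capacity clamp are assumptions
of this sketch, and only the function names and the 'if (boost)' test
come from the changelog above.

```c
#include <stdbool.h>

#define NR_CPUS_SKETCH 2

/* Hypothetical per-CPU signals, standing in for the CFS runqueue data. */
struct cpu_signals {
	unsigned long util_avg;
	unsigned long runnable_avg;
	unsigned long capacity;
};

static struct cpu_signals cpus[NR_CPUS_SKETCH] = {
	{ .util_avg = 200, .runnable_avg = 600, .capacity = 1024 }, /* contended */
	{ .util_avg = 500, .runnable_avg = 300, .capacity = 1024 }, /* not contended */
};

/* Common getter: boosting is selected with a plain 'if (boost)'. */
unsigned long cpu_util(int cpu, bool boost)
{
	unsigned long util = cpus[cpu].util_avg;

	if (boost && cpus[cpu].runnable_avg > util)
		util = cpus[cpu].runnable_avg;

	return util < cpus[cpu].capacity ? util : cpus[cpu].capacity;
}

/* For call sites which do not want 'runnable boosting'. */
unsigned long cpu_util_cfs(int cpu)
{
	return cpu_util(cpu, false);
}

/* For call sites which want 'runnable boosting'. */
unsigned long cpu_util_cfs_boost(int cpu)
{
	return cpu_util(cpu, true);
}
```

Keeping the boost decision in the wrapper name makes it visible at each
call site whether contention is taken into account.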

Dietmar Eggemann (2):
  sched/fair: Refactor CPU utilization functions
  sched/fair, cpufreq: Introduce 'runnable boosting'

 kernel/sched/cpufreq_schedutil.c |  3 +-
 kernel/sched/fair.c              | 87 ++++++++++++++++++++++++++------
 kernel/sched/sched.h             | 48 +-----------------
 3 files changed, 76 insertions(+), 62 deletions(-)

-- 
2.25.1