lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230113052141.2874296-1-wyes.karny@amd.com>
Date:   Fri, 13 Jan 2023 05:21:35 +0000
From:   Wyes Karny <wyes.karny@....com>
To:     Rafael J Wysocki <rafael@...nel.org>,
        Huang Rui <ray.huang@....com>,
        Jonathan Corbet <corbet@....net>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        <Mario.Limonciello@....com>, <Perry.Yuan@....com>,
        Ananth Narayan <ananth.narayan@....com>,
        <gautham.shenoy@....com>
CC:     <linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <linux-pm@...r.kernel.org>, Bagas Sanjaya <bagasdotme@...il.com>,
        <santosh.shukla@....com>, Wyes Karny <wyes.karny@....com>
Subject: [PATCH v2 0/6] amd_pstate: Add guided autonomous mode support

>From ACPI spec[1] below 3 modes for CPPC can be defined:
1. Non autonomous: OS scaling governor specifies operating frequency/
   performance level through `Desired Performance` register and platform
follows that.
2. Guided autonomous: OS scaling governor specifies min and max
   frequencies/ performance levels through `Minimum Performance` and
`Maximum Performance` register, and platform can autonomously select an
operating frequency in this range.
3. Fully autonomous: OS only hints (via EPP) to platform for the required
   energy performance preference for the workload and platform autonomously
scales the frequency.

Currently (1) is supported by amd_pstate as passive mode, and (3) is
implemented by EPP support[2]. This change is to support (2).

In guided autonomous mode the min_perf is based on the input from the
scaling governor. For example, in case of schedutil this value depends
on the current utilization. And max_perf is set to max capacity.

To activate guided auto mode ``amd_pstate=guided`` command line
parameter has to be passed in the kernel.

Below are the results (normalized) of benchmarks with this patch:
System: Genoa 96C 192T
Kernel: v6.1-rc6 + patch
Scaling governor: schedutil

================ tbench  ================
tbench result comparison: (higher the better)
Here results are throughput (MB/s)
Clients 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
    1	   1.00 (0.00 pct)	   1.16 (16.00 pct)	   2.20 (120.00 pct)
    2	   1.97 (0.00 pct)	   2.29 (16.24 pct)	   4.38 (122.33 pct)
    4	   3.95 (0.00 pct)	   4.51 (14.17 pct)	   8.50 (115.18 pct)
    8	   7.83 (0.00 pct)	   8.89 (13.53 pct)	  16.62 (112.26 pct)
   16	  15.28 (0.00 pct)	  16.81 (10.01 pct)	  31.02 (103.01 pct)
   32	  41.64 (0.00 pct)	  30.67 (-26.34 pct)	  55.63 (33.59 pct)
   64	  91.29 (0.00 pct)	  79.67 (-12.72 pct)	  91.74 (0.49 pct)
  128	 118.06 (0.00 pct)	 122.34 (3.62 pct)	 122.04 (3.37 pct)
  256	 260.47 (0.00 pct)	 264.31 (1.47 pct)	 264.49 (1.54 pct)
  512	 254.16 (0.00 pct)	 245.25 (-3.50 pct)	 245.50 (-3.40 pct)
tbench power comparison: (lower the better)
Clients 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
    1	   1.00 (0.00 pct)	   1.00 (0.00 pct)	   1.15 (15.00 pct)
    2	   0.99 (0.00 pct)	   1.00 (1.01 pct)	   1.17 (18.18 pct)
    4	   1.01 (0.00 pct)	   1.02 (0.99 pct)	   1.24 (22.77 pct)
    8	   1.05 (0.00 pct)	   1.06 (0.95 pct)	   1.36 (29.52 pct)
   16	   1.15 (0.00 pct)	   1.13 (-1.73 pct)	   1.58 (37.39 pct)
   32	   1.71 (0.00 pct)	   1.30 (-23.97 pct)	   1.96 (14.61 pct)
   64	   2.35 (0.00 pct)	   2.15 (-8.51 pct)	   2.36 (0.42 pct)
  128	   2.77 (0.00 pct)	   2.77 (0.00 pct)	   2.78 (0.36 pct)
  256	   3.39 (0.00 pct)	   3.41 (0.58 pct)	   3.43 (1.17 pct)
  512	   3.42 (0.00 pct)	   3.40 (-0.58 pct)	   3.41 (-0.29 pct)

================ dbench  ================
dbench result comparison: (higher the better)
Here results are throughput (MB/s)
Clients 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
    1	   1.00 (0.00 pct)	   0.96 (-4.00 pct)	   1.02 (2.00 pct)
    2	   1.89 (0.00 pct)	   1.90 (0.52 pct)	   1.91 (1.05 pct)
    4	   3.39 (0.00 pct)	   3.31 (-2.35 pct)	   3.38 (-0.29 pct)
    8	   5.56 (0.00 pct)	   5.46 (-1.79 pct)	   5.60 (0.71 pct)
   16	   7.25 (0.00 pct)	   7.90 (8.96 pct)	   8.29 (14.34 pct)
   32	  10.85 (0.00 pct)	  10.00 (-7.83 pct)	  10.40 (-4.14 pct)
   64	  12.30 (0.00 pct)	  11.94 (-2.92 pct)	  11.82 (-3.90 pct)
  128	  12.56 (0.00 pct)	  12.30 (-2.07 pct)	  12.98 (3.34 pct)
  256	   6.55 (0.00 pct)	   6.54 (-0.15 pct)	   7.38 (12.67 pct)
  512	   1.61 (0.00 pct)	   1.58 (-1.86 pct)	   1.95 (21.11 pct)
dbench power comparison: (lower the better)
Clients 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
    1	   1.00 (0.00 pct)	   1.01 (1.00 pct)	   1.05 (5.00 pct)
    2	   1.07 (0.00 pct)	   1.07 (0.00 pct)	   1.09 (1.86 pct)
    4	   1.15 (0.00 pct)	   1.15 (0.00 pct)	   1.16 (0.86 pct)
    8	   1.26 (0.00 pct)	   1.26 (0.00 pct)	   1.27 (0.79 pct)
   16	   1.39 (0.00 pct)	   1.41 (1.43 pct)	   1.43 (2.87 pct)
   32	   1.60 (0.00 pct)	   1.56 (-2.50 pct)	   1.59 (-0.62 pct)
   64	   1.75 (0.00 pct)	   1.75 (0.00 pct)	   1.74 (-0.57 pct)
  128	   1.90 (0.00 pct)	   1.91 (0.52 pct)	   1.93 (1.57 pct)
  256	   1.76 (0.00 pct)	   1.77 (0.56 pct)	   1.85 (5.11 pct)
  512	   1.55 (0.00 pct)	   1.49 (-3.87 pct)	   1.73 (11.61 pct)

================ git-source  ================
git-source result comparison: (higher the better)
Here results are throughput (compilations per 1000 sec)
Threads 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
  192	   1.00 (0.00 pct)	   0.94 (-5.70 pct)	   1.00 (0.00 pct)
git-source power comparison: (lower the better)
Threads 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
  192	   1.00 (0.00 pct)	   1.03 (3.00 pct)	   1.02 (2.00 pct)

================ kernbench  ================
kernbench result comparison: (higher the better)
Here results are throughput (compilations per 1000 sec)
Load 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
32	   1.00 (0.00 pct)	   0.94 (-6.00 pct)	   1.02 (2.00 pct)
48	   1.24 (0.00 pct)	   1.16 (-6.45 pct)	   1.24 (0.00 pct)
64	   1.35 (0.00 pct)	   1.30 (-3.70 pct)	   1.39 (2.96 pct)
96	   1.42 (0.00 pct)	   1.28 (-9.85 pct)	   1.48 (4.22 pct)
128	   1.39 (0.00 pct)	   1.29 (-7.19 pct)	   1.41 (1.43 pct)
192	   1.32 (0.00 pct)	   1.18 (-10.60 pct)	   1.32 (0.00 pct)
256	   1.28 (0.00 pct)	   1.14 (-10.93 pct)	   1.29 (0.78 pct)
384	   1.28 (0.00 pct)	   1.13 (-11.71 pct)	   1.27 (-0.78 pct)
git-source power comparison: (lower the better)
Clients 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
   32	   1.00 (0.00 pct)	   1.04 (4.00 pct)	   0.95 (-5.00 pct)
   48	   0.83 (0.00 pct)	   0.90 (8.43 pct)	   0.82 (-1.20 pct)
   64	   0.80 (0.00 pct)	   0.82 (2.50 pct)	   0.75 (-6.25 pct)
   96	   0.77 (0.00 pct)	   0.81 (5.19 pct)	   0.75 (-2.59 pct)
  128	   0.78 (0.00 pct)	   0.82 (5.12 pct)	   0.75 (-3.84 pct)
  192	   0.84 (0.00 pct)	   0.89 (5.95 pct)	   0.83 (-1.19 pct)
  256	   0.84 (0.00 pct)	   0.89 (5.95 pct)	   0.84 (0.00 pct)
  384	   0.84 (0.00 pct)	   0.90 (7.14 pct)	   0.84 (0.00 pct)

Note: this series is based on top of EPP v9 [3] series

Change log:
v1 -> v2:
- Fix issue with shared mem systems.
- Rebase on top of EPP series.

[1]: https://uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf
[2]: https://lore.kernel.org/lkml/20221110175847.3098728-1-Perry.Yuan@amd.com/
[3]: https://lore.kernel.org/linux-pm/20221225163442.2205660-1-perry.yuan@amd.com/

Wyes Karny (6):
  acpi: cppc: Add min and max perf reg writing support
  acpi: cppc: Add auto select register read/write support
  cpufreq: amd_pstate: Add guided autonomous mode
  Documentation: amd_pstate: Move amd_pstate param to alphabetical order
  cpufreq: amd_pstate: Add guided mode control support via sysfs
  Documentation: amd_pstate: Update amd_pstate status sysfs for guided

 .../admin-guide/kernel-parameters.txt         |  41 +++--
 Documentation/admin-guide/pm/amd-pstate.rst   |  29 ++-
 drivers/acpi/cppc_acpi.c                      | 113 +++++++++++-
 drivers/cpufreq/amd-pstate.c                  | 173 ++++++++++++++----
 include/acpi/cppc_acpi.h                      |  11 ++
 include/linux/amd-pstate.h                    |   2 +
 6 files changed, 297 insertions(+), 72 deletions(-)

-- 
2.34.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ