Message-ID: <20230113052141.2874296-1-wyes.karny@amd.com>
Date: Fri, 13 Jan 2023 05:21:35 +0000
From: Wyes Karny <wyes.karny@....com>
To: Rafael J Wysocki <rafael@...nel.org>,
Huang Rui <ray.huang@....com>,
Jonathan Corbet <corbet@....net>,
Viresh Kumar <viresh.kumar@...aro.org>,
<Mario.Limonciello@....com>, <Perry.Yuan@....com>,
Ananth Narayan <ananth.narayan@....com>,
<gautham.shenoy@....com>
CC: <linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linux-pm@...r.kernel.org>, Bagas Sanjaya <bagasdotme@...il.com>,
<santosh.shukla@....com>, Wyes Karny <wyes.karny@....com>
Subject: [PATCH v2 0/6] amd_pstate: Add guided autonomous mode support
From the ACPI spec [1], the following three modes can be defined for CPPC:
1. Non autonomous: The OS scaling governor specifies the operating
frequency/performance level through the `Desired Performance` register,
and the platform follows it.
2. Guided autonomous: The OS scaling governor specifies the min and max
frequencies/performance levels through the `Minimum Performance` and
`Maximum Performance` registers, and the platform autonomously selects an
operating frequency within this range.
3. Fully autonomous: The OS only provides a hint (via EPP) to the platform
about the energy performance preference required for the workload, and
the platform autonomously scales the frequency.
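
As a rough illustration of the three modes above, the following C sketch maps each mode to the control the OS primarily uses in it. All identifiers here are made up for this example and are not taken from the kernel or from the patches in this series.

```c
#include <assert.h>

/* Hypothetical names for illustration only; not kernel identifiers. */
enum cppc_mode {
	CPPC_NON_AUTONOMOUS,
	CPPC_GUIDED_AUTONOMOUS,
	CPPC_FULLY_AUTONOMOUS,
};

/* The control through which the OS primarily steers the platform. */
enum cppc_control {
	CTRL_DESIRED_PERF,	/* `Desired Performance` register */
	CTRL_MIN_MAX_PERF,	/* `Minimum/Maximum Performance` registers */
	CTRL_EPP_HINT,		/* energy performance preference hint */
};

static enum cppc_control primary_control_for(enum cppc_mode mode)
{
	switch (mode) {
	case CPPC_NON_AUTONOMOUS:
		return CTRL_DESIRED_PERF;
	case CPPC_GUIDED_AUTONOMOUS:
		return CTRL_MIN_MAX_PERF;
	case CPPC_FULLY_AUTONOMOUS:
		return CTRL_EPP_HINT;
	}
	assert(0 && "unknown CPPC mode");
	return CTRL_DESIRED_PERF;
}
```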
Currently, (1) is supported by amd_pstate as passive mode, and (3) is
implemented by the EPP support series [2]. This series adds support for (2).
In guided autonomous mode, min_perf is based on the input from the
scaling governor. For example, with schedutil this value depends on the
current utilization. max_perf is set to the maximum capacity.
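
A minimal sketch of that rule in C, assuming the governor's request is clamped to the range the platform supports (the clamping, the struct, and all names below are assumptions made for illustration, not the driver's actual code):

```c
#include <assert.h>

/* Hypothetical request type; field names are illustrative only. */
struct perf_request {
	unsigned int min_perf;
	unsigned int max_perf;
};

/*
 * Build a guided-mode request: the governor's value becomes the floor,
 * and the ceiling is the highest performance level.
 * Assumption: governor_perf is clamped to [lowest_perf, highest_perf].
 */
static struct perf_request guided_request(unsigned int governor_perf,
					  unsigned int lowest_perf,
					  unsigned int highest_perf)
{
	struct perf_request req;

	assert(lowest_perf <= highest_perf);

	if (governor_perf < lowest_perf)
		governor_perf = lowest_perf;
	if (governor_perf > highest_perf)
		governor_perf = highest_perf;

	req.min_perf = governor_perf;
	req.max_perf = highest_perf;
	return req;
}
```

With this shape, the platform is free to pick any level between the governor's request and the maximum, which matches the description of guided autonomous mode above.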
To activate guided autonomous mode, the ``amd_pstate=guided`` parameter
has to be passed on the kernel command line.
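
One way to confirm from user space that the parameter was actually passed is to look for it as a whole token in the kernel command line. The helper below is a small sketch of that check; the function name is made up for this example, and it only inspects a string that the caller supplies.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `token` appears in `cmdline` as a complete,
 * space-delimited parameter (not as a substring of another one).
 */
static bool cmdline_has_token(const char *cmdline, const char *token)
{
	size_t len = strlen(token);
	const char *p = cmdline;

	assert(len > 0);

	while ((p = strstr(p, token)) != NULL) {
		bool starts = (p == cmdline) || (p[-1] == ' ');
		bool ends = (p[len] == '\0') || (p[len] == ' ') ||
			    (p[len] == '\n');

		if (starts && ends)
			return true;
		p++;	/* keep searching after this partial match */
	}
	return false;
}
```

On a running system the string would typically be the contents of /proc/cmdline, and the token would be "amd_pstate=guided".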
Below are the results (normalized) of benchmarks with this patch:
System: Genoa 96C 192T
Kernel: v6.1-rc6 + patch
Scaling governor: schedutil
================ tbench ================
tbench result comparison: (higher the better)
Here results are throughput (MB/s)
Clients acpi-cpufreq        amd_pst+passive       amd_pst+guided
1       1.00 (0.00 pct)     1.16 (16.00 pct)      2.20 (120.00 pct)
2       1.97 (0.00 pct)     2.29 (16.24 pct)      4.38 (122.33 pct)
4       3.95 (0.00 pct)     4.51 (14.17 pct)      8.50 (115.18 pct)
8       7.83 (0.00 pct)     8.89 (13.53 pct)      16.62 (112.26 pct)
16      15.28 (0.00 pct)    16.81 (10.01 pct)     31.02 (103.01 pct)
32      41.64 (0.00 pct)    30.67 (-26.34 pct)    55.63 (33.59 pct)
64      91.29 (0.00 pct)    79.67 (-12.72 pct)    91.74 (0.49 pct)
128     118.06 (0.00 pct)   122.34 (3.62 pct)     122.04 (3.37 pct)
256     260.47 (0.00 pct)   264.31 (1.47 pct)     264.49 (1.54 pct)
512     254.16 (0.00 pct)   245.25 (-3.50 pct)    245.50 (-3.40 pct)
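
The percentages in parentheses appear to be the relative change against the acpi-cpufreq column of the same row. The C sketch below shows that arithmetic under this assumption; the function name is illustrative, and last-digit differences from the tables are possible if the published figures were computed from unrounded data.

```c
#include <assert.h>

/*
 * Relative change of `value` against `baseline`, in percent.
 * Assumption: this is how the "pct" figures in the tables were derived.
 */
static double pct_change(double baseline, double value)
{
	assert(baseline != 0.0);
	return (value - baseline) / baseline * 100.0;
}
```

For the 2-client tbench row, pct_change(1.97, 2.29) gives roughly 16.24, in line with the amd_pst+passive entry.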
tbench power comparison: (lower the better)
Clients acpi-cpufreq        amd_pst+passive       amd_pst+guided
1       1.00 (0.00 pct)     1.00 (0.00 pct)       1.15 (15.00 pct)
2       0.99 (0.00 pct)     1.00 (1.01 pct)       1.17 (18.18 pct)
4       1.01 (0.00 pct)     1.02 (0.99 pct)       1.24 (22.77 pct)
8       1.05 (0.00 pct)     1.06 (0.95 pct)       1.36 (29.52 pct)
16      1.15 (0.00 pct)     1.13 (-1.73 pct)      1.58 (37.39 pct)
32      1.71 (0.00 pct)     1.30 (-23.97 pct)     1.96 (14.61 pct)
64      2.35 (0.00 pct)     2.15 (-8.51 pct)      2.36 (0.42 pct)
128     2.77 (0.00 pct)     2.77 (0.00 pct)       2.78 (0.36 pct)
256     3.39 (0.00 pct)     3.41 (0.58 pct)       3.43 (1.17 pct)
512     3.42 (0.00 pct)     3.40 (-0.58 pct)      3.41 (-0.29 pct)
================ dbench ================
dbench result comparison: (higher the better)
Here results are throughput (MB/s)
Clients acpi-cpufreq        amd_pst+passive       amd_pst+guided
1       1.00 (0.00 pct)     0.96 (-4.00 pct)      1.02 (2.00 pct)
2       1.89 (0.00 pct)     1.90 (0.52 pct)       1.91 (1.05 pct)
4       3.39 (0.00 pct)     3.31 (-2.35 pct)      3.38 (-0.29 pct)
8       5.56 (0.00 pct)     5.46 (-1.79 pct)      5.60 (0.71 pct)
16      7.25 (0.00 pct)     7.90 (8.96 pct)       8.29 (14.34 pct)
32      10.85 (0.00 pct)    10.00 (-7.83 pct)     10.40 (-4.14 pct)
64      12.30 (0.00 pct)    11.94 (-2.92 pct)     11.82 (-3.90 pct)
128     12.56 (0.00 pct)    12.30 (-2.07 pct)     12.98 (3.34 pct)
256     6.55 (0.00 pct)     6.54 (-0.15 pct)      7.38 (12.67 pct)
512     1.61 (0.00 pct)     1.58 (-1.86 pct)      1.95 (21.11 pct)
dbench power comparison: (lower the better)
Clients acpi-cpufreq        amd_pst+passive       amd_pst+guided
1       1.00 (0.00 pct)     1.01 (1.00 pct)       1.05 (5.00 pct)
2       1.07 (0.00 pct)     1.07 (0.00 pct)       1.09 (1.86 pct)
4       1.15 (0.00 pct)     1.15 (0.00 pct)       1.16 (0.86 pct)
8       1.26 (0.00 pct)     1.26 (0.00 pct)       1.27 (0.79 pct)
16      1.39 (0.00 pct)     1.41 (1.43 pct)       1.43 (2.87 pct)
32      1.60 (0.00 pct)     1.56 (-2.50 pct)      1.59 (-0.62 pct)
64      1.75 (0.00 pct)     1.75 (0.00 pct)       1.74 (-0.57 pct)
128     1.90 (0.00 pct)     1.91 (0.52 pct)       1.93 (1.57 pct)
256     1.76 (0.00 pct)     1.77 (0.56 pct)       1.85 (5.11 pct)
512     1.55 (0.00 pct)     1.49 (-3.87 pct)      1.73 (11.61 pct)
================ git-source ================
git-source result comparison: (higher the better)
Here results are throughput (compilations per 1000 sec)
Threads acpi-cpufreq        amd_pst+passive       amd_pst+guided
192     1.00 (0.00 pct)     0.94 (-5.70 pct)      1.00 (0.00 pct)
git-source power comparison: (lower the better)
Threads acpi-cpufreq        amd_pst+passive       amd_pst+guided
192     1.00 (0.00 pct)     1.03 (3.00 pct)       1.02 (2.00 pct)
================ kernbench ================
kernbench result comparison: (higher the better)
Here results are throughput (compilations per 1000 sec)
Load    acpi-cpufreq        amd_pst+passive       amd_pst+guided
32      1.00 (0.00 pct)     0.94 (-6.00 pct)      1.02 (2.00 pct)
48      1.24 (0.00 pct)     1.16 (-6.45 pct)      1.24 (0.00 pct)
64      1.35 (0.00 pct)     1.30 (-3.70 pct)      1.39 (2.96 pct)
96      1.42 (0.00 pct)     1.28 (-9.85 pct)      1.48 (4.22 pct)
128     1.39 (0.00 pct)     1.29 (-7.19 pct)      1.41 (1.43 pct)
192     1.32 (0.00 pct)     1.18 (-10.60 pct)     1.32 (0.00 pct)
256     1.28 (0.00 pct)     1.14 (-10.93 pct)     1.29 (0.78 pct)
384     1.28 (0.00 pct)     1.13 (-11.71 pct)     1.27 (-0.78 pct)
kernbench power comparison: (lower the better)
Load    acpi-cpufreq        amd_pst+passive       amd_pst+guided
32      1.00 (0.00 pct)     1.04 (4.00 pct)       0.95 (-5.00 pct)
48      0.83 (0.00 pct)     0.90 (8.43 pct)       0.82 (-1.20 pct)
64      0.80 (0.00 pct)     0.82 (2.50 pct)       0.75 (-6.25 pct)
96      0.77 (0.00 pct)     0.81 (5.19 pct)       0.75 (-2.59 pct)
128     0.78 (0.00 pct)     0.82 (5.12 pct)       0.75 (-3.84 pct)
192     0.84 (0.00 pct)     0.89 (5.95 pct)       0.83 (-1.19 pct)
256     0.84 (0.00 pct)     0.89 (5.95 pct)       0.84 (0.00 pct)
384     0.84 (0.00 pct)     0.90 (7.14 pct)       0.84 (0.00 pct)
Note: This series is based on top of the EPP v9 series [3].
Change log:
v1 -> v2:
- Fix issue with shared mem systems.
- Rebase on top of EPP series.
[1]: https://uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf
[2]: https://lore.kernel.org/lkml/20221110175847.3098728-1-Perry.Yuan@amd.com/
[3]: https://lore.kernel.org/linux-pm/20221225163442.2205660-1-perry.yuan@amd.com/
Wyes Karny (6):
  acpi: cppc: Add min and max perf reg writing support
  acpi: cppc: Add auto select register read/write support
  cpufreq: amd_pstate: Add guided autonomous mode
  Documentation: amd_pstate: Move amd_pstate param to alphabetical order
  cpufreq: amd_pstate: Add guided mode control support via sysfs
  Documentation: amd_pstate: Update amd_pstate status sysfs for guided
.../admin-guide/kernel-parameters.txt | 41 +++--
Documentation/admin-guide/pm/amd-pstate.rst | 29 ++-
drivers/acpi/cppc_acpi.c | 113 +++++++++++-
drivers/cpufreq/amd-pstate.c | 173 ++++++++++++++----
include/acpi/cppc_acpi.h | 11 ++
include/linux/amd-pstate.h | 2 +
6 files changed, 297 insertions(+), 72 deletions(-)
--
2.34.1