Message-ID: <1608307905.26567.46.camel@suse.com>
Date: Fri, 18 Dec 2020 17:11:45 +0100
From: Giovanni Gherdovich <ggherdovich@...e.com>
To: "Rafael J. Wysocki" <rjw@...ysocki.net>,
Linux PM <linux-pm@...r.kernel.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Viresh Kumar <viresh.kumar@...aro.org>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>,
Doug Smythies <dsmythies@...us.net>
Subject: Re: [PATCH v2 0/3] cpufreq: Allow drivers to receive more
information from the governor
On Mon, 2020-12-14 at 21:01 +0100, Rafael J. Wysocki wrote:
> Hi,
>
> The timing of this is not perfect (sorry about that), but here's a refresh
> of this series.
>
> The majority of the previous cover letter still applies:
> [...]
Hello,
the series is tested using
-> tbench (packet processing with loopback networking, measures throughput)
-> dbench (filesystem operations, measures average latency)
-> kernbench (kernel compilation, elapsed time)
-> and gitsource (long-running shell script, elapsed time)
These are chosen because none of them is bound by compute and all are
sensitive to freq scaling decisions. The machines are a Cascade Lake based
server, a client Skylake and a Coffee Lake laptop.
What's being compared:
sugov-HWP.desired : the present series; intel_pstate=passive, governor=schedutil
sugov-HWP.min : mainline; intel_pstate=passive, governor=schedutil
powersave-HWP : mainline; intel_pstate=active, governor=powersave
perfgov-HWP : mainline; intel_pstate=active, governor=performance
sugov-no-HWP : HWP disabled; intel_pstate=passive, governor=schedutil
Dbench and Kernbench have neutral results, but Tbench has sugov-HWP.desired
lose in both performance and performance-per-watt, while Gitsource shows the
series as faster in raw performance but again worse than the competition in
efficiency.
1. SUMMARY BY BENCHMARK
1.1. TBENCH
1.2. DBENCH
1.3. KERNBENCH
1.4. GITSOURCE
2. SUMMARY BY USER PROFILE
2.1. PERFORMANCE USER: what if I switch perfgov -> schedutil?
2.2. DEFAULT USER: what if I switch powersave -> schedutil?
2.3. DEVELOPER: what if I switch sugov-HWP.min -> sugov-HWP.desired?
3. RESULTS TABLES
PERFORMANCE RATIOS
PERFORMANCE-PER-WATT RATIOS
1. SUMMARY BY BENCHMARK
~~~~~~~~~~~~~~~~~~~~~~~
Tbench: sugov-HWP.desired has the worst performance on all three
machines. sugov-HWP.min is between 20% and 90% better. The baseline
sugov-HWP.desired offers a lower throughput, but does it increase
efficiency? It actually doesn't: on two out of three machines the
incumbent code (current sugov, or intel_pstate=active) has 10% to 35%
better efficiency. In other words, the status quo is both faster and more
efficient than the proposed series on this benchmark.
The absolute power consumption is lower, but the delivered performance is
"even more lower", and that's why performance-per-watt shows a net loss.
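To make that arithmetic concrete: assuming performance-per-watt is simply
throughput divided by average power (an assumption; the measurement method is
not spelled out in this message), the relative power draw can be recovered from
the two tbench ratios in the tables of section 3. A minimal Python sketch, with
a hypothetical helper name:

```python
def power_ratio(perf_ratio, ppw_ratio):
    """Relative power draw vs. baseline, assuming ppw = perf / power.

    Only meaningful for higher-is-better metrics such as tbench throughput.
    """
    return perf_ratio / ppw_ratio


# tbench ratios of sugov-HWP.min vs. sugov-HWP.desired (baseline = 1.00),
# copied from the results tables: (performance ratio, perf-per-watt ratio).
tbench = {
    "80x_CASCADELAKE_NUMA": (1.89, 1.36),
    "8x_SKYLAKE_UMA": (1.21, 1.11),
}

for machine, (perf, ppw) in tbench.items():
    ratio = power_ratio(perf, ppw)
    print(f"{machine}: sugov-HWP.min draws {ratio:.2f}x the power")
```

Under that assumption, sugov-HWP.min on the Cascade Lake draws roughly 1.39x
the power of sugov-HWP.desired while delivering 1.89x the throughput.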
Dbench: generally neutral, in both performance and efficiency. Powersave is
occasionally behind the pack in performance, 5% to 15%. Its 15% performance
loss on the Coffee Lake is compensated by an 80% improvement in efficiency.
Note that on the same Coffee Lake sugov-no-HWP is 20% ahead of the pack
in efficiency.
Kernbench: neutral, in both performance and efficiency. powersave loses 14%
to the pack in performance on the Cascade Lake.
Gitsource: this test shows the most compelling case against the
sugov-HWP.desired series: on the Cascade Lake sugov-HWP.desired is 10%
faster than sugov-HWP.min (it was expected to be slower!) and 35% less
efficient (we expected more performance-per-watt, not less).
2. SUMMARY BY USER PROFILE
~~~~~~~~~~~~~~~~~~~~~~~~~~
If I were a perfgov-HWP user, I would be 20%-90% faster than with other governors
on tbench and gitsource. This speed gap comes with an unexpected efficiency
bonus on both tests. Since dbench and kernbench have a flat profile across the
board, there is no incentive to try another governor.
If I were a powersave-HWP user, I'd be the slowest of the bunch. The lost
performance is not, in general, balanced by better efficiency. This only
happens on Coffee Lake, which is a CPU for the mobile market and possibly HWP
has efficiency-oriented tuning there. Any flavor of schedutil would be an
improvement.
From a developer perspective, the obstacles to moving from HWP.min to
HWP.desired are tbench, where HWP.desired is worse than having no HWP support
at all, and gitsource, where HWP.desired has the opposite properties to
those advertised (it's actually faster but less efficient).
3. RESULTS TABLES
~~~~~~~~~~~~~~~~~
Tilde (~) means the result is the same as baseline (or, the ratio is close to 1).
The double asterisk (**) is a visual aid and means the result is better than
baseline (higher or lower depending on the case).
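The notation can be sketched as a small Python function. The `tolerance`
cut-off for "close to 1" is hypothetical: this message does not say which
criterion was used for the tilde, and the tables mark ratios as small as 1.02,
so the real rule may well be a significance test rather than a fixed threshold.

```python
def annotate(ratio, better_if, tolerance=0.01):
    """Render a ratio against baseline using the '~' and '**' markers.

    tolerance is an assumed cut-off for "close to 1", not the criterion
    actually used to produce the tables.
    """
    if better_if not in ("higher", "lower"):
        raise ValueError("better_if must be 'higher' or 'lower'")
    if abs(ratio - 1.0) <= tolerance:
        return "~"
    # A ratio beats the baseline when it moves in the "better if" direction.
    better = ratio > 1.0 if better_if == "higher" else ratio < 1.0
    return f"{ratio:.2f}**" if better else f"{ratio:.2f}"


print(annotate(1.89, "higher"))  # tbench throughput: higher is better
print(annotate(0.80, "lower"))   # gitsource elapsed time: lower is better
print(annotate(2.70, "lower"))   # slower than baseline, so no marker
```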
| 80x_CASCADELAKE_NUMA: Intel Cascade Lake, 40 cores / 80 threads, NUMA, SATA SSD storage
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|            sugov-HWP.des  sugov-HWP.min  powersave-HWP  perfgov-HWP  sugov-no-HWP  better if
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| PERFORMANCE RATIOS
| tbench     1.00           1.89**         1.88**         1.89**       1.17**        higher
| dbench     1.00           ~              1.06           ~            ~             lower
| kernbench  1.00           ~              1.14           ~            ~             lower
| gitsource  1.00           1.11           2.70           0.80**       ~             lower
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| PERFORMANCE-PER-WATT RATIOS
| tbench     1.00           1.36**         1.38**         1.33**       1.04**        higher
| dbench     1.00           ~              ~              ~            ~             higher
| kernbench  1.00           ~              ~              ~            ~             higher
| gitsource  1.00           1.36**         0.63           1.22**       1.02**        higher
| 8x_COFFEELAKE_UMA: Intel Coffee Lake, 4 cores / 8 threads, UMA, NVMe SSD storage
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|            sugov-HWP.des  sugov-HWP.min  powersave-HWP  perfgov-HWP  sugov-no-HWP  better if
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| PERFORMANCE RATIOS
| tbench     1.00           1.27**         1.30**         1.30**       1.31**        higher
| dbench     1.00           ~              1.15           ~            ~             lower
| kernbench  1.00           ~              ~              ~            ~             lower
| gitsource  1.00           ~              2.09           ~            ~             lower
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| PERFORMANCE-PER-WATT RATIOS
| tbench     1.00           ~              ~              ~            ~             higher
| dbench     1.00           ~              1.82**         ~            1.22**        higher
| kernbench  1.00           ~              ~              ~            ~             higher
| gitsource  1.00           ~              1.56**         ~            1.17**        higher
| 8x_SKYLAKE_UMA: Intel Skylake (client), 4 cores / 8 threads, UMA, SATA SSD storage
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|            sugov-HWP.des  sugov-HWP.min  powersave-HWP  perfgov-HWP  sugov-no-HWP  better if
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| PERFORMANCE RATIOS
| tbench     1.00           1.21**         1.22**         1.20**       1.06**        higher
| dbench     1.00           ~              ~              ~            ~             lower
| kernbench  1.00           ~              ~              ~            ~             lower
| gitsource  1.00           ~              1.71           0.96**       ~             lower
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| PERFORMANCE-PER-WATT RATIOS
| tbench     1.00           1.11**         1.12**         1.10**       1.03**        higher
| dbench     1.00           ~              ~              ~            ~             higher
| kernbench  1.00           ~              ~              ~            ~             higher
| gitsource  1.00           ~              0.75           ~            ~             higher
Giovanni