Message-ID: <87lgsh3hbn.fsf@yhuang-dev.intel.com>
Date: Tue, 07 Mar 2017 11:18:36 +0800
From: kernel test robot <ying.huang@...ux.intel.com>
TO: Peter Zijlstra <peterz@...radead.org>
CC: Ingo Molnar <mingo@...nel.org>, kitsunyan <kitsunyan@...ox.ru>,
    Chris Mason <clm@...com>,
    Linus Torvalds <torvalds@...ux-foundation.org>,
    Mike Galbraith <efault@....de>,
    Mike Galbraith <umgwanakikbuti@...il.com>,
    Thomas Gleixner <tglx@...utronix.de>,
    LKML <linux-kernel@...r.kernel.org>,
    Stephen Rothwell <sfr@...b.auug.org.au>, lkp@...org
Subject: [lkp-robot] [sched/fair] 4c77b18cf8: hackbench.throughput -14.4% regression
Greetings,
FYI, we noticed a -14.4% regression of hackbench.throughput due to commit:
commit: 4c77b18cf8b7ab37c7d5737b4609010d2ceec5f0 ("sched/fair: Make select_idle_cpu() more aggressive")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: hackbench
on test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 64G memory
with the following parameters:
        nr_threads: 50%
        mode: process
        ipc: pipe
        cpufreq_governor: performance
test-description: Hackbench is both a benchmark and a stress test for the Linux kernel scheduler.
test-url: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sched/cfs-scheduler/hackbench.c
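For readers unfamiliar with percentage-style thread counts, here is a minimal sketch of how `nr_threads: 50%` might translate into an absolute number on the 48-thread test machine. It assumes the percentage is taken relative to the machine's hardware thread count, which this report does not state. The helper name `resolve_nr_threads` is hypothetical and is not part of lkp-tests.

```python
def resolve_nr_threads(spec: str, nr_cpu: int) -> int:
    """Turn an nr_threads value such as '50%' into an absolute count.

    Assumption (not stated in the report): a trailing '%' means
    "percent of the machine's hardware threads"; a bare integer is
    taken literally.
    """
    spec = spec.strip()
    if spec.endswith("%"):
        # Integer arithmetic; never resolve to fewer than one thread.
        return max(1, nr_cpu * int(spec[:-1]) // 100)
    return int(spec)


# The hackbench run in this report: 50% on a 48-thread machine.
print(resolve_nr_threads("50%", 48))
```

Under that assumption the run would use a count of 24. How hackbench then maps that count to groups or tasks is not covered here.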
The commit also has a significant impact on the following tests:
+------------------+-----------------------------------------------------------------------+
| testcase: change | netperf: netperf.Throughput_tps -33.8% regression                     |
| test machine     | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory |
| test parameters  | cluster=cs-localhost                                                  |
|                  | cpufreq_governor=performance                                          |
|                  | ip=ipv4                                                               |
|                  | nr_threads=200%                                                       |
|                  | runtime=300s                                                          |
|                  | test=SCTP_RR                                                          |
+------------------+-----------------------------------------------------------------------+
| testcase: change | netperf: netperf.Throughput_tps -50.8% regression                     |
| test machine     | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory |
| test parameters  | cluster=cs-localhost                                                  |
|                  | cpufreq_governor=performance                                          |
|                  | ip=ipv4                                                               |
|                  | nr_threads=200%                                                       |
|                  | runtime=300s                                                          |
|                  | test=TCP_RR                                                           |
+------------------+-----------------------------------------------------------------------+
| testcase: change | netperf: netperf.Throughput_Mbps -8.7% regression                     |
| test machine     | 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory       |
| test parameters  | cluster=cs-localhost                                                  |
|                  | cpufreq_governor=performance                                          |
|                  | ip=ipv4                                                               |
|                  | nr_threads=200%                                                       |
|                  | runtime=300s                                                          |
|                  | send_size=10K                                                         |
|                  | test=SCTP_STREAM_MANY                                                 |
+------------------+-----------------------------------------------------------------------+
| testcase: change | hackbench: hackbench.throughput 12.1% improvement                     |
| test machine     | 8 threads Ivy Bridge with 16G memory                                  |
| test parameters  | cpufreq_governor=performance                                          |
|                  | ipc=pipe                                                              |
|                  | mode=process                                                          |
|                  | nr_threads=50%                                                        |
+------------------+-----------------------------------------------------------------------+
| testcase: change | netperf: netperf.Throughput_Mbps -2.5% regression                     |
| test machine     | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory |
| test parameters  | cluster=cs-localhost                                                  |
|                  | cpufreq_governor=performance                                          |
|                  | ip=ipv4                                                               |
|                  | nr_threads=200%                                                       |
|                  | runtime=300s                                                          |
|                  | send_size=10K                                                         |
|                  | test=SCTP_STREAM_MANY                                                 |
+------------------+-----------------------------------------------------------------------+
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
        git clone https://github.com/01org/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml
testcase/path_params/tbox_group/run: hackbench/50%-process-pipe-performance/ivb42
4977ab6e92e267af      4c77b18cf8b7ab37c7d5737b46
----------------      --------------------------
   179106        -14%    153395        hackbench.throughput
5.036e+08         21% 6.113e+08        hackbench.time.involuntary_context_switches
     4523          3%      4675        hackbench.time.percent_of_cpu_this_job_got
    27089          3%     27956        hackbench.time.system_time
     1394        -10%      1252        hackbench.time.user_time
2.501e+09        -11% 2.223e+09        hackbench.time.voluntary_context_switches
   779669        -14%    667478        hackbench.time.minor_page_faults
   319399          3%    329894        interrupts.CAL:Function_call_interrupts
   884938        -22%    692644        vmstat.system.in
  5224554         -9%   4736985        vmstat.system.cs
     2880                  2955        turbostat.Avg_MHz
    96.25                 98.77        turbostat.%Busy
     6.59        -14%      5.63        turbostat.RAMWatt
2.009e+08         98% 3.986e+08        perf-stat.cpu-migrations
     0.67          8%      0.73        perf-stat.branch-miss-rate%
5.046e+11         13% 5.722e+11        perf-stat.cache-references
5.897e+10          5%  6.22e+10        perf-stat.branch-misses
     3851         16%      4471        perf-stat.instructions-per-iTLB-miss
    38.80        -11%     34.53        perf-stat.node-store-miss-rate%
8.697e+13             8.833e+13        perf-stat.cpu-cycles
  1928944         -8%   1777815        perf-stat.page-faults
  1928944         -8%   1777789        perf-stat.minor-faults
1.332e+10  ± 3%  -18% 1.098e+10 ± 16%  perf-stat.dTLB-store-misses
     1.87  ± 4%  -20%      1.50 ± 19%  perf-stat.dTLB-load-miss-rate%
2.654e+11  ± 4%  -25% 1.988e+11 ± 20%  perf-stat.dTLB-load-misses
     0.53         -6%      0.50        perf-stat.ipc
8.738e+12             8.565e+12        perf-stat.branch-instructions
3.299e+09        -10% 2.968e+09        perf-stat.context-switches
4.586e+13         -4% 4.398e+13        perf-stat.instructions
    64.05         31%     84.13        perf-stat.iTLB-load-miss-rate%
1.392e+13         -6% 1.306e+13        perf-stat.dTLB-loads
8.613e+12        -10% 7.773e+12        perf-stat.dTLB-stores
1.135e+10  ± 4%  -45% 6.254e+09  ± 4%  perf-stat.node-loads
1.878e+10  ± 3%  -46% 1.016e+10  ± 4%  perf-stat.cache-misses
  1.1e+10  ± 4%  -46% 5.949e+09  ± 4%  perf-stat.node-load-misses
1.191e+10        -17% 9.836e+09        perf-stat.iTLB-load-misses
7.431e+09  ± 4%  -48% 3.875e+09  ± 4%  perf-stat.node-stores
     3.72  ± 4%  -52%      1.78  ± 4%  perf-stat.cache-miss-rate%
4.711e+09  ± 4%  -57% 2.044e+09  ± 3%  perf-stat.node-store-misses
6.682e+09        -72% 1.856e+09  ± 3%  perf-stat.iTLB-loads
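As a sanity check on the comparison above, the following sketch recomputes the headline change and two derived metrics from the raw counters. The formulas used (ipc as instructions divided by cpu-cycles, cache-miss-rate as cache-misses divided by cache-references) are the usual definitions and are an assumption here, since the report does not spell them out. The function and variable names are illustrative only.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (value at 4977ab6e92e267af, value at 4c77b18cf8b7ab37c7d5737b46),
# copied from the comparison table in this report.
throughput = (179106, 153395)
cpu_migrations = (2.009e8, 3.986e8)
instructions = (4.586e13, 4.398e13)
cpu_cycles = (8.697e13, 8.833e13)
cache_misses = (1.878e10, 1.016e10)
cache_references = (5.046e11, 5.722e11)

print(f"hackbench.throughput:     {pct_change(*throughput):+.1f}%")
print(f"perf-stat.cpu-migrations: {pct_change(*cpu_migrations):+.0f}%")

for label, i in (("4977ab6e92e267af", 0), ("4c77b18cf8b7ab37", 1)):
    # Assumed definitions; see the note above.
    ipc = instructions[i] / cpu_cycles[i]
    miss_rate = cache_misses[i] / cache_references[i] * 100.0
    print(f"{label}: ipc={ipc:.2f} cache-miss-rate={miss_rate:.2f}%")
```

With these inputs the throughput change comes out at -14.4% and the cpu-migrations change at +98%, and the recomputed ipc (0.53, 0.50) and cache-miss-rate (3.72%, 1.78%) agree with the reported rows. Throughput drops while the cache-miss rate also drops, so the regression does not appear to be explained by cache misses alone; the near doubling of cpu-migrations is the most prominent change in the table.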
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Ying Huang
View attachment "config-4.10.0-11074-g4c77b18" of type "text/plain" (157299 bytes)
View attachment "job-script" of type "text/plain" (6579 bytes)
View attachment "job.yaml" of type "text/plain" (4205 bytes)
View attachment "reproduce" of type "text/plain" (970 bytes)