Date: Mon, 11 Nov 2019 17:57:39 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
Cc: 0day robot <lkp@...el.com>, LKML <linux-kernel@...r.kernel.org>,
lkp@...ts.01.org
Subject: [cpuidle] 331b89c842: unixbench.score 5.6% improvement
Greetings,
FYI, we noticed a 5.6% improvement in unixbench.score due to commit:
commit: 331b89c8420810efbf395f3bba69b075b56a4c11 ("cpuidle: Use nanoseconds as the unit of time")
https://github.com/0day-ci/linux UPDATE-20191109-152231/Rafael-J-Wysocki/cpuidle-Use-nanoseconds-as-the-unit-of-time/20191109-002344
in testcase: unixbench
on test machine: 192 threads Intel(R) Xeon(R) CPU E7-8890 v4 @ 2.20GHz with 256G memory
with the following parameters:
runtime: 300s
nr_task: 1
test: shell8
cpufreq_governor: performance
ucode: 0xb000038
test-description: UnixBench is the original BYTE UNIX benchmark suite; it aims to test the performance of Unix-like systems.
test-url: https://github.com/kdlucas/byte-unixbench
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.6/1/debian-x86_64-2019-09-23.cgz/300s/lkp-bdw-ex2/shell8/unixbench/0xb000038
commit:
737d3c9826 ("Merge branch 'acpi-mm' into linux-next")
331b89c842 ("cpuidle: Use nanoseconds as the unit of time")
737d3c982616413a 331b89c8420810efbf395f3bba6
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
1:4 -25% :4 dmesg.WARNING:at#for_ip_swapgs_restore_regs_and_return_to_usermode/0x
1:4 -25% :4 dmesg.WARNING:stack_recursion
0:4 9% 1:4 perf-profile.children.cycles-pp.error_entry
0:4 9% 1:4 perf-profile.self.cycles-pp.error_entry
%stddev %change %stddev
\ | \
7937 +5.6% 8378 unixbench.score
3748 ± 7% +14.5% 4292 ± 2% unixbench.time.involuntary_context_switches
35796981 +5.6% 37795787 unixbench.time.minor_page_faults
187.02 +1.8% 190.34 unixbench.time.user_time
1073178 +7.2% 1150878 unixbench.time.voluntary_context_switches
300020 +5.6% 316708 unixbench.workload
42454 +7.9% 45813 vmstat.system.cs
7042224 ±171% +214.7% 22164286 ± 57% numa-numastat.node3.local_node
7065516 ±170% +214.0% 22187613 ± 57% numa-numastat.node3.numa_hit
859.50 ± 8% +29.2% 1110 ± 7% slabinfo.skbuff_fclone_cache.active_objs
859.50 ± 8% +29.2% 1110 ± 7% slabinfo.skbuff_fclone_cache.num_objs
247.29 ± 2% +7.3% 265.24 ± 4% sched_debug.cfs_rq:/.util_avg.stddev
1.13 ± 92% +512.9% 6.93 ± 32% sched_debug.cpu.ttwu_count.avg
79.25 ±114% +745.1% 669.75 ± 36% sched_debug.cpu.ttwu_count.max
7.61 ±103% +747.2% 64.48 ± 37% sched_debug.cpu.ttwu_count.stddev
6453045 ±116% +98465.0% 6.36e+09 ± 17% cpuidle.C1.time
499294 ±134% +2931.8% 15137735 ± 2% cpuidle.C1.usage
4.141e+08 ± 84% +917.5% 4.213e+09 ± 14% cpuidle.C1E.time
8.01e+09 ± 38% -93.9% 4.866e+08 ± 2% cpuidle.C6.time
9305687 ± 34% -94.5% 508125 cpuidle.C6.usage
64076299 ±166% +463.1% 3.608e+08 ± 14% cpuidle.POLL.time
175398 ±152% +467.5% 995322 ± 11% cpuidle.POLL.usage
1738 ± 4% -6.7% 1622 ± 2% proc-vmstat.nr_page_table_pages
28099906 +5.4% 29615389 proc-vmstat.numa_hit
28006746 +5.4% 29522237 proc-vmstat.numa_local
29724990 +5.5% 31345142 proc-vmstat.pgalloc_normal
36048821 +5.6% 38050470 proc-vmstat.pgfault
29626806 +5.5% 31246186 proc-vmstat.pgfree
533606 +5.6% 563332 proc-vmstat.unevictable_pgs_culled
8807 ±112% +176.9% 24384 ± 50% numa-vmstat.node0.nr_active_anon
8743 ±113% +177.9% 24299 ± 50% numa-vmstat.node0.nr_anon_pages
8807 ±112% +176.9% 24384 ± 50% numa-vmstat.node0.nr_zone_active_anon
1661 ± 91% -88.6% 189.25 ± 95% numa-vmstat.node2.nr_inactive_anon
2192 ± 30% -30.0% 1534 ± 18% numa-vmstat.node2.nr_mapped
1773 ± 86% -88.8% 198.75 ± 88% numa-vmstat.node2.nr_shmem
7638 ± 22% -45.5% 4164 ± 16% numa-vmstat.node2.nr_slab_reclaimable
17941 ± 7% -31.0% 12382 ± 21% numa-vmstat.node2.nr_slab_unreclaimable
1661 ± 91% -88.6% 189.25 ± 95% numa-vmstat.node2.nr_zone_inactive_anon
22.00 ±104% +244.3% 75.75 ± 37% numa-vmstat.node3.nr_inactive_file
1717 ± 20% +54.7% 2656 ± 15% numa-vmstat.node3.nr_mapped
22.00 ±104% +244.3% 75.75 ± 37% numa-vmstat.node3.nr_zone_inactive_file
3504467 ±163% +208.4% 10806584 ± 56% numa-vmstat.node3.numa_local
154.75 ± 11% +26.3% 195.50 ± 5% turbostat.Avg_MHz
2094 ± 3% +24.0% 2596 turbostat.Bzy_MHz
497563 ±135% +2945.6% 15153724 ± 3% turbostat.C1
0.05 ±116% +50.7 50.73 ± 17% turbostat.C1%
3.30 ± 84% +30.3 33.62 ± 14% turbostat.C1E%
9262728 ± 34% -95.2% 444422 ± 2% turbostat.C6
63.79 ± 38% -60.0 3.78 ± 2% turbostat.C6%
30.71 ± 14% +191.4% 89.50 turbostat.CPU%c1
19.40 ± 97% -97.3% 0.53 ±120% turbostat.CPU%c3
42.51 ± 52% -94.3% 2.42 ± 3% turbostat.CPU%c6
47.25 ± 5% -99.1% 0.41 ± 30% turbostat.Pkg%pc2
291.09 ± 2% +32.0% 384.37 turbostat.PkgWatt
21.14 ± 8% +54.4% 32.64 turbostat.RAMWatt
35273 ±111% +176.5% 97535 ± 50% numa-meminfo.node0.Active
35227 ±112% +176.9% 97535 ± 50% numa-meminfo.node0.Active(anon)
34969 ±113% +178.0% 97200 ± 50% numa-meminfo.node0.AnonPages
107825 ± 8% +16.9% 126002 ± 7% numa-meminfo.node0.Slab
6890 ± 87% -87.6% 853.50 ± 76% numa-meminfo.node2.Inactive
6645 ± 91% -88.6% 757.75 ± 95% numa-meminfo.node2.Inactive(anon)
30550 ± 22% -45.5% 16660 ± 16% numa-meminfo.node2.KReclaimable
8670 ± 27% -28.8% 6169 ± 20% numa-meminfo.node2.Mapped
30550 ± 22% -45.5% 16660 ± 16% numa-meminfo.node2.SReclaimable
71835 ± 7% -30.9% 49656 ± 22% numa-meminfo.node2.SUnreclaim
7093 ± 86% -88.8% 795.00 ± 88% numa-meminfo.node2.Shmem
102385 ± 8% -35.2% 66317 ± 20% numa-meminfo.node2.Slab
6634 ± 17% +55.3% 10301 ± 14% numa-meminfo.node3.Mapped
571465 ± 15% +19.3% 681780 ± 15% numa-meminfo.node3.MemUsed
12.14 ± 29% -6.0 6.13 ± 15% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
12.14 ± 29% -6.0 6.13 ± 15% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.61 ± 4% -0.6 1.04 ± 29% perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
1.61 ± 4% -0.4 1.21 ± 18% perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
52.16 ± 27% +22.9 75.03 perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
52.16 ± 27% +22.9 75.03 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
52.16 ± 27% +22.9 75.03 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
51.30 ± 27% +22.9 74.24 perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
52.26 ± 27% +23.0 75.28 perf-profile.calltrace.cycles-pp.secondary_startup_64
51.00 ± 27% +23.1 74.11 perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
35.87 ± 46% -18.8 17.11 ± 12% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
35.73 ± 47% -18.6 17.11 ± 12% perf-profile.children.cycles-pp.do_syscall_64
9.93 ± 34% -5.6 4.32 ± 32% perf-profile.children.cycles-pp.__fput
10.03 ± 34% -5.5 4.57 ± 26% perf-profile.children.cycles-pp.task_work_run
10.14 ± 35% -5.4 4.74 ± 22% perf-profile.children.cycles-pp.exit_to_usermode_loop
52.16 ± 27% +22.9 75.03 perf-profile.children.cycles-pp.start_secondary
52.26 ± 27% +23.0 75.28 perf-profile.children.cycles-pp.secondary_startup_64
52.26 ± 27% +23.0 75.28 perf-profile.children.cycles-pp.cpu_startup_entry
52.26 ± 27% +23.0 75.28 perf-profile.children.cycles-pp.do_idle
51.40 ± 27% +23.1 74.48 perf-profile.children.cycles-pp.cpuidle_enter
51.40 ± 27% +23.1 74.48 perf-profile.children.cycles-pp.cpuidle_enter_state
88.00 ± 14% -17.3% 72.75 ± 8% interrupts.51:PCI-MSI.1572864-edge.eth0-TxRx-0
297179 +7.5% 319568 interrupts.CAL:Function_call_interrupts
88.00 ± 14% -17.3% 72.75 ± 8% interrupts.CPU0.51:PCI-MSI.1572864-edge.eth0-TxRx-0
5.25 ± 89% +1390.5% 78.25 ± 30% interrupts.CPU0.TLB:TLB_shootdowns
1535 +11.6% 1714 ± 4% interrupts.CPU16.CAL:Function_call_interrupts
1521 ± 3% +10.4% 1679 ± 2% interrupts.CPU164.CAL:Function_call_interrupts
176.00 ± 98% -60.5% 69.50 ±173% interrupts.CPU167.RES:Rescheduling_interrupts
79.00 ±173% +255.1% 280.50 ± 62% interrupts.CPU169.RES:Rescheduling_interrupts
1522 ± 2% +10.3% 1679 ± 2% interrupts.CPU177.CAL:Function_call_interrupts
1536 +9.8% 1687 ± 2% interrupts.CPU179.CAL:Function_call_interrupts
1542 +9.7% 1691 ± 2% interrupts.CPU180.CAL:Function_call_interrupts
1547 +9.3% 1691 ± 2% interrupts.CPU181.CAL:Function_call_interrupts
1554 +9.8% 1707 ± 4% interrupts.CPU23.CAL:Function_call_interrupts
4.75 ± 34% +1673.7% 84.25 ± 30% interrupts.CPU24.TLB:TLB_shootdowns
177.75 ± 93% -63.9% 64.25 ±170% interrupts.CPU48.RES:Rescheduling_interrupts
163.50 ± 96% -61.0% 63.75 ±168% interrupts.CPU49.RES:Rescheduling_interrupts
185.75 ± 99% -55.2% 83.25 ±173% interrupts.CPU70.RES:Rescheduling_interrupts
1487 ± 8% +15.3% 1715 ± 2% interrupts.CPU72.CAL:Function_call_interrupts
86.50 ±170% +257.2% 309.00 ± 62% interrupts.CPU73.RES:Rescheduling_interrupts
98.75 ±172% +210.4% 306.50 ± 54% interrupts.CPU74.RES:Rescheduling_interrupts
88.00 ±173% +206.5% 269.75 ± 59% interrupts.CPU76.RES:Rescheduling_interrupts
1545 +12.0% 1730 ± 5% interrupts.CPU83.CAL:Function_call_interrupts
129.50 ±131% +164.1% 342.00 ± 49% interrupts.CPU95.RES:Rescheduling_interrupts
64320 ± 13% +15.2% 74070 ± 4% interrupts.RES:Rescheduling_interrupts
241.00 ± 5% +489.6% 1421 ± 4% interrupts.TLB:TLB_shootdowns
24.14 ± 4% -45.7% 13.10 ± 2% perf-stat.i.MPKI
3.758e+09 ± 8% +34.5% 5.053e+09 ± 3% perf-stat.i.branch-instructions
2.97 ± 8% -1.2 1.76 ± 3% perf-stat.i.branch-miss-rate%
1.068e+08 ± 6% -18.7% 86810106 perf-stat.i.branch-misses
0.38 ± 5% +0.2 0.62 ± 11% perf-stat.i.cache-miss-rate%
4.194e+08 ± 2% -27.4% 3.047e+08 perf-stat.i.cache-references
44539 +6.9% 47630 perf-stat.i.context-switches
1.81 ± 4% -11.3% 1.61 ± 2% perf-stat.i.cpi
0.40 ± 4% -0.2 0.22 perf-stat.i.dTLB-load-miss-rate%
18597280 ± 4% -28.3% 13337508 ± 4% perf-stat.i.dTLB-load-misses
4.772e+09 ± 5% +27.9% 6.104e+09 ± 4% perf-stat.i.dTLB-loads
0.13 ± 2% -0.0 0.09 perf-stat.i.dTLB-store-miss-rate%
4559225 ± 3% -36.3% 2902725 ± 10% perf-stat.i.dTLB-store-misses
61.28 ± 4% -10.9 50.41 ± 2% perf-stat.i.iTLB-load-miss-rate%
7316157 ± 4% -12.0% 6441005 ± 5% perf-stat.i.iTLB-load-misses
4711889 ± 9% +31.0% 6173969 ± 3% perf-stat.i.iTLB-loads
1.841e+10 ± 5% +26.3% 2.325e+10 ± 2% perf-stat.i.instructions
2501 ± 7% +61.3% 4034 ± 17% perf-stat.i.instructions-per-iTLB-miss
558522 +5.0% 586595 perf-stat.i.minor-faults
558460 +5.0% 586582 perf-stat.i.page-faults
22.84 ± 4% -42.6% 13.11 ± 2% perf-stat.overall.MPKI
2.85 ± 8% -1.1 1.72 ± 2% perf-stat.overall.branch-miss-rate%
0.36 ± 4% +0.2 0.53 ± 4% perf-stat.overall.cache-miss-rate%
0.39 ± 4% -0.2 0.22 perf-stat.overall.dTLB-load-miss-rate%
0.13 ± 2% -0.0 0.09 perf-stat.overall.dTLB-store-miss-rate%
60.88 ± 4% -9.9 51.02 perf-stat.overall.iTLB-load-miss-rate%
2522 ± 7% +43.6% 3621 ± 5% perf-stat.overall.instructions-per-iTLB-miss
3846863 ± 6% +20.4% 4632272 ± 2% perf-stat.overall.path-length
3.693e+09 ± 8% +34.5% 4.966e+09 ± 3% perf-stat.ps.branch-instructions
1.05e+08 ± 6% -18.7% 85353443 perf-stat.ps.branch-misses
4.123e+08 ± 2% -27.4% 2.994e+08 perf-stat.ps.cache-references
43756 +6.9% 46791 perf-stat.ps.context-switches
18276268 ± 4% -28.3% 13106079 ± 4% perf-stat.ps.dTLB-load-misses
4.689e+09 ± 5% +27.9% 5.999e+09 ± 4% perf-stat.ps.dTLB-loads
4481504 ± 3% -36.3% 2853056 ± 10% perf-stat.ps.dTLB-store-misses
7189784 ± 4% -12.0% 6328997 ± 5% perf-stat.ps.iTLB-load-misses
4629950 ± 9% +31.1% 6067676 ± 3% perf-stat.ps.iTLB-loads
1.809e+10 ± 5% +26.3% 2.285e+10 ± 2% perf-stat.ps.instructions
548686 +5.0% 576249 perf-stat.ps.minor-faults
548625 +5.0% 576236 perf-stat.ps.page-faults
1.154e+12 ± 5% +27.2% 1.467e+12 ± 2% perf-stat.total.instructions
11329 ± 11% +34.1% 15189 ± 27% softirqs.CPU1.SCHED
17634 ± 8% +10.0% 19395 ± 7% softirqs.CPU11.RCU
30187 ± 7% -22.3% 23445 softirqs.CPU112.TIMER
15975 ± 4% +9.6% 17513 softirqs.CPU129.RCU
8458 +31.6% 11129 ± 26% softirqs.CPU14.SCHED
8802 ± 7% +12.0% 9862 ± 3% softirqs.CPU168.SCHED
8805 +12.9% 9939 softirqs.CPU169.SCHED
16049 ± 10% +19.9% 19236 ± 6% softirqs.CPU171.RCU
8868 ± 2% +12.4% 9971 ± 3% softirqs.CPU171.SCHED
28474 ± 12% -14.3% 24400 ± 2% softirqs.CPU172.TIMER
16350 ± 13% +18.6% 19394 ± 5% softirqs.CPU173.RCU
8940 ± 4% +9.8% 9817 ± 3% softirqs.CPU173.SCHED
16281 ± 15% +25.4% 20423 ± 10% softirqs.CPU177.RCU
8679 ± 2% +18.3% 10269 ± 10% softirqs.CPU177.SCHED
8861 ± 2% +12.0% 9926 ± 5% softirqs.CPU182.SCHED
8998 ± 3% +8.3% 9744 ± 4% softirqs.CPU183.SCHED
8820 +12.8% 9947 ± 3% softirqs.CPU185.SCHED
8811 ± 2% +12.7% 9933 ± 4% softirqs.CPU186.SCHED
8748 ± 3% +12.7% 9855 ± 5% softirqs.CPU187.SCHED
8653 ± 9% +13.9% 9859 ± 4% softirqs.CPU189.SCHED
15813 ± 4% +18.7% 18764 softirqs.CPU22.RCU
9725 ± 3% +14.9% 11177 ± 12% softirqs.CPU24.SCHED
16568 ± 7% +13.5% 18799 ± 7% softirqs.CPU25.RCU
8785 ± 5% +16.2% 10211 ± 15% softirqs.CPU30.SCHED
28116 ± 5% -13.1% 24419 ± 2% softirqs.CPU31.TIMER
8617 ± 4% +16.7% 10058 ± 10% softirqs.CPU33.SCHED
8949 ± 6% +13.5% 10158 ± 9% softirqs.CPU60.SCHED
9187 ± 6% +11.7% 10259 ± 5% softirqs.CPU72.SCHED
9053 ± 3% +13.4% 10265 ± 2% softirqs.CPU73.SCHED
16876 ± 11% +13.3% 19127 ± 7% softirqs.CPU75.RCU
16897 ± 11% +17.2% 19797 ± 7% softirqs.CPU76.RCU
16703 ± 13% +19.6% 19985 ± 6% softirqs.CPU77.RCU
8777 ± 4% +14.2% 10022 ± 3% softirqs.CPU77.SCHED
8630 ± 6% +15.4% 9955 ± 3% softirqs.CPU78.SCHED
9162 +12.1% 10269 ± 4% softirqs.CPU80.SCHED
8905 ± 3% +11.5% 9929 ± 4% softirqs.CPU84.SCHED
9097 +11.2% 10116 ± 3% softirqs.CPU85.SCHED
8757 ± 3% +16.5% 10199 ± 7% softirqs.CPU86.SCHED
8699 ± 2% +16.9% 10165 ± 3% softirqs.CPU87.SCHED
8655 +16.4% 10070 ± 2% softirqs.CPU88.SCHED
8727 ± 4% +15.1% 10040 ± 3% softirqs.CPU89.SCHED
16459 ± 5% +16.2% 19134 softirqs.CPU9.RCU
8680 ± 3% +17.7% 10213 ± 17% softirqs.CPU9.SCHED
8853 ± 3% +13.9% 10083 ± 2% softirqs.CPU90.SCHED
8855 ± 3% +13.1% 10017 ± 3% softirqs.CPU91.SCHED
8924 ± 4% +16.3% 10383 ± 14% softirqs.CPU97.SCHED
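For readers checking the numbers, here is a minimal sketch of how the middle column of the table above appears to relate to the two means on either side of it. It rests on two assumptions that the report does not state: that a value with a trailing "%" is the relative change between the parent-commit mean and the tested-commit mean, and that a value without "%" (used on rows whose metric is already a percentage, such as turbostat.C1%) is the difference in percentage points. The function names are made up for this illustration and are not part of lkp-tests.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent (assumed reading of '%change')."""
    return (new - base) / base * 100.0


def point_change(base_pct: float, new_pct: float) -> float:
    """Difference between two percentages, in percentage points (assumed reading)."""
    return new_pct - base_pct


# Means copied from the table: parent commit 737d3c9826 first, 331b89c842 second.
score = pct_change(7937, 8378)          # unixbench.score
c6_time = pct_change(8.01e9, 4.866e8)   # cpuidle.C6.time
c1_share = point_change(0.05, 50.73)    # turbostat.C1%

print(f"unixbench.score  {score:+.1f}%")
print(f"cpuidle.C6.time  {c6_time:+.1f}%")
print(f"turbostat.C1%    {c1_share:+.1f} points")
```

Under those assumptions the three results round to the +5.6%, -93.9% and +50.7 shown in the corresponding rows. Small mismatches on other rows are expected, because the printed means are themselves rounded.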
[plot: unixbench.score, y-axis 0 to 9000]
[plot: unixbench.time.minor_page_faults, y-axis 0 to 4e+07]
[plot: unixbench.time.voluntary_context_switches, y-axis 0 to 1.2e+06]
Each plotted sample is marked as either a bisect-good sample or a bisect-bad sample.
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.4.0-rc6-00138-g331b89c842081" of type "text/plain" (200581 bytes)
View attachment "job-script" of type "text/plain" (7651 bytes)
View attachment "job.yaml" of type "text/plain" (5266 bytes)
View attachment "reproduce" of type "text/plain" (290 bytes)