Message-ID: <20210214141833.GE6321@xsang-OptiPlex-9020>
Date: Sun, 14 Feb 2021 22:18:33 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Juergen Gross <jgross@...e.com>
Cc: Borislav Petkov <bp@...e.de>, Andy Lutomirski <luto@...nel.org>,
LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [x86/pv] ab234a260b: stress-ng.timerfd.ops_per_sec 6.6% improvement
Greetings,
FYI, we noticed a 6.6% improvement of stress-ng.timerfd.ops_per_sec due to commit:
commit: ab234a260b1f625b26cbefa93ca365b0ae66df33 ("x86/pv: Rework arch_local_irq_restore() to not use popf")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git x86/paravirt
in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
with the following parameters:
nr_threads: 10%
disk: 1HDD
testtime: 60s
fs: ext4
class: os
test: timerfd
cpufreq_governor: performance
ucode: 0x5003003
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
os/gcc-9/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp5/timerfd/stress-ng/60s/0x5003003
commit:
afd30525a6 ("x86/xen: Drop USERGS_SYSRET64 paravirt call")
ab234a260b ("x86/pv: Rework arch_local_irq_restore() to not use popf")
afd30525a659ac0a ab234a260b1f625b26cbefa93ca
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
0:4 34% 1:4 perf-profile.calltrace.cycles-pp.error_entry
3:4 12% 3:4 perf-profile.children.cycles-pp.error_entry
1:4 -1% 1:4 perf-profile.self.cycles-pp.error_entry
%stddev %change %stddev
\ | \
675.25 +1.6% 686.00 stress-ng.time.percent_of_cpu_this_job_got
376.77 -1.4% 371.41 stress-ng.time.system_time
42.56 ± 2% +28.5% 54.70 stress-ng.time.user_time
5.309e+08 +6.6% 5.66e+08 stress-ng.timerfd.ops
8847658 +6.6% 9432727 stress-ng.timerfd.ops_per_sec
8.81 -1.9% 8.64 iostat.cpu.system
0.73 ± 2% +0.2 0.93 mpstat.cpu.all.usr%
291454 -0.9% 288975 proc-vmstat.numa_local
293563 ± 2% +15.2% 338198 softirqs.RCU
4506538 +4.4% 4706667 vmstat.system.in
5.75 ± 23% -26.3% 4.24 ± 10% perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
4.95 -14.6% 4.23 ± 10% perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
1797 ± 7% -13.0% 1563 ± 8% slabinfo.khugepaged_mm_slot.active_objs
1797 ± 7% -13.0% 1563 ± 8% slabinfo.khugepaged_mm_slot.num_objs
9508 ± 3% -8.8% 8672 ± 5% numa-vmstat.node0.nr_kernel_stack
655.75 ± 7% -25.6% 488.00 ± 12% numa-vmstat.node0.nr_page_table_pages
9875 ± 5% -7.8% 9105 ± 4% numa-vmstat.node0.nr_slab_reclaimable
565.75 ± 8% +29.9% 734.75 ± 8% numa-vmstat.node1.nr_page_table_pages
39502 ± 5% -7.8% 36424 ± 4% numa-meminfo.node0.KReclaimable
9508 ± 3% -8.8% 8672 ± 5% numa-meminfo.node0.KernelStack
2623 ± 7% -25.4% 1956 ± 12% numa-meminfo.node0.PageTables
39502 ± 5% -7.8% 36424 ± 4% numa-meminfo.node0.SReclaimable
2264 ± 8% +30.1% 2946 ± 8% numa-meminfo.node1.PageTables
0.14 ± 8% +25.0% 0.18 ± 5% sched_debug.cfs_rq:/.nr_running.avg
0.35 ± 3% +9.6% 0.38 ± 2% sched_debug.cfs_rq:/.nr_running.stddev
1047995 ± 7% +47.2% 1542703 ± 13% sched_debug.cpu.avg_idle.max
262.12 ± 4% +18.3% 310.09 ± 7% sched_debug.cpu.curr->pid.avg
0.12 ± 4% +21.3% 0.14 ± 3% sched_debug.cpu.nr_running.avg
0.32 ± 2% +10.9% 0.35 ± 2% sched_debug.cpu.nr_running.stddev
582.50 ± 25% +337.1% 2546 ±115% interrupts.CPU1.CAL:Function_call_interrupts
436.25 ±124% +221.3% 1401 ± 31% interrupts.CPU1.NMI:Non-maskable_interrupts
436.25 ±124% +221.3% 1401 ± 31% interrupts.CPU1.PMI:Performance_monitoring_interrupts
606.25 ± 51% +262.5% 2197 ±105% interrupts.CPU11.CAL:Function_call_interrupts
627.50 ± 20% -21.0% 495.50 interrupts.CPU18.CAL:Function_call_interrupts
1327 ± 65% -90.8% 122.50 ± 23% interrupts.CPU28.NMI:Non-maskable_interrupts
1327 ± 65% -90.8% 122.50 ± 23% interrupts.CPU28.PMI:Performance_monitoring_interrupts
96.75 ± 32% +248.6% 337.25 ± 59% interrupts.CPU47.NMI:Non-maskable_interrupts
96.75 ± 32% +248.6% 337.25 ± 59% interrupts.CPU47.PMI:Performance_monitoring_interrupts
318.50 ±128% +753.6% 2718 ± 58% interrupts.CPU49.NMI:Non-maskable_interrupts
318.50 ±128% +753.6% 2718 ± 58% interrupts.CPU49.PMI:Performance_monitoring_interrupts
2698 ± 31% -59.1% 1104 ± 52% interrupts.CPU5.NMI:Non-maskable_interrupts
2698 ± 31% -59.1% 1104 ± 52% interrupts.CPU5.PMI:Performance_monitoring_interrupts
2386946 ± 46% +184.0% 6779268 ± 30% interrupts.CPU64.LOC:Local_timer_interrupts
533.00 ± 5% -7.1% 495.00 interrupts.CPU68.CAL:Function_call_interrupts
689256 ± 57% +222.6% 2223739 ± 33% interrupts.CPU7.LOC:Local_timer_interrupts
2.00 ± 93% +2175.0% 45.50 ±133% interrupts.CPU7.RES:Rescheduling_interrupts
431.25 ±132% +471.4% 2464 ±129% interrupts.CPU74.NMI:Non-maskable_interrupts
431.25 ±132% +471.4% 2464 ±129% interrupts.CPU74.PMI:Performance_monitoring_interrupts
2349 ±124% -93.8% 146.25 ± 6% interrupts.CPU76.NMI:Non-maskable_interrupts
2349 ±124% -93.8% 146.25 ± 6% interrupts.CPU76.PMI:Performance_monitoring_interrupts
1890196 ± 62% +190.4% 5490038 ± 34% interrupts.CPU79.LOC:Local_timer_interrupts
107.25 ± 21% +149.7% 267.75 ± 86% interrupts.CPU93.NMI:Non-maskable_interrupts
107.25 ± 21% +149.7% 267.75 ± 86% interrupts.CPU93.PMI:Performance_monitoring_interrupts
124.00 ± 25% +111.3% 262.00 ± 44% interrupts.CPU95.NMI:Non-maskable_interrupts
124.00 ± 25% +111.3% 262.00 ± 44% interrupts.CPU95.PMI:Performance_monitoring_interrupts
994.25 ± 15% +34.4% 1336 ± 12% interrupts.RES:Rescheduling_interrupts
4.801e+09 +6.6% 5.119e+09 perf-stat.i.branch-instructions
99909476 +5.4% 1.053e+08 perf-stat.i.branch-misses
17.72 ± 3% +0.6 18.35 ± 2% perf-stat.i.cache-miss-rate%
1664858 ± 7% +10.4% 1837658 perf-stat.i.cache-misses
1.17 -4.1% 1.12 perf-stat.i.cpi
2.758e+10 +1.1% 2.789e+10 perf-stat.i.cpu-cycles
6.845e+09 +6.2% 7.269e+09 perf-stat.i.dTLB-loads
0.02 ± 3% +0.0 0.03 ± 3% perf-stat.i.dTLB-store-miss-rate%
998610 ± 4% +32.4% 1321727 ± 2% perf-stat.i.dTLB-store-misses
4.522e+09 +4.6% 4.731e+09 perf-stat.i.dTLB-stores
2.408e+10 +5.7% 2.545e+10 perf-stat.i.instructions
0.86 +4.3% 0.90 perf-stat.i.ipc
0.29 +1.1% 0.29 perf-stat.i.metric.GHz
168.55 +5.9% 178.46 perf-stat.i.metric.M/sec
2.08 -0.0 2.06 perf-stat.overall.branch-miss-rate%
1.15 -4.3% 1.10 perf-stat.overall.cpi
0.02 ± 4% +0.0 0.03 ± 2% perf-stat.overall.dTLB-store-miss-rate%
0.87 +4.5% 0.91 perf-stat.overall.ipc
4.723e+09 +6.6% 5.036e+09 perf-stat.ps.branch-instructions
98286780 +5.4% 1.036e+08 perf-stat.ps.branch-misses
1638114 ± 7% +10.4% 1808597 perf-stat.ps.cache-misses
2.714e+10 +1.1% 2.744e+10 perf-stat.ps.cpu-cycles
6.734e+09 +6.2% 7.151e+09 perf-stat.ps.dTLB-loads
982410 ± 4% +32.4% 1300313 ± 2% perf-stat.ps.dTLB-store-misses
4.449e+09 +4.6% 4.654e+09 perf-stat.ps.dTLB-stores
2.369e+10 +5.7% 2.504e+10 perf-stat.ps.instructions
1.489e+12 +6.0% 1.578e+12 perf-stat.total.instructions
9.09 ± 9% -1.5 7.54 ± 9% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.timerfd_read.vfs_read.ksys_read.do_syscall_64
3.00 ± 8% -1.1 1.90 ± 9% perf-profile.calltrace.cycles-pp.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.00 ± 8% -1.1 1.90 ± 9% perf-profile.calltrace.cycles-pp.kern_select.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.96 ± 8% -1.1 1.86 ± 9% perf-profile.calltrace.cycles-pp.core_sys_select.kern_select.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.89 ± 8% -1.1 1.80 ± 9% perf-profile.calltrace.cycles-pp.do_select.core_sys_select.kern_select.__x64_sys_select.do_syscall_64
5.18 ± 9% -1.0 4.17 ± 9% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.timerfd_read.vfs_read.ksys_read
4.68 ± 9% -1.0 3.72 ± 9% perf-profile.calltrace.cycles-pp.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.timerfd_read.vfs_read
4.64 ± 9% -1.0 3.69 ± 9% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.timerfd_read
1.68 ± 9% -0.5 1.14 ± 10% perf-profile.calltrace.cycles-pp.timerfd_poll.do_select.core_sys_select.kern_select.__x64_sys_select
0.77 ± 11% -0.4 0.40 ± 57% perf-profile.calltrace.cycles-pp.timerfd_tmrproc.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack
1.41 ± 5% +0.3 1.70 ± 8% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
1.46 ± 5% +0.3 1.79 ± 9% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
1.63 ± 5% +0.4 2.02 ± 9% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
1.71 ± 4% +0.4 2.14 ± 10% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt
0.26 ±100% +0.4 0.70 ± 10% perf-profile.calltrace.cycles-pp.clockevents_program_event.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.27 ±100% +0.6 0.84 ± 8% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.do_timerfd_gettime.__x64_sys_timerfd_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.22 ± 7% -2.6 0.62 ± 7% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
2.97 ± 8% -1.1 1.86 ± 9% perf-profile.children.cycles-pp.core_sys_select
2.94 ± 8% -1.1 1.83 ± 9% perf-profile.children.cycles-pp.do_select
3.00 ± 8% -1.1 1.90 ± 9% perf-profile.children.cycles-pp.kern_select
3.00 ± 8% -1.1 1.90 ± 9% perf-profile.children.cycles-pp.__x64_sys_select
1.70 ± 9% -0.5 1.17 ± 11% perf-profile.children.cycles-pp.timerfd_poll
2.65 ± 10% -0.4 2.20 ± 10% perf-profile.children.cycles-pp.__fget_light
2.25 ± 10% -0.4 1.83 ± 9% perf-profile.children.cycles-pp.timerfd_tmrproc
0.30 ± 5% +0.1 0.41 ± 11% perf-profile.children.cycles-pp.sync_regs
3.19 ± 8% -2.6 0.58 ± 8% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
2.04 ± 10% -0.5 1.58 ± 9% perf-profile.self.cycles-pp.__fget_light
0.18 ± 6% -0.0 0.13 ± 18% perf-profile.self.cycles-pp.tick_program_event
0.30 ± 4% +0.1 0.40 ± 11% perf-profile.self.cycles-pp.sync_regs
0.80 ± 8% +0.2 1.00 ± 7% perf-profile.self.cycles-pp.do_timerfd_gettime
stress-ng.timerfd.ops_per_sec
1e+07 +-----------------------------------------------------------------+
9.5e+06 |-+ O O OO OO O O O O O O OO O O |
| O OO O O O O O O O O O O O O O O O |
9e+06 |.+.++.+.+.++.+.+.+.++.+.+.+.+ +.+.++.+.+.+.++.+.+.+.++.+ |
8.5e+06 |-+ : : |
| : : |
8e+06 |-+ : : |
7.5e+06 |-+ : : |
7e+06 |-+ :: |
| :: |
6.5e+06 |-+ :: |
6e+06 |-+ : |
| : |
5.5e+06 |-+ + O O |
5e+06 +-----------------------------------------------------------------+
stress-ng.time.user_time
300 +---------------------------------------------------------------------+
| |
250 |-+ O O |
| + |
| : |
200 |-+ : |
| : : |
150 |-+ : : |
| : : |
100 |-+ : : |
| : : |
| O O O OO O O O O O O OO O : O : O O O O OO O O O O O OO O O O |
50 |.+.+.+.++.+.+.+.+.+.+.+.++.+.+ +.+.+.+.+.++.+.+.+.+.+.+.+.+ |
| |
0 +---------------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Oliver Sang
View attachment "config-5.11.0-rc7-00005-gab234a260b1f" of type "text/plain" (174007 bytes)
View attachment "job-script" of type "text/plain" (8123 bytes)
View attachment "job.yaml" of type "text/plain" (5625 bytes)
View attachment "reproduce" of type "text/plain" (535 bytes)