Open Source and information security mailing list archives
Message-ID: <20210214141833.GE6321@xsang-OptiPlex-9020>
Date: Sun, 14 Feb 2021 22:18:33 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Juergen Gross <jgross@...e.com>
Cc: Borislav Petkov <bp@...e.de>, Andy Lutomirski <luto@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>, x86@...nel.org, lkp@...ts.01.org,
	lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
	zhengjun.xing@...el.com
Subject: [x86/pv] ab234a260b: stress-ng.timerfd.ops_per_sec 6.6% improvement

Greeting,

FYI, we noticed a 6.6% improvement of stress-ng.timerfd.ops_per_sec due to commit:

commit: ab234a260b1f625b26cbefa93ca365b0ae66df33 ("x86/pv: Rework arch_local_irq_restore() to not use popf")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git x86/paravirt

in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
with following parameters:

	nr_threads: 10%
	disk: 1HDD
	testtime: 60s
	fs: ext4
	class: os
	test: timerfd
	cpufreq_governor: performance
	ucode: 0x5003003

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml
        bin/lkp run compatible-job.yaml

=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
  os/gcc-9/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp5/timerfd/stress-ng/60s/0x5003003

commit:
  afd30525a6 ("x86/xen: Drop USERGS_SYSRET64 paravirt call")
  ab234a260b ("x86/pv: Rework arch_local_irq_restore() to not use popf")

afd30525a659ac0a ab234a260b1f625b26cbefa93ca
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          0:4           34%           1:4     perf-profile.calltrace.cycles-pp.error_entry
          3:4           12%           3:4     perf-profile.children.cycles-pp.error_entry
          1:4           -1%           1:4     perf-profile.self.cycles-pp.error_entry
         %stddev     %change         %stddev
             \          |                \
    675.25            +1.6%     686.00        stress-ng.time.percent_of_cpu_this_job_got
    376.77            -1.4%     371.41        stress-ng.time.system_time
     42.56 ±  2%     +28.5%      54.70        stress-ng.time.user_time
 5.309e+08            +6.6%   5.66e+08        stress-ng.timerfd.ops
   8847658            +6.6%    9432727        stress-ng.timerfd.ops_per_sec
      8.81            -1.9%       8.64        iostat.cpu.system
      0.73 ±  2%      +0.2        0.93        mpstat.cpu.all.usr%
    291454            -0.9%     288975        proc-vmstat.numa_local
    293563 ±  2%     +15.2%     338198        softirqs.RCU
   4506538            +4.4%    4706667        vmstat.system.in
      5.75 ± 23%     -26.3%       4.24 ± 10%  perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
      4.95           -14.6%       4.23 ± 10%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
      1797 ±  7%     -13.0%       1563 ±  8%  slabinfo.khugepaged_mm_slot.active_objs
      1797 ±  7%     -13.0%       1563 ±  8%  slabinfo.khugepaged_mm_slot.num_objs
      9508 ±  3%      -8.8%       8672 ±  5%  numa-vmstat.node0.nr_kernel_stack
    655.75 ±  7%     -25.6%     488.00 ± 12%  numa-vmstat.node0.nr_page_table_pages
      9875 ±  5%      -7.8%       9105 ±  4%  numa-vmstat.node0.nr_slab_reclaimable
    565.75 ±  8%     +29.9%     734.75 ±  8%  numa-vmstat.node1.nr_page_table_pages
     39502 ±  5%      -7.8%      36424 ±  4%  numa-meminfo.node0.KReclaimable
      9508 ±  3%      -8.8%       8672 ±  5%  numa-meminfo.node0.KernelStack
      2623 ±  7%     -25.4%       1956 ± 12%  numa-meminfo.node0.PageTables
     39502 ±  5%      -7.8%      36424 ±  4%  numa-meminfo.node0.SReclaimable
      2264 ±  8%     +30.1%       2946 ±  8%  numa-meminfo.node1.PageTables
      0.14 ±  8%     +25.0%       0.18 ±  5%  sched_debug.cfs_rq:/.nr_running.avg
      0.35 ±  3%      +9.6%       0.38 ±  2%  sched_debug.cfs_rq:/.nr_running.stddev
   1047995 ±  7%     +47.2%    1542703 ± 13%  sched_debug.cpu.avg_idle.max
    262.12 ±  4%     +18.3%     310.09 ±  7%  sched_debug.cpu.curr->pid.avg
      0.12 ±  4%     +21.3%       0.14 ±  3%  sched_debug.cpu.nr_running.avg
      0.32 ±  2%     +10.9%       0.35 ±  2%  sched_debug.cpu.nr_running.stddev
    582.50 ± 25%    +337.1%       2546 ±115%  interrupts.CPU1.CAL:Function_call_interrupts
    436.25 ±124%    +221.3%       1401 ± 31%  interrupts.CPU1.NMI:Non-maskable_interrupts
    436.25 ±124%    +221.3%       1401 ± 31%  interrupts.CPU1.PMI:Performance_monitoring_interrupts
    606.25 ± 51%    +262.5%       2197 ±105%  interrupts.CPU11.CAL:Function_call_interrupts
    627.50 ± 20%     -21.0%     495.50        interrupts.CPU18.CAL:Function_call_interrupts
      1327 ± 65%     -90.8%     122.50 ± 23%  interrupts.CPU28.NMI:Non-maskable_interrupts
      1327 ± 65%     -90.8%     122.50 ± 23%  interrupts.CPU28.PMI:Performance_monitoring_interrupts
     96.75 ± 32%    +248.6%     337.25 ± 59%  interrupts.CPU47.NMI:Non-maskable_interrupts
     96.75 ± 32%    +248.6%     337.25 ± 59%  interrupts.CPU47.PMI:Performance_monitoring_interrupts
    318.50 ±128%    +753.6%       2718 ± 58%  interrupts.CPU49.NMI:Non-maskable_interrupts
    318.50 ±128%    +753.6%       2718 ± 58%  interrupts.CPU49.PMI:Performance_monitoring_interrupts
      2698 ± 31%     -59.1%       1104 ± 52%  interrupts.CPU5.NMI:Non-maskable_interrupts
      2698 ± 31%     -59.1%       1104 ± 52%  interrupts.CPU5.PMI:Performance_monitoring_interrupts
   2386946 ± 46%    +184.0%    6779268 ± 30%  interrupts.CPU64.LOC:Local_timer_interrupts
    533.00 ±  5%      -7.1%     495.00        interrupts.CPU68.CAL:Function_call_interrupts
    689256 ± 57%    +222.6%    2223739 ± 33%  interrupts.CPU7.LOC:Local_timer_interrupts
      2.00 ± 93%   +2175.0%      45.50 ±133%  interrupts.CPU7.RES:Rescheduling_interrupts
    431.25 ±132%    +471.4%       2464 ±129%  interrupts.CPU74.NMI:Non-maskable_interrupts
    431.25 ±132%    +471.4%       2464 ±129%  interrupts.CPU74.PMI:Performance_monitoring_interrupts
      2349 ±124%     -93.8%     146.25 ±  6%  interrupts.CPU76.NMI:Non-maskable_interrupts
      2349 ±124%     -93.8%     146.25 ±  6%  interrupts.CPU76.PMI:Performance_monitoring_interrupts
   1890196 ± 62%    +190.4%    5490038 ± 34%  interrupts.CPU79.LOC:Local_timer_interrupts
    107.25 ± 21%    +149.7%     267.75 ± 86%  interrupts.CPU93.NMI:Non-maskable_interrupts
    107.25 ± 21%    +149.7%     267.75 ± 86%  interrupts.CPU93.PMI:Performance_monitoring_interrupts
    124.00 ± 25%    +111.3%     262.00 ± 44%  interrupts.CPU95.NMI:Non-maskable_interrupts
    124.00 ± 25%    +111.3%     262.00 ± 44%  interrupts.CPU95.PMI:Performance_monitoring_interrupts
    994.25 ± 15%     +34.4%       1336 ± 12%  interrupts.RES:Rescheduling_interrupts
 4.801e+09            +6.6%  5.119e+09        perf-stat.i.branch-instructions
  99909476            +5.4%  1.053e+08        perf-stat.i.branch-misses
     17.72 ±  3%      +0.6       18.35 ±  2%  perf-stat.i.cache-miss-rate%
   1664858 ±  7%     +10.4%    1837658        perf-stat.i.cache-misses
      1.17            -4.1%       1.12        perf-stat.i.cpi
 2.758e+10            +1.1%  2.789e+10        perf-stat.i.cpu-cycles
 6.845e+09            +6.2%  7.269e+09        perf-stat.i.dTLB-loads
      0.02 ±  3%      +0.0        0.03 ±  3%  perf-stat.i.dTLB-store-miss-rate%
    998610 ±  4%     +32.4%    1321727 ±  2%  perf-stat.i.dTLB-store-misses
 4.522e+09            +4.6%  4.731e+09        perf-stat.i.dTLB-stores
 2.408e+10            +5.7%  2.545e+10        perf-stat.i.instructions
      0.86            +4.3%       0.90        perf-stat.i.ipc
      0.29            +1.1%       0.29        perf-stat.i.metric.GHz
    168.55            +5.9%     178.46        perf-stat.i.metric.M/sec
      2.08            -0.0        2.06        perf-stat.overall.branch-miss-rate%
      1.15            -4.3%       1.10        perf-stat.overall.cpi
      0.02 ±  4%      +0.0        0.03 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
      0.87            +4.5%       0.91        perf-stat.overall.ipc
 4.723e+09            +6.6%  5.036e+09        perf-stat.ps.branch-instructions
  98286780            +5.4%  1.036e+08        perf-stat.ps.branch-misses
   1638114 ±  7%     +10.4%    1808597        perf-stat.ps.cache-misses
 2.714e+10            +1.1%  2.744e+10        perf-stat.ps.cpu-cycles
 6.734e+09            +6.2%  7.151e+09        perf-stat.ps.dTLB-loads
    982410 ±  4%     +32.4%    1300313 ±  2%  perf-stat.ps.dTLB-store-misses
 4.449e+09            +4.6%  4.654e+09        perf-stat.ps.dTLB-stores
 2.369e+10            +5.7%  2.504e+10        perf-stat.ps.instructions
 1.489e+12            +6.0%  1.578e+12        perf-stat.total.instructions
      9.09 ±  9%      -1.5        7.54 ±  9%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.timerfd_read.vfs_read.ksys_read.do_syscall_64
      3.00 ±  8%      -1.1        1.90 ±  9%  perf-profile.calltrace.cycles-pp.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.00 ±  8%      -1.1        1.90 ±  9%  perf-profile.calltrace.cycles-pp.kern_select.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.96 ±  8%      -1.1        1.86 ±  9%  perf-profile.calltrace.cycles-pp.core_sys_select.kern_select.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.89 ±  8%      -1.1        1.80 ±  9%  perf-profile.calltrace.cycles-pp.do_select.core_sys_select.kern_select.__x64_sys_select.do_syscall_64
      5.18 ±  9%      -1.0        4.17 ±  9%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.timerfd_read.vfs_read.ksys_read
      4.68 ±  9%      -1.0        3.72 ±  9%  perf-profile.calltrace.cycles-pp.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.timerfd_read.vfs_read
      4.64 ±  9%      -1.0        3.69 ±  9%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.timerfd_read
      1.68 ±  9%      -0.5        1.14 ± 10%  perf-profile.calltrace.cycles-pp.timerfd_poll.do_select.core_sys_select.kern_select.__x64_sys_select
      0.77 ± 11%      -0.4        0.40 ± 57%  perf-profile.calltrace.cycles-pp.timerfd_tmrproc.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack
      1.41 ±  5%      +0.3        1.70 ±  8%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      1.46 ±  5%      +0.3        1.79 ±  9%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      1.63 ±  5%      +0.4        2.02 ±  9%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      1.71 ±  4%      +0.4        2.14 ± 10%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.26 ±100%      +0.4        0.70 ± 10%  perf-profile.calltrace.cycles-pp.clockevents_program_event.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      0.27 ±100%      +0.6        0.84 ±  8%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.do_timerfd_gettime.__x64_sys_timerfd_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.22 ±  7%      -2.6        0.62 ±  7%  perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
      2.97 ±  8%      -1.1        1.86 ±  9%  perf-profile.children.cycles-pp.core_sys_select
      2.94 ±  8%      -1.1        1.83 ±  9%  perf-profile.children.cycles-pp.do_select
      3.00 ±  8%      -1.1        1.90 ±  9%  perf-profile.children.cycles-pp.kern_select
      3.00 ±  8%      -1.1        1.90 ±  9%  perf-profile.children.cycles-pp.__x64_sys_select
      1.70 ±  9%      -0.5        1.17 ± 11%  perf-profile.children.cycles-pp.timerfd_poll
      2.65 ± 10%      -0.4        2.20 ± 10%  perf-profile.children.cycles-pp.__fget_light
      2.25 ± 10%      -0.4        1.83 ±  9%  perf-profile.children.cycles-pp.timerfd_tmrproc
      0.30 ±  5%      +0.1        0.41 ± 11%  perf-profile.children.cycles-pp.sync_regs
      3.19 ±  8%      -2.6        0.58 ±  8%  perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
      2.04 ± 10%      -0.5        1.58 ±  9%  perf-profile.self.cycles-pp.__fget_light
      0.18 ±  6%      -0.0        0.13 ± 18%  perf-profile.self.cycles-pp.tick_program_event
      0.30 ±  4%      +0.1        0.40 ± 11%  perf-profile.self.cycles-pp.sync_regs
      0.80 ±  8%      +0.2        1.00 ±  7%  perf-profile.self.cycles-pp.do_timerfd_gettime


                            stress-ng.timerfd.ops_per_sec

  [ASCII plot not recoverable from the archive; y-axis from 5e+06 to 1e+07]


                               stress-ng.time.user_time

  [ASCII plot not recoverable from the archive; y-axis from 0 to 300]


[*] bisect-good sample
[O] bisect-bad  sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Oliver Sang

View attachment "config-5.11.0-rc7-00005-gab234a260b1f" of type "text/plain" (174007 bytes)
View attachment "job-script" of type "text/plain" (8123 bytes)
View attachment "job.yaml" of type "text/plain" (5625 bytes)
View attachment "reproduce" of type "text/plain" (535 bytes)
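
For readers who want to sanity-check the %change column of the report, the sketch below recomputes it from the two per-commit means. It assumes the column is the plain relative difference (new - base) / base; the report itself does not state the formula, and the helper names percent_change and format_changes are made up for this illustration. Rows whose change is printed without a percent sign (for example mpstat.cpu.all.usr%, +0.2) appear to be absolute differences and are not covered here.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent (assumed formula)."""
    return (new - base) / base * 100.0


# Per-commit means copied from the report:
# (parent afd30525a6, patched ab234a260b)
rows = {
    "stress-ng.timerfd.ops_per_sec": (8847658, 9432727),
    "stress-ng.time.user_time": (42.56, 54.70),
    "stress-ng.time.system_time": (376.77, 371.41),
}


def format_changes(table):
    """Return {metric: '+x.y%'} using the same one-decimal style as the report."""
    return {
        name: f"{percent_change(base, new):+.1f}%"
        for name, (base, new) in table.items()
    }


if __name__ == "__main__":
    for name, change in format_changes(rows).items():
        print(f"{change:>8}  {name}")
```

Under that assumption the three rows come out as +6.6%, +28.5% and -1.4%, matching the values in the report.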