Date: Thu, 10 Dec 2020 16:18:59 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Valentin Schneider <valentin.schneider@....com>, Daniel Bristot de Oliveira <bristot@...hat.com>, LKML <linux-kernel@...r.kernel.org>, x86@...nel.org, lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com, aubrey.li@...ux.intel.com, yu.c.chen@...el.com
Subject: [sched/hotplug] 2558aacff8: will-it-scale.per_thread_ops -1.6% regression

Greeting,

FYI, we noticed a -1.6% regression of will-it-scale.per_thread_ops due to commit:

commit: 2558aacff8586699bcd248b406febb28b0a25de2 ("sched/hotplug: Ensure only per-cpu kthreads run during hotplug")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/migrate-disable

in testcase: will-it-scale
on test machine: 144 threads Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory
with following parameters:

	nr_task: 100%
	mode: thread
	test: sched_yield
	cpufreq_governor: performance
	ucode: 0x700001e

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
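The sched_yield testcase boils down to each worker thread spinning on sched_yield() and counting iterations, which is what the per_thread_ops metric reports. The following is only an illustrative Python sketch of that shape (the real will-it-scale harness is C, linked below; thread count and duration here are arbitrary):

```python
# Rough sketch of a per-thread sched_yield ops counter, in the spirit of
# the will-it-scale "sched_yield" testcase. NOT the actual harness.
import os
import threading
import time

def worker(counter, stop):
    # Tight sched_yield() loop; counter[0] approximates per-thread ops.
    while not stop.is_set():
        os.sched_yield()
        counter[0] += 1

def run(nr_threads=4, seconds=0.2):
    counters = [[0] for _ in range(nr_threads)]
    stop = threading.Event()
    threads = [threading.Thread(target=worker, args=(c, stop))
               for c in counters]
    for t in threads:
        t.start()
    time.sleep(seconds)
    stop.set()
    for t in threads:
        t.join()
    # Total ops across all threads; divide by nr_threads for per-thread ops.
    return sum(c[0] for c in counters)
```

A regression in this loop's throughput directly maps to the per_thread_ops numbers in the tables below.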
test-url: https://github.com/antonblanchard/will-it-scale

If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/thread/100%/debian-10.4-x86_64-20200603.cgz/lkp-cpl-4sp1/sched_yield/will-it-scale/0x700001e

commit:
  565790d28b ("sched: Fix balance_callback()")
  2558aacff8 ("sched/hotplug: Ensure only per-cpu kthreads run during hotplug")

       565790d28b1e33ee 2558aacff8586699bcd248b406f
       ---------------- ---------------------------
        fail:runs  %reproduction  fail:runs
            |            |            |
           1:4           0%          1:4   perf-profile.children.cycles-pp.error_entry
           0:4           0%          0:4   perf-profile.self.cycles-pp.error_entry
         %stddev      %change      %stddev
             \            |            \
  4.011e+08              -1.6%   3.945e+08         will-it-scale.144.threads
    2785455              -1.6%     2739520         will-it-scale.per_thread_ops
  4.011e+08              -1.6%   3.945e+08         will-it-scale.workload
      12.05              +2.1        14.18         mpstat.cpu.all.usr%
    1087711 ± 75%       -79.0%      228885 ±  7%   numa-numastat.node1.local_node
    1126029 ± 74%       -74.5%      286894 ±  6%   numa-numastat.node1.numa_hit
      33836              -2.3%       33042         proc-vmstat.nr_slab_reclaimable
      74433              -1.5%       73345         proc-vmstat.nr_slab_unreclaimable
      86.25              -2.3%       84.25         vmstat.cpu.sy
      11.75 ±  3%       +17.0%       13.75 ±  3%   vmstat.cpu.us
     333551 ± 17%       -21.7%      261115 ±  5%   vmstat.system.cs
     329071 ±  3%       -15.4%      278535 ±  4%   sched_debug.cfs_rq:/.spread0.avg
     472614 ±  2%       -11.0%      420678 ±  2%   sched_debug.cfs_rq:/.spread0.max
   17597663 ± 17%       -28.5%    12582107 ± 16%   sched_debug.cpu.nr_switches.max
    1897476 ± 17%       -28.4%     1359264 ± 14%   sched_debug.cpu.nr_switches.stddev
       5628 ±  8%       -10.9%        5012 ±  3%   slabinfo.files_cache.active_objs
       5628 ±  8%       -10.9%        5012 ±  3%   slabinfo.files_cache.num_objs
       3613 ±  2%       -10.9%        3219         slabinfo.kmalloc-rcl-512.active_objs
       3644 ±  2%       -10.9%        3248         slabinfo.kmalloc-rcl-512.num_objs
       3967 ±  4%        -8.3%        3638 ±  2%   slabinfo.sock_inode_cache.active_objs
       3967 ±  4%        -8.3%        3638 ±  2%   slabinfo.sock_inode_cache.num_objs
       0.02 ±  9%       -14.5%        0.02 ±  2%   perf-sched.sch_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
      14.28 ± 38%       +48.9%       21.26 ± 24%   perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
       0.02 ± 24%       +38.2%        0.03 ± 14%   perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
       0.04 ± 13%       -22.8%        0.03 ± 14%   perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
      47.71 ± 30%       +41.6%       67.54 ± 11%   perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
       3.20 ± 33%       -81.5%        0.59 ± 91%   perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      33.43 ± 27%       +38.4%       46.27 ±  8%   perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
       0.05 ± 43%       -68.9%        0.02 ± 63%   perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.path_openat
       8.23 ± 10%       -57.9%        3.47 ± 97%   perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      35211 ±169%       -99.6%      129.50 ± 96%   numa-vmstat.node1.nr_active_anon
       8060 ± 17%       -35.0%        5236 ± 31%   numa-vmstat.node1.nr_slab_reclaimable
      35211 ±169%       -99.6%      129.50 ± 96%   numa-vmstat.node1.nr_zone_active_anon
    1053733 ± 53%       -52.7%      498416 ±  7%   numa-vmstat.node1.numa_hit
     946160 ± 58%       -62.7%      352475 ± 12%   numa-vmstat.node1.numa_local
     107572 ± 23%       +35.7%      145940 ±  5%   numa-vmstat.node1.numa_other
       5914 ± 23%       -28.5%        4226 ± 10%   numa-vmstat.node2.nr_slab_reclaimable
      18204 ±  3%       -14.7%       15522 ± 12%   numa-vmstat.node2.nr_slab_unreclaimable
     629428 ±  9%       -14.2%      540085 ±  7%   numa-vmstat.node2.numa_hit
      17302 ± 10%       +26.0%       21807 ±  7%   numa-vmstat.node3.nr_slab_unreclaimable
     140785 ±169%       -99.6%      520.75 ± 95%   numa-meminfo.node1.Active
     140785 ±169%       -99.6%      520.75 ± 95%   numa-meminfo.node1.Active(anon)
      32241 ± 17%       -35.0%       20948 ± 31%   numa-meminfo.node1.KReclaimable
      32241 ± 17%       -35.0%       20948 ± 31%   numa-meminfo.node1.SReclaimable
     101007 ±  5%       -15.7%       85162 ± 13%   numa-meminfo.node1.Slab
      23657 ± 23%       -28.5%       16906 ± 10%   numa-meminfo.node2.KReclaimable
      23657 ± 23%       -28.5%       16906 ± 10%   numa-meminfo.node2.SReclaimable
      72823 ±  3%       -14.7%       62089 ± 12%   numa-meminfo.node2.SUnreclaim
      96481 ±  3%       -18.1%       78996 ± 12%   numa-meminfo.node2.Slab
      69210 ± 10%       +26.0%       87229 ±  7%   numa-meminfo.node3.SUnreclaim
     110579 ±  9%       +26.7%      140158 ± 10%   numa-meminfo.node3.Slab
     388.75 ± 74%     +1147.8%        4851 ±124%   interrupts.33:PCI-MSI.524291-edge.eth0-TxRx-2
       1540 ± 69%       -78.2%      335.75 ± 51%   interrupts.34:PCI-MSI.524292-edge.eth0-TxRx-3
     388.75 ± 74%     +1147.8%        4851 ±124%   interrupts.CPU11.33:PCI-MSI.524291-edge.eth0-TxRx-2
     307.50             +66.7%      512.50 ± 50%   interrupts.CPU111.RES:Rescheduling_interrupts
       1540 ± 69%       -78.2%      335.75 ± 51%   interrupts.CPU12.34:PCI-MSI.524292-edge.eth0-TxRx-3
     350.50 ±  8%       -10.3%      314.50 ±  2%   interrupts.CPU122.RES:Rescheduling_interrupts
     424.50 ± 24%       -18.7%      345.00 ± 14%   interrupts.CPU128.RES:Rescheduling_interrupts
       8496             -50.1%        4241         interrupts.CPU29.NMI:Non-maskable_interrupts
       8496             -50.1%        4241         interrupts.CPU29.PMI:Performance_monitoring_interrupts
     314.25              +8.7%      341.50 ±  4%   interrupts.CPU29.RES:Rescheduling_interrupts
       8496             -50.1%        4242         interrupts.CPU30.NMI:Non-maskable_interrupts
       8496             -50.1%        4242         interrupts.CPU30.PMI:Performance_monitoring_interrupts
     311.50             +13.2%      352.75 ±  8%   interrupts.CPU7.RES:Rescheduling_interrupts
      21144 ± 15%       -25.0%       15858 ± 24%   interrupts.CPU72.CAL:Function_call_interrupts
     317.75             +39.2%      442.25 ± 32%   interrupts.CPU82.RES:Rescheduling_interrupts
  8.557e+10              -1.8%   8.399e+10         perf-stat.i.branch-instructions
       0.43              +0.4         0.87         perf-stat.i.branch-miss-rate%
  3.479e+08            +106.5%   7.186e+08         perf-stat.i.branch-misses
     333383 ± 17%       -22.1%      259849 ±  6%   perf-stat.i.context-switches
       1.02              +2.4%        1.05         perf-stat.i.cpi
  1.268e+11              -1.9%   1.243e+11         perf-stat.i.dTLB-loads
  7.506e+10              -1.9%   7.363e+10         perf-stat.i.dTLB-stores
   4.26e+08 ±  2%       -31.3%   2.925e+08         perf-stat.i.iTLB-load-misses
     538538 ± 36%       -79.5%      110207 ± 16%   perf-stat.i.iTLB-loads
  3.983e+11              -1.9%   3.908e+11         perf-stat.i.instructions
     946.16 ±  3%       +43.4%        1356         perf-stat.i.instructions-per-iTLB-miss
       0.99              -2.4%        0.97         perf-stat.i.ipc
       1.22 ±  3%       +30.8%        1.60 ±  3%   perf-stat.i.metric.K/sec
       1996              -1.9%        1958         perf-stat.i.metric.M/sec
       0.41              +0.4         0.86         perf-stat.overall.branch-miss-rate%
       1.01              +2.5%        1.03         perf-stat.overall.cpi
     935.95 ±  2%       +42.8%        1336         perf-stat.overall.instructions-per-iTLB-miss
       0.99              -2.4%        0.97         perf-stat.overall.ipc
  8.527e+10              -1.8%    8.37e+10         perf-stat.ps.branch-instructions
  3.467e+08            +106.5%   7.161e+08         perf-stat.ps.branch-misses
     334637 ± 17%       -21.5%      262794 ±  5%   perf-stat.ps.context-switches
  1.264e+11              -1.9%   1.239e+11         perf-stat.ps.dTLB-loads
   7.48e+10              -1.9%   7.338e+10         perf-stat.ps.dTLB-stores
  4.244e+08 ±  2%       -31.3%   2.915e+08         perf-stat.ps.iTLB-load-misses
     539519 ± 36%       -79.5%      110644 ± 16%   perf-stat.ps.iTLB-loads
  3.969e+11              -1.9%   3.895e+11         perf-stat.ps.instructions
    1.2e+14              -2.0%   1.176e+14         perf-stat.total.instructions
       0.68 ±  2%        -0.1         0.59 ±  3%   perf-profile.calltrace.cycles-pp.orc_find.unwind_next_frame.perf_callchain_kernel.get_perf_callchain.perf_callchain
       0.93              -0.1         0.88 ±  3%   perf-profile.calltrace.cycles-pp.__perf_event_header__init_id.perf_prepare_sample.perf_event_output_forward.__perf_event_overflow.perf_swevent_overflow
       1.22              +0.1         1.30 ±  2%   perf-profile.calltrace.cycles-pp.__orc_find.unwind_next_frame.perf_callchain_kernel.get_perf_callchain.perf_callchain
       1.11              +0.1         1.21 ±  2%   perf-profile.calltrace.cycles-pp.orc_find.unwind_next_frame.__unwind_start.perf_callchain_kernel.get_perf_callchain
       1.51              -0.1         1.44 ±  3%   perf-profile.children.cycles-pp.stack_access_ok
       0.37 ±  3%        -0.1         0.30         perf-profile.children.cycles-pp.__task_pid_nr_ns
       0.47 ±  2%        -0.1         0.41 ±  2%   perf-profile.children.cycles-pp.perf_event_pid_type
       0.30 ±  5%        -0.0         0.25 ± 12%   perf-profile.children.cycles-pp.__list_del_entry_valid
       0.95              -0.0         0.90 ±  3%   perf-profile.children.cycles-pp.__perf_event_header__init_id
       0.10 ± 14%        -0.0         0.07 ± 31%   perf-profile.children.cycles-pp.sched_yield@plt
       0.42              -0.0         0.38 ±  5%   perf-profile.children.cycles-pp.ftrace_graph_ret_addr
       0.10 ±  4%        -0.0         0.06 ±  6%   perf-profile.children.cycles-pp.is_module_text_address
       0.11 ±  4%        -0.0         0.09 ±  5%   perf-profile.children.cycles-pp.is_ftrace_trampoline
       0.10 ±  4%        -0.0         0.08 ±  6%   perf-profile.children.cycles-pp.ftrace_ops_trampoline
       0.06 ± 15%        +0.0         0.09 ±  7%   perf-profile.children.cycles-pp.rcu_qs
       0.23 ±  8%        +0.1         0.31 ±  7%   perf-profile.children.cycles-pp.rcu_note_context_switch
       0.35 ±  4%        -0.1         0.29         perf-profile.self.cycles-pp.__task_pid_nr_ns
       1.24              -0.1         1.18 ±  2%   perf-profile.self.cycles-pp.stack_access_ok
       0.23 ±  5%        -0.0         0.18 ± 12%   perf-profile.self.cycles-pp.__list_del_entry_valid
       0.34 ±  4%        -0.0         0.30 ±  2%   perf-profile.self.cycles-pp.perf_tp_event
       0.12 ±  4%        -0.0         0.10 ± 10%   perf-profile.self.cycles-pp.sched_clock_cpu
       0.32              -0.0         0.30 ±  5%   perf-profile.self.cycles-pp.ftrace_graph_ret_addr
       0.08              -0.0         0.06 ±  6%   perf-profile.self.cycles-pp.ftrace_ops_trampoline
       0.34              -0.0         0.33 ±  2%   perf-profile.self.cycles-pp.unwind_get_return_address
       0.08              +0.0         0.10 ±  7%   perf-profile.self.cycles-pp.rcu_is_watching
       0.05 ±  8%        +0.0         0.09 ±  7%   perf-profile.self.cycles-pp.rcu_qs
       0.51              +0.0         0.55 ±  3%   perf-profile.self.cycles-pp.bsearch
       0.16 ± 13%        +0.0         0.20 ±  7%   perf-profile.self.cycles-pp.rcu_note_context_switch
       1.44 ±  4%        +0.3         1.76 ± 12%   perf-profile.self.cycles-pp.__sched_yield

will-it-scale.per_thread_ops

  [ASCII trend plot, y-axis 0 to 3e+06: bisect-good samples (*) and bisect-bad samples (O) of will-it-scale.per_thread_ops; the plot's column alignment was lost in archiving and is not reproduced here]

                [*] bisect-good sample
                [O] bisect-bad sample

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.

Thanks,
Oliver Sang

View attachment "config-5.10.0-rc1-00033-g2558aacff858" of type "text/plain" (171488 bytes)
View attachment "job-script" of type "text/plain" (7947 bytes)
View attachment "job.yaml" of type "text/plain" (5522 bytes)
View attachment "reproduce" of type "text/plain" (343 bytes)
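As a sanity check on the report above, the derived perf-stat rows (branch-miss-rate% and instructions-per-iTLB-miss) can be recomputed from the quoted raw perf-stat.ps counters. The small deviations from the reported values are expected, since LKP reports per-sample averages of the ratios rather than ratios of totals; this sketch just confirms the numbers are internally consistent:

```python
# Cross-check of derived perf-stat metrics using the raw perf-stat.ps
# counters quoted in the table above (base commit 565790d28b vs
# patched commit 2558aacff8).

def miss_rate_pct(misses, total):
    # branch-miss-rate% = branch-misses / branch-instructions * 100
    return 100.0 * misses / total

def insns_per_itlb_miss(instructions, itlb_misses):
    # instructions-per-iTLB-miss = instructions / iTLB-load-misses
    return instructions / itlb_misses

base_rate = miss_rate_pct(3.467e8, 8.527e10)        # reported 0.41%
new_rate  = miss_rate_pct(7.161e8, 8.37e10)         # reported 0.86%
base_ipmi = insns_per_itlb_miss(3.969e11, 4.244e8)  # reported 935.95
new_ipmi  = insns_per_itlb_miss(3.895e11, 2.915e8)  # reported 1336
```

The doubled branch-miss rate and the sharp drop in iTLB-load-misses are the two counter-level shifts that accompany the -1.6% throughput regression.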