Message-ID: <202501261509.b6b4260d-lkp@intel.com>
Date: Sun, 26 Jan 2025 16:16:04 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Al Viro <viro@...iv.linux.org.uk>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Christian Brauner <brauner@...nel.org>, <linux-fsdevel@...r.kernel.org>,
<oliver.sang@...el.com>
Subject: [linus:master] [do_pollfd()] 8935989798:
will-it-scale.per_process_ops 11.7% regression
Hello,
kernel test robot noticed an 11.7% regression of will-it-scale.per_process_ops on:
commit: 89359897983825dbfc08578e7ee807aaf24d9911 ("do_pollfd(): convert to CLASS(fd)")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[test failed on linus/master b46c89c08f4146e7987fc355941a93b12e2c03ef]
[test failed on linux-next/master 5ffa57f6eecefababb8cbe327222ef171943b183]
testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:
nr_task: 100%
mode: process
test: poll2
cpufreq_governor: performance
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202501261509.b6b4260d-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250126/202501261509.b6b4260d-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/poll2/will-it-scale
commit:
d000e073ca ("convert do_select()")
8935989798 ("do_pollfd(): convert to CLASS(fd)")
d000e073ca2a08ab 89359897983825dbfc08578e7ee
---------------- ---------------------------
%stddev %change %stddev
\ | \
21281 ±147% +197.5% 63313 ± 84% numa-meminfo.node0.Shmem
5318 ±147% +197.5% 15825 ± 84% numa-vmstat.node0.nr_shmem
27370126 -11.7% 24170828 will-it-scale.104.processes
263173 -11.7% 232411 will-it-scale.per_process_ops
27370126 -11.7% 24170828 will-it-scale.workload
0.12 ± 16% -42.1% 0.07 ± 42% perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
4.33 ± 28% +154.2% 11.02 ± 61% perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
268.62 ± 53% -61.2% 104.10 ±114% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
1053 ± 6% -17.1% 873.33 ± 15% perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
1687 ± 10% +11.7% 1884 ± 6% perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
3519 ± 4% +11.2% 3913 ± 5% perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
8.67 ± 28% +154.2% 22.04 ± 61% perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
268.45 ± 53% -61.4% 103.72 ±115% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
4.33 ± 28% +154.2% 11.02 ± 61% perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
0.01 ± 2% +10.0% 0.01 perf-stat.i.MPKI
5.157e+10 -11.7% 4.554e+10 perf-stat.i.branch-instructions
1.573e+08 -11.8% 1.387e+08 perf-stat.i.branch-misses
0.97 +13.1% 1.09 perf-stat.i.cpi
2.9e+11 -11.7% 2.561e+11 perf-stat.i.instructions
1.04 -11.7% 0.91 perf-stat.i.ipc
0.00 ± 2% +17.9% 0.00 perf-stat.overall.MPKI
0.96 +13.2% 1.09 perf-stat.overall.cpi
1.04 -11.7% 0.92 perf-stat.overall.ipc
5.14e+10 -11.7% 4.538e+10 perf-stat.ps.branch-instructions
1.567e+08 -11.8% 1.382e+08 perf-stat.ps.branch-misses
2.891e+11 -11.7% 2.552e+11 perf-stat.ps.instructions
8.743e+13 -11.7% 7.724e+13 perf-stat.total.instructions
7.61 -0.6 7.03 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__poll
6.16 -0.5 5.66 perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__poll
5.11 ± 2% -0.5 4.62 ± 2% perf-profile.calltrace.cycles-pp.testcase
2.92 ± 2% -0.4 2.55 ± 2% perf-profile.calltrace.cycles-pp._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.91 -0.3 2.60 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__poll
1.92 ± 5% -0.3 1.67 ± 4% perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
2.12 -0.2 1.91 perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.32 -0.2 1.17 perf-profile.calltrace.cycles-pp.__kmalloc_noprof.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.84 -0.1 1.72 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__poll
0.98 -0.1 0.88 ± 2% perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
0.97 -0.1 0.88 perf-profile.calltrace.cycles-pp.kfree.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.72 -0.1 0.66 perf-profile.calltrace.cycles-pp.__virt_addr_valid.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll
0.62 -0.1 0.57 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
94.36 +0.5 94.89 perf-profile.calltrace.cycles-pp.__poll
75.76 +2.0 77.76 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
71.45 +2.4 73.83 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
69.72 +2.5 72.24 perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
69.19 +2.6 71.77 perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
54.05 +4.1 58.18 perf-profile.calltrace.cycles-pp.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
38.56 +4.5 43.08 perf-profile.calltrace.cycles-pp.fdget.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
7.68 -0.6 7.10 perf-profile.children.cycles-pp.syscall_return_via_sysret
6.61 -0.6 6.06 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
5.12 ± 2% -0.5 4.64 ± 2% perf-profile.children.cycles-pp.testcase
3.15 ± 2% -0.4 2.74 ± 2% perf-profile.children.cycles-pp._copy_from_user
3.70 -0.4 3.33 perf-profile.children.cycles-pp.entry_SYSCALL_64
1.94 ± 4% -0.3 1.69 ± 4% perf-profile.children.cycles-pp.rep_movs_alternative
2.26 -0.2 2.04 perf-profile.children.cycles-pp.__check_object_size
1.35 -0.2 1.19 perf-profile.children.cycles-pp.__kmalloc_noprof
1.04 -0.1 0.94 perf-profile.children.cycles-pp.check_heap_object
0.97 -0.1 0.88 perf-profile.children.cycles-pp.kfree
1.07 -0.1 1.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.74 -0.1 0.66 perf-profile.children.cycles-pp.__virt_addr_valid
0.57 -0.1 0.50 perf-profile.children.cycles-pp.__check_heap_object
0.63 -0.0 0.58 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.22 ± 2% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.check_stack_object
0.18 ± 3% -0.0 0.16 perf-profile.children.cycles-pp.__cond_resched
0.07 ± 6% -0.0 0.06 perf-profile.children.cycles-pp.is_vmalloc_addr
0.13 -0.0 0.12 ± 3% perf-profile.children.cycles-pp.x64_sys_call
0.34 -0.0 0.33 perf-profile.children.cycles-pp.__hrtimer_run_queues
0.12 ± 3% -0.0 0.11 perf-profile.children.cycles-pp.rcu_all_qs
94.98 +0.5 95.45 perf-profile.children.cycles-pp.__poll
75.89 +2.0 77.89 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
71.52 +2.4 73.89 perf-profile.children.cycles-pp.do_syscall_64
69.78 +2.5 72.29 perf-profile.children.cycles-pp.__x64_sys_poll
69.28 +2.6 71.85 perf-profile.children.cycles-pp.do_sys_poll
54.18 +4.1 58.28 perf-profile.children.cycles-pp.do_poll
38.44 +4.6 43.00 perf-profile.children.cycles-pp.fdget
7.24 -0.6 6.60 perf-profile.self.cycles-pp.do_sys_poll
7.68 -0.6 7.09 perf-profile.self.cycles-pp.syscall_return_via_sysret
16.95 -0.6 16.39 perf-profile.self.cycles-pp.do_poll
6.55 -0.5 6.00 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
4.93 ± 2% -0.5 4.46 ± 2% perf-profile.self.cycles-pp.testcase
4.46 -0.4 4.06 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
3.25 -0.3 2.92 perf-profile.self.cycles-pp.entry_SYSCALL_64
1.78 ± 5% -0.2 1.54 ± 4% perf-profile.self.cycles-pp.rep_movs_alternative
1.34 -0.2 1.18 perf-profile.self.cycles-pp._copy_from_user
1.16 -0.1 1.02 ± 2% perf-profile.self.cycles-pp.__kmalloc_noprof
0.96 -0.1 0.87 perf-profile.self.cycles-pp.kfree
0.68 -0.1 0.61 ± 2% perf-profile.self.cycles-pp.__virt_addr_valid
0.56 -0.1 0.50 perf-profile.self.cycles-pp.__check_heap_object
0.43 -0.0 0.39 perf-profile.self.cycles-pp.__x64_sys_poll
0.49 -0.0 0.45 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.29 ± 2% -0.0 0.26 ± 3% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.19 -0.0 0.17 ± 3% perf-profile.self.cycles-pp.check_stack_object
0.26 -0.0 0.25 ± 3% perf-profile.self.cycles-pp.check_heap_object
0.12 -0.0 0.11 ± 3% perf-profile.self.cycles-pp.x64_sys_call
36.98 +4.6 41.62 perf-profile.self.cycles-pp.fdget
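The headline figures can be cross-checked from the raw values in the table above. The following is a minimal sketch for that arithmetic only; the helper name `pct_change` is hypothetical and is not part of lkp-tests. It assumes that the %change column for throughput rows is a relative change against the base commit, and that the perf-profile rows report an absolute difference in percentage points, which is what the numbers suggest.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0

# Values copied from the comparison table (base commit vs. tested commit).
base_ops, new_ops = 263173, 232411     # will-it-scale.per_process_ops
base_wl, new_wl = 27370126, 24170828   # will-it-scale.workload

ops_delta = pct_change(base_ops, new_ops)
wl_delta = pct_change(base_wl, new_wl)
print(f"per_process_ops: {ops_delta:+.1f}%")   # -11.7%
print(f"workload:        {wl_delta:+.1f}%")    # -11.7%

# perf-profile rows look like absolute differences in percentage points:
# self cycles in fdget went from 36.98% to 41.62%.
fdget_delta = 41.62 - 36.98
print(f"fdget self cycles: {fdget_delta:+.1f} points")   # +4.6 points

# CPI is the reciprocal of IPC, so the two perf-stat rows should agree.
cpi_base = 1 / 1.04   # perf-stat.overall.ipc, base commit
cpi_new = 1 / 0.92    # perf-stat.overall.ipc, tested commit
print(f"cpi from ipc: {cpi_base:.2f} -> {cpi_new:.2f}")   # 0.96 -> 1.09
```

The recomputed values match the reported -11.7% throughput change, the +4.6 point increase in fdget self cycles, and the overall CPI of 0.96 and 1.09.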
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki