Message-ID: <Z5ilYwlw9+8/9N3U@xsang-OptiPlex-9020>
Date: Tue, 28 Jan 2025 17:37:39 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: Al Viro <viro@...iv.linux.org.uk>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Christian Brauner <brauner@...nel.org>, <linux-fsdevel@...r.kernel.org>,
<oliver.sang@...el.com>
Subject: Re: [linus:master] [do_pollfd()] 8935989798:
will-it-scale.per_process_ops 11.7% regression
Hi Al Viro,
On Mon, Jan 27, 2025 at 07:26:16PM +0000, Al Viro wrote:
> On Sun, Jan 26, 2025 at 04:16:04PM +0800, kernel test robot wrote:
> >
> >
> > Hello,
> >
> > kernel test robot noticed an 11.7% regression of will-it-scale.per_process_ops on:
> >
> >
> > commit: 89359897983825dbfc08578e7ee807aaf24d9911 ("do_pollfd(): convert to CLASS(fd)")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > [test failed on linus/master b46c89c08f4146e7987fc355941a93b12e2c03ef]
> > [test failed on linux-next/master 5ffa57f6eecefababb8cbe327222ef171943b183]
> >
> > testcase: will-it-scale
> > config: x86_64-rhel-9.4
> > compiler: gcc-12
> > test machine: 104 threads 2 sockets (Skylake) with 192G memory
> > parameters:
> >
> > nr_task: 100%
> > mode: process
> > test: poll2
> > cpufreq_governor: performance
> >
> >
> >
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add the following tags
> > | Reported-by: kernel test robot <oliver.sang@...el.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202501261509.b6b4260d-lkp@intel.com
> >
> >
> > Details are as below:
> > -------------------------------------------------------------------------------------------------->
> >
> >
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20250126/202501261509.b6b4260d-lkp@intel.com
>
> Very interesting... Looking at the generated asm, two things seem to
> change in there: the "we need an fput()" case in the (now implicit)
> fdput() in do_pollfd() is no longer out of line, and slightly different
> spills are done in do_poll().
>
> Just to make sure it's not a genuine change of logic somewhere,
> could you compare d000e073ca2a, 893598979838 and d000e073ca2a with the
> delta below? That delta provably is an equivalent transformation - all
> exits from do_pollfd() go through the return in the end, so that just
> shifts the last assignment in there into the caller.
'd000e073ca2a with the delta below' scores very close to plain
d000e073ca2a (-0.5% on will-it-scale.per_process_ops), as shown below.
Tested-by: kernel test robot <oliver.sang@...el.com>
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/poll2/will-it-scale
commit:
d000e073ca ("convert do_select()")
8935989798 ("do_pollfd(): convert to CLASS(fd)")
2c43a225261 <--- d000e073ca with the delta below
d000e073ca2a08ab 89359897983825dbfc08578e7ee 2c43a2252614bf1692ef2ad5a46
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
263173 -11.7% 232411 -0.5% 261953 will-it-scale.per_process_ops
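For reference, the %change columns appear to be relative to the first
(base) column. A minimal C sketch of that arithmetic, using the three
per_process_ops values from the table above; the names percent_change and
print_row are made up for illustration and are not part of the lkp tooling:

```c
#include <stdio.h>

/* Hypothetical helper: relative change of `after` versus `before`, in percent. */
double percent_change(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Print one row in roughly the same layout as the comparison table. */
void print_row(const char *metric, double base, double bad, double delta)
{
	printf("%10.0f %+6.1f%% %10.0f %+6.1f%% %10.0f  %s\n",
	       base, percent_change(base, bad), bad,
	       percent_change(base, delta), delta, metric);
}
```

Calling print_row("will-it-scale.per_process_ops", 263173, 232411, 261953)
should give the -11.7% and -0.5% figures reported here.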
Full comparison below, FYI.
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/poll2/will-it-scale
commit:
d000e073ca ("convert do_select()")
8935989798 ("do_pollfd(): convert to CLASS(fd)")
2c43a225261 <--- d000e073ca with the delta below
d000e073ca2a08ab 89359897983825dbfc08578e7ee 2c43a2252614bf1692ef2ad5a46
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
1.98e+08 ± 12% +15.7% 2.29e+08 ± 18% -13.1% 1.721e+08 cpuidle..time
21281 ±147% +197.5% 63313 ± 84% +180.7% 59731 ± 86% numa-meminfo.node0.Shmem
5318 ±147% +197.5% 15825 ± 84% +180.7% 14930 ± 86% numa-vmstat.node0.nr_shmem
88607 +0.2% 88803 -1.5% 87297 proc-vmstat.nr_shmem
11118 ± 15% +13.6% 12633 ± 51% -27.7% 8034 ± 10% proc-vmstat.numa_hint_faults_local
21894 ± 4% +135.8% 51630 ±124% +144.5% 53539 ±117% sched_debug.cfs_rq:/.load.max
2575 ± 4% +106.7% 5323 ±112% +115.5% 5548 ±106% sched_debug.cfs_rq:/.load.stddev
3940 ± 18% -19.1% 3188 ± 8% -25.5% 2933 ± 20% sched_debug.cpu.avg_idle.min
27370126 -11.7% 24170828 -0.5% 27243222 will-it-scale.104.processes
263173 -11.7% 232411 -0.5% 261953 will-it-scale.per_process_ops
27370126 -11.7% 24170828 -0.5% 27243222 will-it-scale.workload
0.12 ± 16% -42.1% 0.07 ± 42% -36.3% 0.07 ± 35% perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
4.33 ± 28% +154.2% 11.02 ± 61% +86.2% 8.07 ± 83% perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
2.27 ± 22% -34.2% 1.49 ± 66% -48.9% 1.16 ± 36% perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
268.62 ± 53% -61.2% 104.10 ±114% -39.4% 162.90 ± 82% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
1053 ± 6% -17.1% 873.33 ± 15% -4.6% 1004 ± 11% perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
1687 ± 10% +11.7% 1884 ± 6% +5.3% 1777 ± 10% perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
3519 ± 4% +11.2% 3913 ± 5% +3.9% 3656 ± 5% perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
8.67 ± 28% +154.2% 22.04 ± 61% +86.2% 16.14 ± 83% perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
268.45 ± 53% -61.4% 103.72 ±115% -39.5% 162.49 ± 83% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
4.33 ± 28% +154.2% 11.02 ± 61% +86.2% 8.07 ± 83% perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
0.01 ± 2% +10.0% 0.01 +0.7% 0.01 ± 2% perf-stat.i.MPKI
5.157e+10 -11.7% 4.554e+10 -0.5% 5.133e+10 perf-stat.i.branch-instructions
1.573e+08 -11.8% 1.387e+08 +0.0% 1.573e+08 perf-stat.i.branch-misses
0.97 +13.1% 1.09 +0.2% 0.97 perf-stat.i.cpi
2.9e+11 -11.7% 2.561e+11 -0.5% 2.887e+11 perf-stat.i.instructions
1.04 -11.7% 0.91 -0.2% 1.03 perf-stat.i.ipc
0.00 ± 2% +17.9% 0.00 +1.4% 0.00 ± 3% perf-stat.overall.MPKI
0.96 +13.2% 1.09 +0.2% 0.97 perf-stat.overall.cpi
1.04 -11.7% 0.92 -0.2% 1.03 perf-stat.overall.ipc
5.14e+10 -11.7% 4.538e+10 -0.5% 5.116e+10 perf-stat.ps.branch-instructions
1.567e+08 -11.8% 1.382e+08 +0.0% 1.568e+08 perf-stat.ps.branch-misses
2.891e+11 -11.7% 2.552e+11 -0.5% 2.877e+11 perf-stat.ps.instructions
8.743e+13 -11.7% 7.724e+13 -0.5% 8.699e+13 perf-stat.total.instructions
7.61 -0.6 7.03 +0.0 7.63 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__poll
6.16 -0.5 5.66 -0.0 6.13 perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__poll
5.11 ± 2% -0.5 4.62 ± 2% +0.3 5.44 perf-profile.calltrace.cycles-pp.testcase
2.92 ± 2% -0.4 2.55 ± 2% -0.1 2.85 perf-profile.calltrace.cycles-pp._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.91 -0.3 2.60 +0.0 2.93 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__poll
1.92 ± 5% -0.3 1.67 ± 4% -0.1 1.84 perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
2.12 -0.2 1.91 -0.0 2.10 perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.32 -0.2 1.17 -0.0 1.30 perf-profile.calltrace.cycles-pp.__kmalloc_noprof.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.84 -0.1 1.72 +0.0 1.85 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__poll
0.98 -0.1 0.88 ± 2% -0.0 0.97 perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
0.97 -0.1 0.88 -0.0 0.94 ± 4% perf-profile.calltrace.cycles-pp.kfree.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.72 -0.1 0.66 -0.0 0.72 perf-profile.calltrace.cycles-pp.__virt_addr_valid.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll
0.62 -0.1 0.57 +0.0 0.62 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
94.36 +0.5 94.89 -0.3 94.03 perf-profile.calltrace.cycles-pp.__poll
75.76 +2.0 77.76 -0.3 75.45 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
71.45 +2.4 73.83 -0.3 71.12 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
69.72 +2.5 72.24 -0.4 69.32 perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
69.19 +2.6 71.77 -0.4 68.80 perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
54.05 +4.1 58.18 -0.2 53.85 perf-profile.calltrace.cycles-pp.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
38.56 +4.5 43.08 -0.2 38.35 perf-profile.calltrace.cycles-pp.fdget.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
7.68 -0.6 7.10 +0.0 7.70 perf-profile.children.cycles-pp.syscall_return_via_sysret
6.61 -0.6 6.06 -0.0 6.59 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
5.12 ± 2% -0.5 4.64 ± 2% +0.3 5.45 perf-profile.children.cycles-pp.testcase
3.15 ± 2% -0.4 2.74 ± 2% -0.1 3.07 perf-profile.children.cycles-pp._copy_from_user
3.70 -0.4 3.33 +0.0 3.72 perf-profile.children.cycles-pp.entry_SYSCALL_64
1.94 ± 4% -0.3 1.69 ± 4% -0.1 1.86 perf-profile.children.cycles-pp.rep_movs_alternative
2.26 -0.2 2.04 -0.0 2.25 perf-profile.children.cycles-pp.__check_object_size
1.35 -0.2 1.19 -0.0 1.33 perf-profile.children.cycles-pp.__kmalloc_noprof
1.04 -0.1 0.94 -0.0 1.04 perf-profile.children.cycles-pp.check_heap_object
0.97 -0.1 0.88 -0.0 0.94 ± 4% perf-profile.children.cycles-pp.kfree
1.07 -0.1 1.00 +0.0 1.08 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.74 -0.1 0.66 -0.0 0.73 perf-profile.children.cycles-pp.__virt_addr_valid
0.57 -0.1 0.50 -0.0 0.56 ± 2% perf-profile.children.cycles-pp.__check_heap_object
0.63 -0.0 0.58 +0.0 0.63 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.22 ± 2% -0.0 0.20 ± 2% +0.0 0.23 ± 3% perf-profile.children.cycles-pp.check_stack_object
0.18 ± 3% -0.0 0.16 -0.0 0.17 ± 2% perf-profile.children.cycles-pp.__cond_resched
0.07 ± 6% -0.0 0.06 -0.0 0.07 ± 5% perf-profile.children.cycles-pp.is_vmalloc_addr
0.13 -0.0 0.12 ± 3% +0.0 0.13 perf-profile.children.cycles-pp.x64_sys_call
0.34 -0.0 0.33 -0.0 0.34 ± 2% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.12 ± 3% -0.0 0.11 -0.0 0.12 ± 3% perf-profile.children.cycles-pp.rcu_all_qs
94.98 +0.5 95.45 -0.3 94.65 perf-profile.children.cycles-pp.__poll
75.89 +2.0 77.89 -0.3 75.58 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
71.52 +2.4 73.89 -0.3 71.19 perf-profile.children.cycles-pp.do_syscall_64
69.78 +2.5 72.29 -0.4 69.38 perf-profile.children.cycles-pp.__x64_sys_poll
69.28 +2.6 71.85 -0.4 68.89 perf-profile.children.cycles-pp.do_sys_poll
54.18 +4.1 58.28 -0.2 53.99 perf-profile.children.cycles-pp.do_poll
38.44 +4.6 43.00 -0.2 38.24 perf-profile.children.cycles-pp.fdget
7.24 -0.6 6.60 -0.1 7.19 perf-profile.self.cycles-pp.do_sys_poll
7.68 -0.6 7.09 +0.0 7.70 perf-profile.self.cycles-pp.syscall_return_via_sysret
16.95 -0.6 16.39 +0.0 16.96 perf-profile.self.cycles-pp.do_poll
6.55 -0.5 6.00 -0.0 6.52 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
4.93 ± 2% -0.5 4.46 ± 2% +0.3 5.26 perf-profile.self.cycles-pp.testcase
4.46 -0.4 4.06 +0.0 4.47 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
3.25 -0.3 2.92 +0.0 3.27 perf-profile.self.cycles-pp.entry_SYSCALL_64
1.78 ± 5% -0.2 1.54 ± 4% -0.1 1.70 perf-profile.self.cycles-pp.rep_movs_alternative
1.34 -0.2 1.18 -0.0 1.33 perf-profile.self.cycles-pp._copy_from_user
1.16 -0.1 1.02 ± 2% -0.0 1.15 perf-profile.self.cycles-pp.__kmalloc_noprof
0.96 -0.1 0.87 -0.0 0.93 ± 4% perf-profile.self.cycles-pp.kfree
0.68 -0.1 0.61 ± 2% -0.0 0.68 perf-profile.self.cycles-pp.__virt_addr_valid
0.56 -0.1 0.50 -0.0 0.55 perf-profile.self.cycles-pp.__check_heap_object
0.43 -0.0 0.39 -0.0 0.43 perf-profile.self.cycles-pp.__x64_sys_poll
0.49 -0.0 0.45 -0.0 0.49 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.29 ± 2% -0.0 0.26 ± 3% +0.0 0.30 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.19 -0.0 0.17 ± 3% +0.0 0.19 ± 2% perf-profile.self.cycles-pp.check_stack_object
0.26 -0.0 0.25 ± 3% -0.0 0.26 ± 2% perf-profile.self.cycles-pp.check_heap_object
0.12 -0.0 0.11 ± 3% +0.0 0.12 perf-profile.self.cycles-pp.x64_sys_call
36.98 +4.6 41.62 -0.2 36.77 perf-profile.self.cycles-pp.fdget
>
> diff --git a/fs/select.c b/fs/select.c
> index b41e2d651cc1..e0c816fa4ec4 100644
> --- a/fs/select.c
> +++ b/fs/select.c
> @@ -875,8 +875,6 @@ static inline __poll_t do_pollfd(struct pollfd *pollfd, poll_table *pwait,
> fdput(f);
>
> out:
> - /* ... and so does ->revents */
> - pollfd->revents = mangle_poll(mask);
> return mask;
> }
>
> @@ -909,6 +907,7 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
> pfd = walk->entries;
> pfd_end = pfd + walk->len;
> for (; pfd != pfd_end; pfd++) {
> + __poll_t mask;
> /*
> * Fish for events. If we found one, record it
> * and kill poll_table->_qproc, so we don't
> @@ -916,8 +915,9 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
> * this. They'll get immediately deregistered
> * when we break out and return.
> */
> - if (do_pollfd(pfd, pt, &can_busy_loop,
> - busy_flag)) {
> + mask = do_pollfd(pfd, pt, &can_busy_loop, busy_flag);
> + pfd->revents = mangle_poll(mask);
> + if (mask) {
> count++;
> pt->_qproc = NULL;
> /* found something, stop busy polling */
>
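To illustrate why the delta above is an equivalent transformation, here is
a small userspace model, not kernel code. It assumes a callee whose every
exit goes through one final return, so the store to ->revents can be moved
into the caller without changing the result. All names here (fake_pollfd,
fake_mask, pollfd_before, pollfd_after, scan_before, scan_after) are
hypothetical stand-ins, and fake_mask only imitates "poll the file and
filter the events":

```c
/* Simplified stand-in for struct pollfd. */
struct fake_pollfd {
	int fd;
	short events;
	short revents;
};

/* Hypothetical readiness check: odd fds report everything asked for. */
unsigned fake_mask(int fd, short events)
{
	return (fd & 1) ? (unsigned)events : 0u;
}

/* Before: the callee stores ->revents on its single exit path. */
unsigned pollfd_before(struct fake_pollfd *p)
{
	unsigned mask = 0;

	if (p->fd < 0)
		goto out;
	mask = fake_mask(p->fd, p->events);
out:
	p->revents = (short)mask;
	return mask;
}

/* After: the callee only returns the mask. */
unsigned pollfd_after(struct fake_pollfd *p)
{
	if (p->fd < 0)
		return 0;
	return fake_mask(p->fd, p->events);
}

/* Caller loop matching the "before" callee. */
int scan_before(struct fake_pollfd *pfds, int n)
{
	int count = 0;

	for (int i = 0; i < n; i++)
		if (pollfd_before(&pfds[i]))
			count++;
	return count;
}

/* Caller loop matching the "after" callee: the store now lives here. */
int scan_after(struct fake_pollfd *pfds, int n)
{
	int count = 0;

	for (int i = 0; i < n; i++) {
		unsigned mask = pollfd_after(&pfds[i]);

		pfds[i].revents = (short)mask;
		if (mask)
			count++;
	}
	return count;
}
```

Both loops leave the same revents values and the same count for any input,
which is consistent with the tested delta scoring within noise of
d000e073ca2a; it says nothing about how the compiler lays out the code.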