Message-ID: <20211203020952.GB5881@xsang-OptiPlex-9020>
Date: Fri, 3 Dec 2021 10:09:52 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Nadav Amit <namit@...are.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...ux.intel.com, fengwei.yin@...el.com
Subject: [hugetlbfs] a4a118f2ee: will-it-scale.per_thread_ops -14.9%
regression
Greetings,
FYI, we noticed a -14.9% regression of will-it-scale.per_thread_ops due to commit:
commit: a4a118f2eead1d6c49e00765de89878288d4b890 ("hugetlbfs: flush TLBs correctly after huge_pmd_unshare")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 104-thread, 2-socket Skylake with 192G memory
with following parameters:
nr_task: 100%
mode: thread
test: context_switch1
cpufreq_governor: performance
ucode: 0x2006a0a
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove ~/.lkp and the /lkp dir to run from a clean state.
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/thread/100%/debian-10.4-x86_64-20200603.cgz/lkp-skl-fpga01/context_switch1/will-it-scale/0x2006a0a
commit:
v5.16-rc2
a4a118f2ee ("hugetlbfs: flush TLBs correctly after huge_pmd_unshare")
       v5.16-rc2        a4a118f2eead1d6c49e00765de8
  ----------------      ---------------------------
  %stddev   %change   %stddev
22094930 -14.9% 18801170 will-it-scale.104.threads
212450 -14.9% 180780 will-it-scale.per_thread_ops
22094930 -14.9% 18801170 will-it-scale.workload
104.51 +6.4% 111.15 turbostat.RAMWatt
21864416 -14.9% 18613340 vmstat.system.cs
1.61 ± 14% +42.6% 2.29 ± 11% perf-stat.i.MPKI
3.726e+10 -13.5% 3.224e+10 perf-stat.i.branch-instructions
5.173e+08 -14.1% 4.441e+08 perf-stat.i.branch-misses
1.71 ± 14% +8.5 10.23 ± 7% perf-stat.i.cache-miss-rate%
4566699 ± 12% +689.0% 36029296 ± 4% perf-stat.i.cache-misses
22042272 -14.9% 18767811 perf-stat.i.context-switches
1.52 +16.1% 1.76 perf-stat.i.cpi
170640 ± 18% -95.0% 8502 ± 4% perf-stat.i.cycles-between-cache-misses
44430650 -14.6% 37926361 perf-stat.i.dTLB-load-misses
5.32e+10 -13.6% 4.594e+10 perf-stat.i.dTLB-loads
0.00 ± 4% +0.0 0.00 ± 10% perf-stat.i.dTLB-store-miss-rate%
3.23e+10 -13.7% 2.786e+10 perf-stat.i.dTLB-stores
68025283 -21.9% 53120420 ± 2% perf-stat.i.iTLB-load-misses
1.836e+11 -13.5% 1.589e+11 perf-stat.i.instructions
2820 +9.5% 3089 ± 2% perf-stat.i.instructions-per-iTLB-miss
0.66 -13.2% 0.57 perf-stat.i.ipc
1183 -13.5% 1023 perf-stat.i.metric.M/sec
274656 ± 40% +535.1% 1744238 ± 8% perf-stat.i.node-load-misses
1.59 ± 13% +41.3% 2.25 ± 11% perf-stat.overall.MPKI
1.59 ± 16% +8.6 10.18 ± 8% perf-stat.overall.cache-miss-rate%
1.51 +15.3% 1.74 perf-stat.overall.cpi
61473 ± 10% -87.5% 7707 ± 4% perf-stat.overall.cycles-between-cache-misses
0.08 -0.0 0.08 perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 4% +0.0 0.00 ± 11% perf-stat.overall.dTLB-store-miss-rate%
2700 ± 2% +10.8% 2992 ± 2% perf-stat.overall.instructions-per-iTLB-miss
0.66 -13.3% 0.57 perf-stat.overall.ipc
32.91 ± 37% +37.6 70.48 ± 5% perf-stat.overall.node-load-miss-rate%
2504472 +1.7% 2546759 perf-stat.overall.path-length
3.714e+10 -13.5% 3.214e+10 perf-stat.ps.branch-instructions
5.156e+08 -14.1% 4.427e+08 perf-stat.ps.branch-misses
4556813 ± 12% +687.7% 35896229 ± 4% perf-stat.ps.cache-misses
21967784 -14.8% 18706255 perf-stat.ps.context-switches
44284414 -14.6% 37805127 perf-stat.ps.dTLB-load-misses
5.302e+10 -13.6% 4.58e+10 perf-stat.ps.dTLB-loads
3.219e+10 -13.7% 2.777e+10 perf-stat.ps.dTLB-stores
67799006 -21.9% 52946940 ± 2% perf-stat.ps.iTLB-load-misses
1.83e+11 -13.5% 1.584e+11 perf-stat.ps.instructions
274060 ± 40% +534.0% 1737650 ± 8% perf-stat.ps.node-load-misses
5.534e+13 -13.5% 4.788e+13 perf-stat.total.instructions
29.33 -0.8 28.53 perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_write.new_sync_write.vfs_write.ksys_write
28.26 -0.8 27.48 perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write
28.70 -0.8 27.93 perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_write.new_sync_write.vfs_write
28.51 -0.8 27.76 perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write.new_sync_write
32.31 -0.5 31.76 perf-profile.calltrace.cycles-pp.pipe_write.new_sync_write.vfs_write.ksys_write.do_syscall_64
33.10 -0.5 32.56 perf-profile.calltrace.cycles-pp.new_sync_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
12.74 -0.5 12.20 perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.pipe_read.new_sync_read
14.03 -0.5 13.50 perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
13.95 -0.5 13.42 perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
34.07 -0.4 33.64 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
1.04 ± 2% +0.1 1.16 ± 3% perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.new_sync_read.vfs_read.ksys_read
0.68 ± 4% +0.1 0.81 ± 6% perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.82 +0.2 1.04 ± 2% perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.pipe_read.new_sync_read.vfs_read
1.00 +0.3 1.32 ± 3% perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.new_sync_read.vfs_read.ksys_read
37.78 +0.3 38.13 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
38.34 +0.4 38.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read
1.38 ± 3% -0.9 0.51 ± 5% perf-profile.children.cycles-pp.__task_pid_nr_ns
1.53 ± 3% -0.9 0.67 ± 5% perf-profile.children.cycles-pp.perf_event_pid_type
2.36 ± 3% -0.8 1.55 ± 4% perf-profile.children.cycles-pp.__perf_event_header__init_id
29.35 -0.8 28.55 perf-profile.children.cycles-pp.__wake_up_common_lock
28.28 -0.8 27.50 perf-profile.children.cycles-pp.try_to_wake_up
28.70 -0.8 27.94 perf-profile.children.cycles-pp.__wake_up_common
28.52 -0.8 27.77 perf-profile.children.cycles-pp.autoremove_wake_function
33.12 -0.5 32.58 perf-profile.children.cycles-pp.new_sync_write
32.35 -0.5 31.80 perf-profile.children.cycles-pp.pipe_write
12.75 -0.5 12.21 perf-profile.children.cycles-pp.dequeue_task_fair
13.96 -0.5 13.43 perf-profile.children.cycles-pp.enqueue_task_fair
14.03 -0.5 13.50 perf-profile.children.cycles-pp.ttwu_do_activate
34.08 -0.4 33.66 perf-profile.children.cycles-pp.vfs_write
0.12 ± 5% -0.0 0.08 ± 3% perf-profile.children.cycles-pp.fput
0.12 ± 3% -0.0 0.10 ± 6% perf-profile.children.cycles-pp.child
0.37 ± 2% -0.0 0.35 ± 2% perf-profile.children.cycles-pp.tick_sched_handle
0.10 ± 5% +0.0 0.12 ± 5% perf-profile.children.cycles-pp.__list_add_valid
0.20 ± 3% +0.0 0.23 ± 4% perf-profile.children.cycles-pp.make_kgid
0.09 ± 6% +0.0 0.12 ± 3% perf-profile.children.cycles-pp.clear_buddies
0.13 ± 5% +0.0 0.17 ± 5% perf-profile.children.cycles-pp.local_clock
0.05 ± 5% +0.0 0.08 ± 7% perf-profile.children.cycles-pp.rb_insert_color
0.11 ± 4% +0.0 0.14 ± 3% perf-profile.children.cycles-pp.check_cfs_rq_runtime
0.28 ± 3% +0.0 0.31 ± 3% perf-profile.children.cycles-pp.map_id_range_down
0.48 ± 3% +0.0 0.53 ± 3% perf-profile.children.cycles-pp.__might_sleep
0.35 ± 4% +0.1 0.40 ± 3% perf-profile.children.cycles-pp.__might_fault
0.83 ± 2% +0.1 0.88 ± 2% perf-profile.children.cycles-pp.set_next_entity
0.00 +0.1 0.06 ± 6% perf-profile.children.cycles-pp.default_wake_function
0.51 ± 3% +0.1 0.62 ± 3% perf-profile.children.cycles-pp.pick_next_entity
0.15 ± 6% +0.1 0.26 ± 9% perf-profile.children.cycles-pp.timestamp_truncate
0.40 ± 7% +0.1 0.52 ± 10% perf-profile.children.cycles-pp.file_update_time
1.07 ± 2% +0.1 1.19 ± 2% perf-profile.children.cycles-pp.copy_page_to_iter
0.00 +0.1 0.12 ± 34% perf-profile.children.cycles-pp.__mark_inode_dirty
0.00 +0.1 0.12 ± 32% perf-profile.children.cycles-pp.generic_update_time
1.28 ± 3% +0.2 1.45 ± 4% perf-profile.children.cycles-pp.security_file_permission
0.86 +0.2 1.06 ± 2% perf-profile.children.cycles-pp.atime_needs_update
2.51 +0.3 2.78 ± 2% perf-profile.children.cycles-pp.pick_next_task_fair
1.00 +0.3 1.32 ± 3% perf-profile.children.cycles-pp.touch_atime
1.37 ± 3% -0.9 0.50 ± 5% perf-profile.self.cycles-pp.__task_pid_nr_ns
1.23 ± 4% -0.4 0.86 ± 6% perf-profile.self.cycles-pp.update_curr
0.32 ± 3% -0.0 0.27 ± 3% perf-profile.self.cycles-pp.schedule
0.20 ± 6% -0.0 0.16 ± 7% perf-profile.self.cycles-pp.current_time
0.12 ± 2% -0.0 0.10 ± 6% perf-profile.self.cycles-pp.child
0.06 ± 6% +0.0 0.07 ± 5% perf-profile.self.cycles-pp.__might_fault
0.13 ± 3% +0.0 0.14 ± 3% perf-profile.self.cycles-pp.__cond_resched
0.12 ± 3% +0.0 0.14 ± 4% perf-profile.self.cycles-pp.put_prev_entity
0.12 ± 4% +0.0 0.14 ± 6% perf-profile.self.cycles-pp.touch_atime
0.08 ± 6% +0.0 0.10 ± 3% perf-profile.self.cycles-pp.clear_buddies
0.17 ± 4% +0.0 0.20 ± 6% perf-profile.self.cycles-pp.ksys_write
0.06 ± 7% +0.0 0.09 ± 4% perf-profile.self.cycles-pp.check_cfs_rq_runtime
0.05 +0.0 0.08 ± 7% perf-profile.self.cycles-pp.rb_insert_color
0.26 ± 4% +0.0 0.29 ± 3% perf-profile.self.cycles-pp.map_id_range_down
0.12 ± 7% +0.0 0.16 ± 5% perf-profile.self.cycles-pp.local_clock
0.17 ± 3% +0.0 0.21 ± 3% perf-profile.self.cycles-pp.set_next_entity
0.41 ± 4% +0.0 0.45 ± 3% perf-profile.self.cycles-pp.__might_sleep
0.00 +0.1 0.06 ± 5% perf-profile.self.cycles-pp.default_wake_function
0.37 ± 3% +0.1 0.46 ± 13% perf-profile.self.cycles-pp.vfs_write
0.39 ± 4% +0.1 0.49 ± 6% perf-profile.self.cycles-pp.new_sync_read
0.38 ± 3% +0.1 0.48 ± 3% perf-profile.self.cycles-pp.pick_next_entity
0.14 ± 7% +0.1 0.25 ± 10% perf-profile.self.cycles-pp.timestamp_truncate
0.00 +0.1 0.12 ± 34% perf-profile.self.cycles-pp.__mark_inode_dirty
0.86 ± 3% +0.1 1.00 ± 4% perf-profile.self.cycles-pp.pipe_write
0.36 ± 4% +0.1 0.50 ± 8% perf-profile.self.cycles-pp.vfs_read
0.26 ± 5% +0.1 0.41 ± 5% perf-profile.self.cycles-pp.atime_needs_update
0.19 ± 11% +0.2 0.35 ± 14% perf-profile.self.cycles-pp.security_file_permission
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure                          Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org              Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.16.0-rc2-00001-ga4a118f2eead" of type "text/plain" (173517 bytes)
View attachment "job-script" of type "text/plain" (7850 bytes)
View attachment "job.yaml" of type "text/plain" (5269 bytes)
View attachment "reproduce" of type "text/plain" (347 bytes)