Message-ID: <202406201010.a1344783-oliver.sang@intel.com>
Date: Thu, 20 Jun 2024 10:39:50 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>, "Huang, Ying"
<ying.huang@...el.com>, David Hildenbrand <david@...hat.com>, John Hubbard
<jhubbard@...dia.com>, Kefeng Wang <wangkefeng.wang@...wei.com>, Mel Gorman
<mgorman@...hsingularity.net>, Ryan Roberts <ryan.roberts@....com>,
<linux-mm@...ck.org>, <feng.tang@...el.com>, <fengwei.yin@...el.com>,
<oliver.sang@...el.com>
Subject: [linus:master] [mm] d2136d749d: vm-scalability.throughput -7.1%
regression
Hello,
kernel test robot noticed a -7.1% regression of vm-scalability.throughput on:
commit: d2136d749d76af980b3accd72704eea4eab625bd ("mm: support multi-size THP numa balancing")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[the regression is still present on linus/master 92e5605a199efbaee59fb19e15d6cc2103a04ec2]
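
For context, the commit in question roughly lifts the rule that NUMA balancing
skips multi-size THP: a hint fault in an exclusively mapped mTHP may now migrate
the whole folio, and the fault handler rebuilds the mappings of every PTE of the
large folio that falls inside the VMA, not just the faulting PTE, to avoid
successive partial faults. Below is a minimal userspace model of that range
computation only; the struct, constants and addresses are invented for
illustration, and this is not the kernel code from the commit:

/* Model of clamping a large folio's page range to its VMA before
 * rebuilding the per-page mappings, in the spirit of what the mTHP
 * NUMA-balancing change does (simplified). Everything here is
 * hypothetical; the real logic lives in mm/memory.c. */
#include <stdio.h>

#define PAGE_SIZE 4096UL

struct vma_model { unsigned long vm_start, vm_end; };

int main(void)
{
    struct vma_model vma = { .vm_start = 0x10000, .vm_end = 0x50000 };
    unsigned long fault_addr = 0x22000; /* address that took the hint fault */
    unsigned long nr_pages = 16;        /* e.g. a 64K mTHP = 16 base pages */
    unsigned long idx = 3;              /* faulting page's index in the folio */

    /* The folio may be only partially covered by this VMA, so clamp. */
    unsigned long start = fault_addr - idx * PAGE_SIZE;
    unsigned long end = start + nr_pages * PAGE_SIZE;
    if (start < vma.vm_start)
        start = vma.vm_start;
    if (end > vma.vm_end)
        end = vma.vm_end;

    /* Every base page in [start, end) gets its mapping rebuilt,
     * not only the single page that faulted. */
    for (unsigned long addr = start; addr < end; addr += PAGE_SIZE)
        printf("rebuild PTE at %#lx\n", addr);
    return 0;
}

Batching the neighbours this way trades some extra work per hint fault for fewer
follow-up faults on the rest of the folio.
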
testcase: vm-scalability
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
parameters:

        runtime: 300s
        size: 512G
        test: anon-cow-rand-hugetlb
        cpufreq_governor: performance
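
For readers unfamiliar with the workload: anon-cow-rand-hugetlb populates
anonymous hugetlb memory, forks, and then writes to random pages, so the run is
dominated by copy-on-write faults on huge pages. Below is a rough,
self-contained model of that access pattern; it is not the vm-scalability
source, and the 2M hugepage size, region size and iteration count are
placeholders (the real test scales to size: 512G across the machine):

/* Rough model of anon-cow-rand-hugetlb: populate anonymous hugetlb
 * memory, fork, and let the child dirty random huge pages so every
 * touch is a copy-on-write fault. Needs preallocated hugepages,
 * e.g. via /proc/sys/vm/nr_hugepages. */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

#define HPAGE_SIZE (2UL << 20)  /* assume 2M hugepages */
#define NR_HPAGES  64           /* 128M working set, illustrative only */

int main(void)
{
    size_t len = NR_HPAGES * HPAGE_SIZE;
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (buf == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");
        return 1;
    }
    memset(buf, 1, len);                /* populate in the parent */

    if (fork() == 0) {                  /* child: random CoW writes */
        srand(getpid());
        for (int i = 0; i < (1 << 20); i++)
            buf[((size_t)rand() % NR_HPAGES) * HPAGE_SIZE]++;
        _exit(0);
    }
    wait(NULL);
    return 0;
}
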
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202406201010.a1344783-oliver.sang@intel.com
Details are below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240620/202406201010.a1344783-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/300s/512G/lkp-icl-2sp2/anon-cow-rand-hugetlb/vm-scalability
commit:
6b0ed7b3c7 ("mm: factor out the numa mapping rebuilding into a new helper")
d2136d749d ("mm: support multi-size THP numa balancing")
6b0ed7b3c77547d2            d2136d749d76af980b3accd7270
----------------            ---------------------------
         %stddev      %change        %stddev
             \            |              \
12.02 -1.3 10.72 ± 4% mpstat.cpu.all.sys%
1228757 +3.0% 1265679 proc-vmstat.pgfault
7392513 -7.1% 6865649 vm-scalability.throughput
17356 +9.4% 18986 vm-scalability.time.user_time
0.32 ± 22% -36.9% 0.20 ± 17% sched_debug.cfs_rq:/.h_nr_running.stddev
28657 ± 86% -90.8% 2640 ± 19% sched_debug.cfs_rq:/.load.stddev
0.28 ± 35% -52.1% 0.13 ± 29% sched_debug.cfs_rq:/.nr_running.stddev
299.88 ± 27% -39.6% 181.04 ± 23% sched_debug.cfs_rq:/.runnable_avg.stddev
284.88 ± 32% -44.0% 159.65 ± 27% sched_debug.cfs_rq:/.util_avg.stddev
0.32 ± 22% -37.2% 0.20 ± 17% sched_debug.cpu.nr_running.stddev
1.584e+10 ± 2% -6.9% 1.476e+10 ± 3% perf-stat.i.branch-instructions
11673151 ± 3% -6.3% 10935072 ± 4% perf-stat.i.branch-misses
4.90 +3.5% 5.07 perf-stat.i.cpi
333.40 +7.5% 358.32 perf-stat.i.cycles-between-cache-misses
6.787e+10 ± 2% -6.8% 6.324e+10 ± 3% perf-stat.i.instructions
0.25 -6.2% 0.24 perf-stat.i.ipc
4.19 +7.5% 4.51 perf-stat.overall.cpi
323.02 +7.4% 346.94 perf-stat.overall.cycles-between-cache-misses
0.24 -7.0% 0.22 perf-stat.overall.ipc
1.549e+10 ± 2% -6.8% 1.444e+10 ± 3% perf-stat.ps.branch-instructions
6.634e+10 ± 2% -6.7% 6.186e+10 ± 3% perf-stat.ps.instructions
17.33 ± 77% -10.6 6.72 ±169% perf-profile.calltrace.cycles-pp.asm_exc_page_fault.do_access
17.30 ± 77% -10.6 6.71 ±169% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.do_access
17.30 ± 77% -10.6 6.71 ±169% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
17.28 ± 77% -10.6 6.70 ±169% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
17.27 ± 77% -10.6 6.70 ±169% perf-profile.calltrace.cycles-pp.hugetlb_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
13.65 ± 76% -8.4 5.29 ±168% perf-profile.calltrace.cycles-pp.hugetlb_wp.hugetlb_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
13.37 ± 76% -8.2 5.18 ±168% perf-profile.calltrace.cycles-pp.copy_user_large_folio.hugetlb_wp.hugetlb_fault.handle_mm_fault.do_user_addr_fault
13.35 ± 76% -8.2 5.18 ±168% perf-profile.calltrace.cycles-pp.copy_subpage.copy_user_large_folio.hugetlb_wp.hugetlb_fault.handle_mm_fault
13.23 ± 76% -8.1 5.13 ±168% perf-profile.calltrace.cycles-pp.copy_mc_enhanced_fast_string.copy_subpage.copy_user_large_folio.hugetlb_wp.hugetlb_fault
3.59 ± 78% -2.2 1.39 ±169% perf-profile.calltrace.cycles-pp.__mutex_lock.hugetlb_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
17.35 ± 77% -10.6 6.73 ±169% perf-profile.children.cycles-pp.asm_exc_page_fault
17.32 ± 77% -10.6 6.72 ±168% perf-profile.children.cycles-pp.do_user_addr_fault
17.32 ± 77% -10.6 6.72 ±168% perf-profile.children.cycles-pp.exc_page_fault
17.30 ± 77% -10.6 6.71 ±168% perf-profile.children.cycles-pp.handle_mm_fault
17.28 ± 77% -10.6 6.70 ±169% perf-profile.children.cycles-pp.hugetlb_fault
13.65 ± 76% -8.4 5.29 ±168% perf-profile.children.cycles-pp.hugetlb_wp
13.37 ± 76% -8.2 5.18 ±168% perf-profile.children.cycles-pp.copy_user_large_folio
13.35 ± 76% -8.2 5.18 ±168% perf-profile.children.cycles-pp.copy_subpage
13.34 ± 76% -8.2 5.17 ±168% perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
3.59 ± 78% -2.2 1.39 ±169% perf-profile.children.cycles-pp.__mutex_lock
13.24 ± 76% -8.1 5.13 ±168% perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
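
As a sanity check on how to read the tables, assuming the %change column is
(new - old) / old and that perf-stat.overall.ipc is the reciprocal of
overall.cpi, both headline figures reproduce:

/* Recompute the headline figures from the tables above. */
#include <stdio.h>

int main(void)
{
    /* vm-scalability.throughput: (6865649 - 7392513) / 7392513 */
    printf("throughput change: %+.1f%%\n",
           (6865649.0 - 7392513.0) / 7392513.0 * 100.0);   /* -7.1% */
    /* perf-stat.overall.ipc = 1 / overall.cpi */
    printf("ipc: %.2f -> %.2f\n", 1.0 / 4.19, 1.0 / 4.51); /* 0.24 -> 0.22 */
    return 0;
}
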
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki