Message-ID: <202411291513.ad55672a-lkp@intel.com>
Date: Fri, 29 Nov 2024 15:49:43 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Yin Fengwei <fengwei.yin@...el.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>, Yu Zhao <yuzhao@...gle.com>,
	Ryan Roberts <ryan.roberts@....com>, David Hildenbrand <david@...hat.com>,
	Kefeng Wang <wangkefeng.wang@...wei.com>, Matthew Wilcox <willy@...radead.org>,
	Minchan Kim <minchan@...nel.org>, Vishal Moola <vishal.moola@...il.com>,
	Yang Shi <shy828301@...il.com>, <linux-mm@...ck.org>, <oliver.sang@...el.com>
Subject: [linus:master] [madvise] 2f406263e3: stress-ng.mremap.ops_per_sec
6.7% regression
Hello,
kernel test robot noticed a 6.7% regression of stress-ng.mremap.ops_per_sec on:
commit: 2f406263e3e954aa24c1248edcfa9be0c1bb30fa ("madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[the regression is still present with fix commit cc864ebba5f612ce2960e7e09322a193e8fda0d7]
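
For context, a minimal sketch of the kind of check the commit title describes,
assuming the before/after logic (the helper names follow the folio API of that
kernel series; this is an illustration, not the actual diff):

/*
 * Illustrative only: the sharing check in
 * madvise_cold_or_pageout_pte_range() stops reading the full mapcount of a
 * large folio and uses a cheaper estimate instead.
 */
static bool folio_looks_shared(struct folio *folio)
{
	if (!folio_test_large(folio))
		return folio_mapcount(folio) != 1;

	/*
	 * Before the commit: folio_mapcount(folio) != 1, which walks the
	 * per-page mapcounts of a large folio and is expensive.
	 * After the commit: an estimate such as
	 * folio_estimated_sharers(folio) != 1 (name assumed from the
	 * upstream tree around the time of the commit).
	 */
	return folio_estimated_sharers(folio) != 1;
}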
testcase: stress-ng
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: mremap
cpufreq_governor: performance
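
The "test: mremap" parameter above selects the stress-ng mremap stressor. As a
rough idea of the pattern being exercised, here is a minimal user-space sketch
of an mmap/mremap shrink-and-grow loop (an assumption for illustration; the
real stressor is more elaborate than this):

#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t base = 4UL << 20;	/* 4 MiB anonymous mapping */
	size_t size = base;
	char *buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return 1;

	memset(buf, 0xaa, size);	/* fault the pages in */

	for (int i = 0; i < 1000; i++) {
		/* alternately shrink to half and grow back to full size */
		size_t new_size = (i & 1) ? base : base / 2;
		char *nbuf = mremap(buf, size, new_size, MREMAP_MAYMOVE);

		if (nbuf == MAP_FAILED)
			break;
		buf = nbuf;
		size = new_size;
	}

	munmap(buf, size);
	return 0;
}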
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202411291513.ad55672a-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241129/202411291513.ad55672a-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/mremap/stress-ng/60s
commit:
6867c7a332 ("mm: multi-gen LRU: don't spin during memcg release")
2f406263e3 ("madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check")
6867c7a3320669cb 2f406263e3e954aa24c1248edcf
---------------- ---------------------------
%stddev %change %stddev
\ | \
36.80 ± 7% +4.1 40.91 mpstat.cpu.all.sys%
325.67 ± 44% +119.1% 713.67 ± 13% perf-c2c.HITM.local
63.83 ± 67% +175.7% 176.00 ± 20% perf-c2c.HITM.remote
9.59 ± 19% -36.7% 6.07 ± 31% perf-sched.sch_delay.avg.ms.__cond_resched.generic_perform_write.generic_file_write_iter.vfs_write.ksys_write
0.02 ± 9% +48.0% 0.03 ± 30% perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
0.01 ± 3% +73.9% 0.03 ± 20% perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
936.50 ± 27% +49.9% 1403 ± 9% perf-sched.wait_and_delay.count.__cond_resched.shrink_folio_list.reclaim_folio_list.reclaim_pages.madvise_cold_or_pageout_pte_range
374720 ± 2% -6.7% 349433 stress-ng.mremap.ops
6245 ± 2% -6.7% 5823 stress-ng.mremap.ops_per_sec
2.353e+08 ± 2% -6.8% 2.194e+08 stress-ng.time.minor_page_faults
2213 ± 4% -7.0% 2057 stress-ng.time.user_time
2.22e+08 ± 2% -6.8% 2.069e+08 proc-vmstat.numa_hit
2.219e+08 ± 2% -6.8% 2.067e+08 proc-vmstat.numa_local
4.117e+08 ± 2% -6.7% 3.842e+08 proc-vmstat.pgalloc_normal
2.357e+08 ± 2% -6.7% 2.198e+08 proc-vmstat.pgfault
4.115e+08 ± 2% -6.7% 3.84e+08 proc-vmstat.pgfree
350460 ± 2% -6.8% 326755 proc-vmstat.thp_deferred_split_page
374783 ± 2% -6.7% 349496 proc-vmstat.thp_fault_alloc
24286 ± 2% +278.1% 91836 ± 39% proc-vmstat.thp_split_page
374810 ± 2% -6.7% 349527 proc-vmstat.thp_split_pmd
24286 ± 2% -6.5% 22708 proc-vmstat.thp_swpout_fallback
1.69e+09 ± 2% -6.1% 1.587e+09 perf-stat.i.cache-references
4.37 +1.7% 4.44 perf-stat.i.cpi
203.06 ± 3% -12.2% 178.34 ± 4% perf-stat.i.cpu-migrations
4.438e+10 -1.5% 4.372e+10 perf-stat.i.instructions
0.23 -1.6% 0.23 perf-stat.i.ipc
4.38 +1.7% 4.46 perf-stat.overall.cpi
171.29 ± 4% +5.7% 180.97 perf-stat.overall.cycles-between-cache-misses
0.23 -1.7% 0.22 perf-stat.overall.ipc
1.664e+09 ± 2% -6.2% 1.562e+09 perf-stat.ps.cache-references
199.46 ± 3% -12.3% 174.85 ± 5% perf-stat.ps.cpu-migrations
4.368e+10 -1.5% 4.3e+10 perf-stat.ps.instructions
2.688e+12 -1.9% 2.637e+12 perf-stat.total.instructions
7.77 ± 2% -0.3 7.46 ± 3% perf-profile.calltrace.cycles-pp.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.__get_user_pages.populate_vma_page_range
7.63 ± 2% -0.3 7.32 ± 3% perf-profile.calltrace.cycles-pp.clear_huge_page.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.__get_user_pages
7.26 ± 2% -0.3 6.98 ± 3% perf-profile.calltrace.cycles-pp.clear_page_erms.clear_huge_page.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault
0.26 ±100% +0.7 0.92 ± 20% perf-profile.calltrace.cycles-pp.folio_isolate_lru.madvise_cold_or_pageout_pte_range.walk_pmd_range.walk_pud_range.walk_p4d_range
0.00 +0.8 0.78 ± 22% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.folio_lruvec_lock_irq.folio_isolate_lru.madvise_cold_or_pageout_pte_range
0.00 +0.8 0.81 ± 22% perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.folio_lruvec_lock_irq.folio_isolate_lru.madvise_cold_or_pageout_pte_range.walk_pmd_range
0.00 +0.8 0.82 ± 21% perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irq.folio_isolate_lru.madvise_cold_or_pageout_pte_range.walk_pmd_range.walk_pud_range
7.70 ± 2% -0.3 7.38 ± 3% perf-profile.children.cycles-pp.clear_huge_page
7.77 ± 2% -0.3 7.46 ± 3% perf-profile.children.cycles-pp.__do_huge_pmd_anonymous_page
0.10 ± 4% -0.0 0.08 perf-profile.children.cycles-pp.__call_rcu_common
0.12 ± 4% -0.0 0.10 ± 3% perf-profile.children.cycles-pp.vm_normal_page
0.24 ± 9% +0.1 0.29 ± 5% perf-profile.children.cycles-pp.folio_add_lru
0.16 ± 3% +0.1 0.22 ± 13% perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
0.06 ± 17% +0.1 0.18 ± 35% perf-profile.children.cycles-pp.__free_one_page
0.07 ± 10% +0.1 0.19 ± 33% perf-profile.children.cycles-pp.page_counter_uncharge
0.77 ± 6% +0.1 0.89 ± 11% perf-profile.children.cycles-pp._raw_spin_lock
0.26 ± 5% +0.1 0.40 ± 13% perf-profile.children.cycles-pp.free_unref_page_list
0.08 ± 8% +0.1 0.23 ± 30% perf-profile.children.cycles-pp.uncharge_batch
0.34 ± 20% +0.1 0.49 ± 19% perf-profile.children.cycles-pp.get_swap_pages
0.08 ± 11% +0.2 0.24 ± 40% perf-profile.children.cycles-pp.free_pcppages_bulk
0.00 +0.2 0.16 ± 50% perf-profile.children.cycles-pp.__mem_cgroup_uncharge
0.43 ± 7% +0.4 0.82 ± 21% perf-profile.children.cycles-pp.folio_lruvec_lock_irq
0.42 ± 7% +0.4 0.82 ± 22% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.49 ± 6% +0.4 0.92 ± 20% perf-profile.children.cycles-pp.folio_isolate_lru
0.11 ± 7% +0.6 0.73 ± 52% perf-profile.children.cycles-pp.madvise_cold
0.00 +0.8 0.76 ± 56% perf-profile.children.cycles-pp.__page_cache_release
0.00 +0.9 0.89 ± 57% perf-profile.children.cycles-pp.__folio_put
1.23 ± 10% +1.0 2.22 ± 26% perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
0.11 ± 12% +1.1 1.17 ± 53% perf-profile.children.cycles-pp.__split_huge_page
0.12 ± 11% +1.2 1.31 ± 56% perf-profile.children.cycles-pp.split_huge_page_to_list
0.11 ± 4% -0.0 0.09 ± 5% perf-profile.self.cycles-pp.do_vmi_align_munmap
0.12 ± 3% +0.0 0.14 ± 5% perf-profile.self.cycles-pp.madvise_cold_or_pageout_pte_range
0.15 ± 5% +0.1 0.21 ± 12% perf-profile.self.cycles-pp.mem_cgroup_update_lru_size
0.39 ± 15% +0.1 0.49 ± 4% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.06 ± 13% +0.1 0.18 ± 34% perf-profile.self.cycles-pp.__free_one_page
0.06 ± 11% +0.1 0.18 ± 33% perf-profile.self.cycles-pp.page_counter_uncharge
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki