Message-ID: <202512101408.af3876df-lkp@intel.com>
Date: Wed, 10 Dec 2025 14:15:49 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Shakeel Butt <shakeel.butt@...ux.dev>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>, Harry Yoo <harry.yoo@...cle.com>,
	Roman Gushchin <roman.gushchin@...ux.dev>, Vlastimil Babka <vbabka@...e.cz>,
	Johannes Weiner <hannes@...xchg.org>, Michal Hocko <mhocko@...nel.org>,
	Muchun Song <muchun.song@...ux.dev>, Qi Zheng <zhengqi.arch@...edance.com>,
	<cgroups@...r.kernel.org>, <linux-mm@...ck.org>, <oliver.sang@...el.com>
Subject: [linus:master] [memcg]  7e44d00a13:  will-it-scale.per_thread_ops
 2.6% regression



Hello,

kernel test robot noticed a 2.6% regression of will-it-scale.per_thread_ops on:


commit: 7e44d00a13ca5691caf4f7c46541ee60bf75b208 ("memcg: use mod_node_page_state to update stats")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[regression still present on linux-next/master 6987d58a9cbc5bd57c983baa514474a86c945d56]

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_task: 100%
	mode: thread
	test: page_fault2
	cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202512101408.af3876df-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251210/202512101408.af3876df-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/thread/100%/debian-13-x86_64-20250902.cgz/lkp-icl-2sp7/page_fault2/will-it-scale

commit: 
  3e700b715e ("selftests/mm: gup_test: fix comment regarding origin of FOLL_WRITE")
  7e44d00a13 ("memcg: use mod_node_page_state to update stats")

3e700b715e1cef66 7e44d00a13ca5691caf4f7c4654 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   3453930            -2.6%    3363916        will-it-scale.64.threads
     53967            -2.6%      52560        will-it-scale.per_thread_ops
   3453930            -2.6%    3363916        will-it-scale.workload
 1.053e+09            -2.6%  1.025e+09        proc-vmstat.numa_hit
 1.052e+09            -2.6%  1.025e+09        proc-vmstat.numa_local
  1.05e+09            -2.6%  1.023e+09        proc-vmstat.pgalloc_normal
 1.045e+09            -2.6%  1.018e+09        proc-vmstat.pgfault
  1.05e+09            -2.6%  1.023e+09        proc-vmstat.pgfree
 3.452e+09            -2.0%  3.383e+09        perf-stat.i.branch-instructions
      0.45            +0.0        0.46        perf-stat.i.branch-miss-rate%
 4.559e+08            -2.5%  4.446e+08        perf-stat.i.cache-misses
 4.696e+08            -2.5%   4.58e+08        perf-stat.i.cache-references
  3.88e+10            -2.4%  3.787e+10        perf-stat.i.cpu-cycles
 1.741e+10            -1.5%  1.715e+10        perf-stat.i.instructions
    107.43            -2.5%     104.76        perf-stat.i.metric.K/sec
   3437960            -2.5%    3352362        perf-stat.i.minor-faults
   3437961            -2.5%    3352362        perf-stat.i.page-faults
     26.18           -34.0%      17.29 ± 70%  perf-stat.overall.MPKI
 3.441e+09           -34.7%  2.247e+09 ± 70%  perf-stat.ps.branch-instructions
 4.544e+08           -35.0%  2.953e+08 ± 70%  perf-stat.ps.cache-misses
  4.68e+08           -35.0%  3.042e+08 ± 70%  perf-stat.ps.cache-references
 3.867e+10           -34.9%  2.517e+10 ± 70%  perf-stat.ps.cpu-cycles
 1.736e+10           -34.4%  1.139e+10 ± 70%  perf-stat.ps.instructions
   3426140           -35.0%    2226448 ± 70%  perf-stat.ps.minor-faults
   3426140           -35.0%    2226448 ± 70%  perf-stat.ps.page-faults
 5.293e+12           -34.4%  3.471e+12 ± 70%  perf-stat.total.instructions
     92.62            -0.2       92.40        perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
      1.39            +0.0        1.42        perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu
      1.89            +0.0        1.92        perf-profile.calltrace.cycles-pp.lru_add.folio_batch_move_lru.__folio_batch_add_and_move.set_pte_range.finish_fault
      0.91            +0.0        0.95        perf-profile.calltrace.cycles-pp.lru_gen_del_folio.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages
      1.30            +0.0        1.35        perf-profile.calltrace.cycles-pp.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range
      2.38            +0.0        2.43        perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
      1.36            +0.1        1.41        perf-profile.calltrace.cycles-pp.lru_gen_add_folio.lru_add.folio_batch_move_lru.__folio_batch_add_and_move.set_pte_range
      4.49            +0.2        4.64        perf-profile.calltrace.cycles-pp.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      7.62            +0.2        7.79        perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap
      7.62            +0.2        7.79        perf-profile.calltrace.cycles-pp.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap
      7.62            +0.2        7.79        perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas
      7.61            +0.2        7.78        perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.vms_clear_ptes
      8.17            +0.2        8.34        perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      8.17            +0.2        8.34        perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      8.14            +0.2        8.31        perf-profile.calltrace.cycles-pp.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
      8.14            +0.2        8.32        perf-profile.calltrace.cycles-pp.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
      8.17            +0.2        8.34        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      8.17            +0.2        8.34        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
      8.17            +0.2        8.34        perf-profile.calltrace.cycles-pp.__munmap
      8.15            +0.2        8.32        perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
      8.15            +0.2        8.32        perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
     21.20            +0.3       21.46        perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
     90.18            -0.2       89.94        perf-profile.children.cycles-pp.testcase
     86.53            -0.2       86.34        perf-profile.children.cycles-pp.asm_exc_page_fault
      1.58            +0.0        1.63        perf-profile.children.cycles-pp.__page_cache_release
      1.39            +0.0        1.44        perf-profile.children.cycles-pp.lru_gen_add_folio
      1.07            +0.0        1.12        perf-profile.children.cycles-pp.lru_gen_del_folio
      1.33            +0.0        1.38        perf-profile.children.cycles-pp.folio_remove_rmap_ptes
      1.42            +0.1        1.48 ±  2%  perf-profile.children.cycles-pp.__lruvec_stat_mod_folio
      0.36 ±  2%      +0.1        0.42        perf-profile.children.cycles-pp.__mod_lruvec_state
      3.08            +0.1        3.16        perf-profile.children.cycles-pp.folios_put_refs
      4.56            +0.1        4.71        perf-profile.children.cycles-pp.zap_present_ptes
      7.64            +0.2        7.80        perf-profile.children.cycles-pp.unmap_page_range
      7.64            +0.2        7.80        perf-profile.children.cycles-pp.unmap_vmas
      7.64            +0.2        7.80        perf-profile.children.cycles-pp.zap_pmd_range
      7.64            +0.2        7.80        perf-profile.children.cycles-pp.zap_pte_range
      8.17            +0.2        8.34        perf-profile.children.cycles-pp.__x64_sys_munmap
      8.14            +0.2        8.31        perf-profile.children.cycles-pp.vms_clear_ptes
      8.17            +0.2        8.34        perf-profile.children.cycles-pp.__vm_munmap
      8.15            +0.2        8.32        perf-profile.children.cycles-pp.vms_complete_munmap_vmas
      8.17            +0.2        8.34        perf-profile.children.cycles-pp.__munmap
      8.15            +0.2        8.32        perf-profile.children.cycles-pp.do_vmi_align_munmap
      8.15            +0.2        8.33        perf-profile.children.cycles-pp.do_vmi_munmap
      8.41            +0.2        8.59        perf-profile.children.cycles-pp.do_syscall_64
      8.41            +0.2        8.59        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     21.41            +0.3       21.67        perf-profile.children.cycles-pp.finish_fault
      0.00            +0.8        0.78        perf-profile.children.cycles-pp.mod_node_page_state
      0.36 ±  3%      -0.0        0.34 ±  2%  perf-profile.self.cycles-pp.free_pages_and_swap_cache
      0.53 ±  2%      -0.0        0.50        perf-profile.self.cycles-pp.do_user_addr_fault
      0.18 ±  2%      -0.0        0.15 ±  2%  perf-profile.self.cycles-pp.__page_cache_release
      0.49            +0.0        0.53 ±  3%  perf-profile.self.cycles-pp.folios_put_refs
      3.07            +0.1        3.17        perf-profile.self.cycles-pp.zap_present_ptes
      0.00            +0.7        0.73        perf-profile.self.cycles-pp.mod_node_page_state
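
For readers checking the table by hand, the sketch below recomputes the
%change column from a few rows copied from above. It assumes, based only on
the numbers in this report, that throughput and perf-stat rows show a relative
change in percent, while perf-profile rows show an absolute difference in
percentage points of cycles. The helper names pct_change and point_delta are
illustrative and are not part of lkp-tests.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return new - base


# (metric, parent 3e700b715e, commit 7e44d00a13), values copied from the table.
throughput_rows = [
    ("will-it-scale.per_thread_ops", 53967, 52560),
    ("will-it-scale.workload", 3453930, 3363916),
    ("perf-stat.i.instructions", 1.741e10, 1.715e10),
]

# perf-profile rows hold shares of sampled cycles, so the assumed
# comparison is a plain subtraction rather than a ratio.
profile_rows = [
    ("children.cycles-pp.__munmap", 8.17, 8.34),
    ("children.cycles-pp.finish_fault", 21.41, 21.67),
    ("children.cycles-pp.mod_node_page_state", 0.00, 0.78),
]

for name, base, new in throughput_rows:
    print(f"{pct_change(base, new):+6.1f}%   {name}")

for name, base, new in profile_rows:
    print(f"{point_delta(base, new):+6.1f} pt {name}")
```

Under that reading, mod_node_page_state going from 0.00 to 0.78 is a new
0.8-point share of cycles in the profile, not a relative increase.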




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

