Message-ID: <202311231029.3aa790-oliver.sang@intel.com>
Date:   Thu, 23 Nov 2023 13:03:34 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Huang Ying <ying.huang@...el.com>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Vlastimil Babka <vbabka@...e.cz>,
        "David Hildenbrand" <david@...hat.com>,
        Johannes Weiner <jweiner@...hat.com>,
        "Dave Hansen" <dave.hansen@...ux.intel.com>,
        Michal Hocko <mhocko@...e.com>,
        "Pavel Tatashin" <pasha.tatashin@...een.com>,
        Matthew Wilcox <willy@...radead.org>,
        Christoph Lameter <cl@...ux.com>,
        Arjan van de Ven <arjan@...ux.intel.com>,
        Sudeep Holla <sudeep.holla@....com>, <linux-mm@...ck.org>,
        <ying.huang@...el.com>, <feng.tang@...el.com>,
        <fengwei.yin@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [mm, pcp] 6ccdcb6d3a: stress-ng.judy.ops_per_sec -4.7% regression



Hello,

kernel test robot noticed a -4.7% regression of stress-ng.judy.ops_per_sec on:


commit: 6ccdcb6d3a741c4e005ca6ffd4a62ddf8b5bead3 ("mm, pcp: reduce detecting time of consecutive high order page freeing")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	class: cpu-cache
	test: judy
	disk: 1SSD
	cpufreq_governor: performance


In addition, the commit also has a significant impact on the following tests:

+------------------+-------------------------------------------------------------------------------------------------+
| testcase: change | lmbench3: lmbench3.TCP.socket.bandwidth.10MB.MB/sec 23.7% improvement                           |
| test machine     | 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory |
| test parameters  | cpufreq_governor=performance                                                                    |
|                  | mode=development                                                                                |
|                  | nr_threads=100%                                                                                 |
|                  | test=TCP                                                                                        |
|                  | test_memory_size=50%                                                                            |
+------------------+-------------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.file-ioctl.ops_per_sec -6.6% regression                                    |
| test machine     | 36 threads 1 socket Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz (Skylake) with 32G memory         |
| test parameters  | class=filesystem                                                                                |
|                  | cpufreq_governor=performance                                                                    |
|                  | disk=1SSD                                                                                       |
|                  | fs=btrfs                                                                                        |
|                  | nr_threads=10%                                                                                  |
|                  | test=file-ioctl                                                                                 |
|                  | testtime=60s                                                                                    |
+------------------+-------------------------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202311231029.3aa790-oliver.sang@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231123/202311231029.3aa790-oliver.sang@intel.com

=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  cpu-cache/gcc-12/performance/1SSD/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/lkp-spr-2sp4/judy/stress-ng/60s
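
The two lines above pair a slash-separated list of parameter names with a slash-separated list of values. A minimal Python sketch of reading them back into a mapping follows. It assumes that no name or value contains a '/' itself (true for this report, not guaranteed in general), and `parse_job_axes` is a hypothetical helper, not part of lkp-tests.

```python
def parse_job_axes(names_line: str, values_line: str) -> dict[str, str]:
    """Pair slash-separated parameter names with their values."""
    # The names line ends with ':' in the report; drop it before splitting.
    names = names_line.strip().rstrip(":").split("/")
    values = values_line.strip().split("/")
    if len(names) != len(values):
        raise ValueError(f"{len(names)} names but {len(values)} values")
    return dict(zip(names, values))


# Both lines copied from the report above.
names_line = (
    "class/compiler/cpufreq_governor/disk/kconfig/nr_threads/"
    "rootfs/tbox_group/test/testcase/testtime:"
)
values_line = (
    "  cpu-cache/gcc-12/performance/1SSD/x86_64-rhel-8.3/100%/"
    "debian-11.1-x86_64-20220510.cgz/lkp-spr-2sp4/judy/stress-ng/60s"
)

axes = parse_job_axes(names_line, values_line)
for name, value in axes.items():
    print(f"{name:18} {value}")
```

The same helper should apply to the header lines of the lmbench3 and file-ioctl sections further down, which use the same layout with different parameter names.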

commit: 
  57c0419c5f ("mm, pcp: decrease PCP high if free pages < high watermark")
  6ccdcb6d3a ("mm, pcp: reduce detecting time of consecutive high order page freeing")

57c0419c5f0ea2cc 6ccdcb6d3a741c4e005ca6ffd4a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      4.57 ±  5%     +46.8%       6.71 ± 17%  iostat.cpu.system
      2842            +1.0%       2871        turbostat.Bzy_MHz
      0.12 ±  3%      +0.4        0.55 ± 26%  mpstat.cpu.all.soft%
      3.05 ±  6%      +1.8        4.86 ± 20%  mpstat.cpu.all.sys%
  81120642            -2.9%   78746159        proc-vmstat.numa_hit
  80886548            -2.9%   78513494        proc-vmstat.numa_local
  82771023            -2.9%   80399459        proc-vmstat.pgalloc_normal
  82356596            -2.9%   79991041        proc-vmstat.pgfree
  12325708 ±  3%      +5.3%   12974746        perf-stat.i.dTLB-load-misses
      0.38 ± 44%     +27.2%       0.48        perf-stat.overall.cpi
    668.74 ± 44%     +24.7%     834.02        perf-stat.overall.cycles-between-cache-misses
      0.00 ± 45%      +0.0        0.01 ± 10%  perf-stat.overall.dTLB-load-miss-rate%
  10040254 ± 44%     +26.0%   12650801        perf-stat.ps.dTLB-load-misses
   7036371 ±  3%      -2.8%    6842720        stress-ng.judy.Judy_delete_operations_per_sec
   9244466 ±  3%      -7.8%    8524505 ±  3%  stress-ng.judy.Judy_insert_operations_per_sec
      2912 ±  3%      -4.7%       2774        stress-ng.judy.ops_per_sec
     13316 ±  8%     +22.8%      16355 ± 13%  stress-ng.time.maximum_resident_set_size
    445.86 ±  5%     +64.2%     732.21 ± 15%  stress-ng.time.system_time
     40885 ± 40%    +373.8%     193712 ± 11%  sched_debug.cfs_rq:/.left_vruntime.avg
    465264 ± 31%    +142.5%    1128399 ±  5%  sched_debug.cfs_rq:/.left_vruntime.stddev
      8322 ± 34%    +140.8%      20039 ± 17%  sched_debug.cfs_rq:/.load.avg
     40886 ± 40%    +373.8%     193713 ± 11%  sched_debug.cfs_rq:/.right_vruntime.avg
    465274 ± 31%    +142.5%    1128401 ±  5%  sched_debug.cfs_rq:/.right_vruntime.stddev
    818.77 ± 10%     +43.3%       1172 ±  5%  sched_debug.cpu.curr->pid.stddev
      0.05 ± 74%    +659.6%       0.41 ± 35%  perf-sched.sch_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
      0.10 ± 48%    +140.3%       0.24 ± 11%  perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
      0.01 ± 14%    +102.6%       0.03 ± 29%  perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.05 ±122%   +1322.6%       0.65 ± 20%  perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.do_open
      1.70 ± 79%    +729.3%      14.10 ± 48%  perf-sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      1.08 ±101%    +233.4%       3.60 ±  7%  perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open
      0.01 ±  8%     +54.7%       0.02 ± 18%  perf-sched.total_sch_delay.average.ms
      0.18 ±  5%    +555.7%       1.20 ± 38%  perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      0.21 ±  4%    +524.6%       1.29 ± 47%  perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
    235.65 ± 31%     -57.0%     101.40 ± 17%  perf-sched.wait_and_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
    127.50 ±100%    +126.3%     288.50 ±  9%  perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
    125.83 ±144%    +407.2%     638.17 ± 27%  perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
    344.50 ± 36%    +114.6%     739.33 ± 24%  perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.92 ±114%    +482.2%       5.38 ± 47%  perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
      3.22 ± 89%    +223.9%      10.44 ± 50%  perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
      0.18 ± 43%    +471.8%       1.01 ± 36%  perf-sched.wait_time.avg.ms.__cond_resched.__alloc_pages.__folio_alloc.vma_alloc_folio.do_anonymous_page
     34.39 ± 46%     +88.8%      64.95 ± 18%  perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.21 ± 13%    +813.6%       1.95 ± 38%  perf-sched.wait_time.avg.ms.__cond_resched.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.constprop
      0.18 ± 15%    +457.1%       1.02 ± 58%  perf-sched.wait_time.avg.ms.__cond_resched.unmap_vmas.unmap_region.constprop.0
    417.61 ± 68%     -87.6%      51.85 ±146%  perf-sched.wait_time.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.22 ± 25%    +614.2%       1.57 ± 71%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
      0.18 ±  5%    +556.3%       1.20 ± 38%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      0.21 ±  4%    +524.6%       1.29 ± 47%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
     38.72 ± 39%     -53.1%      18.17 ± 30%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
    235.60 ± 31%     -57.0%     101.37 ± 17%  perf-sched.wait_time.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      2.17 ± 30%     +45.3%       3.16 ± 13%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      1.02 ±131%    +574.3%       6.90 ± 52%  perf-sched.wait_time.max.ms.__cond_resched.__alloc_pages.__folio_alloc.vma_alloc_folio.do_anonymous_page
      0.18 ±191%  +92359.0%     169.05 ±219%  perf-sched.wait_time.max.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
     69.64 ± 44%     +33.2%      92.76 ±  4%  perf-sched.wait_time.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.64 ± 67%    +653.6%       4.82 ± 54%  perf-sched.wait_time.max.ms.__cond_resched.unmap_vmas.unmap_region.constprop.0
      1.75 ± 49%    +206.5%       5.38 ± 47%  perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
      3.22 ± 89%    +223.9%      10.44 ± 50%  perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
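
The %change column can be cross-checked against the two mean columns. The sketch below assumes that a change shown with a trailing '%' is the relative difference of the means, and that a change shown without one (as in the mpstat rows, whose metrics are already percentages) is the plain difference. Both function names are hypothetical. Because the table prints rounded means, a recomputed value can differ from the reported one in the last digit.

```python
def relative_change(base: float, new: float) -> float:
    """Percent change from base to new, as in rows whose change ends in '%'."""
    return (new - base) / base * 100.0


def absolute_change(base: float, new: float) -> float:
    """Plain difference, as in rows whose metric is already a percentage."""
    return new - base


# (metric, base mean, new mean) copied from the table above.
rows = [
    ("stress-ng.judy.ops_per_sec", 2912, 2774),
    ("stress-ng.time.system_time", 445.86, 732.21),
    ("iostat.cpu.system", 4.57, 6.71),
]
for metric, base, new in rows:
    print(f"{relative_change(base, new):+.1f}%  {metric}")

# mpstat.cpu.all.sys% is itself a percentage, so the report shows a difference.
print(f"{absolute_change(3.05, 4.86):+.1f}   mpstat.cpu.all.sys%")
```

For these four rows the recomputed figures agree with the reported -4.7%, +64.2%, +46.8% and +1.8.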


***************************************************************************************************
lkp-ivb-2ep1: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_threads/rootfs/tbox_group/test/test_memory_size/testcase:
  gcc-12/performance/x86_64-rhel-8.3/development/100%/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/TCP/50%/lmbench3

commit: 
  57c0419c5f ("mm, pcp: decrease PCP high if free pages < high watermark")
  6ccdcb6d3a ("mm, pcp: reduce detecting time of consecutive high order page freeing")

57c0419c5f0ea2cc 6ccdcb6d3a741c4e005ca6ffd4a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.07 ± 38%    +105.0%       0.14 ± 32%  perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open
     26.75            -4.9%      25.45        turbostat.RAMWatt
    678809            +7.2%     727594 ±  2%  vmstat.system.cs
  97929782           -13.1%   85054266        numa-numastat.node0.local_node
  97933343           -13.1%   85056081        numa-numastat.node0.numa_hit
  97933344           -13.1%   85055901        numa-vmstat.node0.numa_hit
  97929783           -13.1%   85054086        numa-vmstat.node0.numa_local
     32188           +23.7%      39813        lmbench3.TCP.socket.bandwidth.10MB.MB/sec
    652.63            -4.4%     624.04        lmbench3.time.elapsed_time
    652.63            -4.4%     624.04        lmbench3.time.elapsed_time.max
      8597            -5.9%       8092        lmbench3.time.system_time
      0.88 ±  7%      -0.1        0.76 ±  5%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.71 ± 10%      -0.1        0.61 ±  7%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
      0.78 ±  3%      -0.1        0.70 ±  6%  perf-profile.children.cycles-pp.security_socket_recvmsg
      0.36 ±  9%      +0.1        0.42 ± 11%  perf-profile.children.cycles-pp.skb_page_frag_refill
      0.40 ± 10%      +0.1        0.48 ± 12%  perf-profile.children.cycles-pp.sk_page_frag_refill
      0.51 ±  4%      -0.1        0.44 ± 13%  perf-profile.self.cycles-pp.sock_read_iter
      0.36 ± 10%      +0.1        0.42 ± 11%  perf-profile.self.cycles-pp.skb_page_frag_refill
    158897 ±  2%      -6.8%     148107        proc-vmstat.nr_anon_pages
    160213 ±  2%      -6.8%     149290        proc-vmstat.nr_inactive_anon
    160213 ±  2%      -6.8%     149290        proc-vmstat.nr_zone_inactive_anon
 1.715e+08            -7.1%  1.593e+08        proc-vmstat.numa_hit
 1.715e+08            -7.1%  1.592e+08        proc-vmstat.numa_local
 1.367e+09            -7.1%   1.27e+09        proc-vmstat.pgalloc_normal
   2324641            -2.7%    2261187        proc-vmstat.pgfault
 1.367e+09            -7.1%   1.27e+09        proc-vmstat.pgfree
     77011            -4.4%      73597        proc-vmstat.pgreuse
      5.99 ±  3%     -29.9%       4.20 ±  4%  perf-stat.i.MPKI
 7.914e+09 ±  2%      +4.5%  8.271e+09        perf-stat.i.branch-instructions
  1.51e+08            +4.6%  1.579e+08        perf-stat.i.branch-misses
      7.65 ±  4%      -0.9        6.73 ±  3%  perf-stat.i.cache-miss-rate%
  66394790 ±  2%     -21.9%   51865866 ±  3%  perf-stat.i.cache-misses
    682132            +7.2%     731279 ±  2%  perf-stat.i.context-switches
      4.01           -16.0%       3.37        perf-stat.i.cpi
     71772 ±  4%     +11.5%      80055 ±  8%  perf-stat.i.cycles-between-cache-misses
 9.368e+09 ±  2%      +3.6%  9.706e+09        perf-stat.i.dTLB-stores
  33695419 ±  2%      +7.1%   36096466 ±  2%  perf-stat.i.iTLB-load-misses
    573897 ± 35%     -38.6%     352477 ± 19%  perf-stat.i.iTLB-loads
  4.09e+10 ±  2%      +4.5%  4.273e+10        perf-stat.i.instructions
      0.37            +4.3%       0.39        perf-stat.i.ipc
      0.09 ± 22%     -44.0%       0.05 ± 26%  perf-stat.i.major-faults
    490.16 ±  2%      -8.6%     448.21 ±  2%  perf-stat.i.metric.K/sec
    635.38 ±  2%      +3.5%     657.46        perf-stat.i.metric.M/sec
     37.54            +2.3       39.84        perf-stat.i.node-load-miss-rate%
   8300835 ±  2%     -10.8%    7406820 ±  2%  perf-stat.i.node-load-misses
  76993977 ±  3%      -6.6%   71936169 ±  3%  perf-stat.i.node-loads
     26.58 ±  4%      +4.1       30.71 ±  3%  perf-stat.i.node-store-miss-rate%
   2341211 ±  4%     -29.6%    1648802 ±  3%  perf-stat.i.node-store-misses
  34198780 ±  3%     -33.2%   22857201 ±  3%  perf-stat.i.node-stores
      1.63           -25.5%       1.21 ±  3%  perf-stat.overall.MPKI
     10.67            -2.3        8.36        perf-stat.overall.cache-miss-rate%
      2.83            -5.2%       2.69        perf-stat.overall.cpi
      1740           +27.3%       2216 ±  3%  perf-stat.overall.cycles-between-cache-misses
      0.35            +5.5%       0.37        perf-stat.overall.ipc
      9.73            -0.4        9.34        perf-stat.overall.node-load-miss-rate%
      6.39            +0.3        6.72        perf-stat.overall.node-store-miss-rate%
 7.914e+09 ±  2%      +4.6%  8.276e+09        perf-stat.ps.branch-instructions
 1.509e+08            +4.7%  1.579e+08        perf-stat.ps.branch-misses
  66615187 ±  2%     -22.1%   51881477 ±  3%  perf-stat.ps.cache-misses
    679734            +7.2%     729007 ±  2%  perf-stat.ps.context-switches
 9.369e+09 ±  2%      +3.7%  9.712e+09        perf-stat.ps.dTLB-stores
  33673038 ±  2%      +7.2%   36098564 ±  2%  perf-stat.ps.iTLB-load-misses
  4.09e+10 ±  2%      +4.6%  4.276e+10        perf-stat.ps.instructions
      0.09 ± 23%     -44.4%       0.05 ± 26%  perf-stat.ps.major-faults
   8328473 ±  2%     -11.0%    7410272 ±  2%  perf-stat.ps.node-load-misses
  77301667 ±  3%      -6.9%   71997671 ±  3%  perf-stat.ps.node-loads
   2344250 ±  4%     -29.7%    1647553 ±  3%  perf-stat.ps.node-store-misses
  34315831 ±  3%     -33.4%   22865994 ±  3%  perf-stat.ps.node-stores
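
Several of the perf-stat.overall rows above are ratios of other rows. The sketch below uses the usual definitions: IPC as the reciprocal of CPI, MPKI as cache misses per thousand instructions, and cycles between cache misses as total cycles divided by cache misses. That the report derives them this way is an assumption, although these definitions reproduce the printed figures to within rounding. The function names are hypothetical.

```python
def ipc_from_cpi(cpi: float) -> float:
    """Instructions per cycle is the reciprocal of cycles per instruction."""
    return 1.0 / cpi


def mpki(cache_misses: float, instructions: float) -> float:
    """Cache misses per thousand instructions."""
    return cache_misses / instructions * 1000.0


def cycles_between_cache_misses(cpi: float, instructions: float,
                                cache_misses: float) -> float:
    """Cycles (cpi * instructions) divided by the number of cache misses."""
    return cpi * instructions / cache_misses


# Values copied from the perf-stat.ps.* and perf-stat.overall.cpi rows above.
runs = {
    "57c0419c5f": {"cpi": 2.83, "instructions": 4.09e10, "cache_misses": 66615187},
    "6ccdcb6d3a": {"cpi": 2.69, "instructions": 4.276e10, "cache_misses": 51881477},
}
for commit, r in runs.items():
    print(
        f"{commit}: ipc={ipc_from_cpi(r['cpi']):.2f} "
        f"MPKI={mpki(r['cache_misses'], r['instructions']):.2f} "
        f"cycles/miss={cycles_between_cache_misses(**r):.0f}"
    )
```

Read this way, the lmbench3 improvement is consistent with fewer cache misses per instruction on the newer commit, which shows up as a lower CPI.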



***************************************************************************************************
lkp-skl-d08: 36 threads 1 socket Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz (Skylake) with 32G memory
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  filesystem/gcc-12/performance/1SSD/btrfs/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-skl-d08/file-ioctl/stress-ng/60s

commit: 
  57c0419c5f ("mm, pcp: decrease PCP high if free pages < high watermark")
  6ccdcb6d3a ("mm, pcp: reduce detecting time of consecutive high order page freeing")

57c0419c5f0ea2cc 6ccdcb6d3a741c4e005ca6ffd4a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    127.00 ± 10%     +36.1%     172.83 ± 15%  perf-c2c.HITM.local
      0.00 ± 72%    +130.4%       0.01 ± 30%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.alloc_extent_state.__clear_extent_bit.btrfs_clone_files
     14.83 ± 19%     +33.7%      19.83 ± 10%  sched_debug.cpu.nr_uninterruptible.max
    339939            -6.6%     317593        stress-ng.file-ioctl.ops
      5665            -6.6%       5293        stress-ng.file-ioctl.ops_per_sec
      6444 ±  4%     -25.2%       4820 ±  5%  stress-ng.time.involuntary_context_switches
  89198237            -6.5%   83411572        proc-vmstat.numa_hit
  89117176            -6.8%   83056324        proc-vmstat.numa_local
  92833230            -6.6%   86743293        proc-vmstat.pgalloc_normal
  92791999            -6.6%   86700599        proc-vmstat.pgfree
      0.25 ± 56%    +110.2%       0.53 ± 12%  perf-stat.i.major-faults
    127575 ± 27%    +138.3%     303957 ±  3%  perf-stat.i.node-stores
      0.25 ± 56%    +110.2%       0.52 ± 12%  perf-stat.ps.major-faults
    125751 ± 27%    +138.3%     299653 ±  3%  perf-stat.ps.node-stores
 1.199e+12            -2.1%  1.174e+12        perf-stat.total.instructions
     15.80            -0.7       15.14        perf-profile.calltrace.cycles-pp.filemap_read_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep
     15.46            -0.6       14.84        perf-profile.calltrace.cycles-pp.btrfs_read_folio.filemap_read_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep
      9.84            -0.5        9.32        perf-profile.calltrace.cycles-pp.memcmp.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep.btrfs_remap_file_range
     11.95            -0.4       11.52        perf-profile.calltrace.cycles-pp.btrfs_do_readpage.btrfs_read_folio.filemap_read_folio.do_read_cache_folio.vfs_dedupe_file_range_compare
      8.72 ±  2%      -0.4        8.28        perf-profile.calltrace.cycles-pp.filemap_add_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep
      5.56 ±  2%      -0.4        5.18        perf-profile.calltrace.cycles-pp.__filemap_add_folio.filemap_add_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep
      0.64 ± 10%      -0.3        0.36 ± 71%  perf-profile.calltrace.cycles-pp.find_free_extent.btrfs_reserve_extent.__btrfs_prealloc_file_range.btrfs_prealloc_file_range.btrfs_fallocate
      2.57 ±  5%      -0.3        2.29 ±  2%  perf-profile.calltrace.cycles-pp.ioctl_preallocate.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      2.44 ±  6%      -0.3        2.17 ±  2%  perf-profile.calltrace.cycles-pp.btrfs_fallocate.vfs_fallocate.ioctl_preallocate.__x64_sys_ioctl.do_syscall_64
      2.53 ±  5%      -0.3        2.26 ±  2%  perf-profile.calltrace.cycles-pp.vfs_fallocate.ioctl_preallocate.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.66 ±  9%      -0.2        0.46 ± 45%  perf-profile.calltrace.cycles-pp.btrfs_reserve_extent.__btrfs_prealloc_file_range.btrfs_prealloc_file_range.btrfs_fallocate.vfs_fallocate
      1.42 ±  3%      -0.1        1.31 ±  4%  perf-profile.calltrace.cycles-pp.clear_state_bit.__clear_extent_bit.btrfs_invalidate_folio.truncate_cleanup_folio.truncate_inode_pages_range
      0.70 ±  4%      -0.1        0.62 ±  2%  perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.__filemap_add_folio.filemap_add_folio.do_read_cache_folio.vfs_dedupe_file_range_compare
      0.69 ±  4%      -0.1        0.63 ±  4%  perf-profile.calltrace.cycles-pp.btrfs_punch_hole.btrfs_fallocate.vfs_fallocate.ioctl_preallocate.__x64_sys_ioctl
     29.90            +0.6       30.49        perf-profile.calltrace.cycles-pp.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep.btrfs_remap_file_range
      0.00            +0.9        0.86 ±  6%  perf-profile.calltrace.cycles-pp.__list_del_entry_valid_or_report.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
     68.10            +1.2       69.29        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
     68.47            +1.2       69.68        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
     67.35            +1.2       68.59        perf-profile.calltrace.cycles-pp.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
     21.54 ±  3%      +1.5       23.02        perf-profile.calltrace.cycles-pp.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     21.51 ±  3%      +1.5       23.00        perf-profile.calltrace.cycles-pp.do_clone_file_range.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl
     21.46 ±  3%      +1.5       22.94        perf-profile.calltrace.cycles-pp.btrfs_remap_file_range.do_clone_file_range.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl
     21.53 ±  3%      +1.5       23.01        perf-profile.calltrace.cycles-pp.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64
      0.00            +1.5        1.49 ±  3%  perf-profile.calltrace.cycles-pp.__free_one_page.free_pcppages_bulk.free_unref_page_commit.free_unref_page.btrfs_clone
     21.15 ±  3%      +1.5       22.66        perf-profile.calltrace.cycles-pp.btrfs_clone_files.btrfs_remap_file_range.do_clone_file_range.vfs_clone_file_range.ioctl_file_clone
     64.61            +1.5       66.16        perf-profile.calltrace.cycles-pp.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      2.66 ±  2%      +1.8        4.51 ±  3%  perf-profile.calltrace.cycles-pp.folio_alloc.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep
      0.97 ±  3%      +1.8        2.82 ±  5%  perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc.do_read_cache_folio
      2.02 ±  3%      +1.9        3.90 ±  4%  perf-profile.calltrace.cycles-pp.__alloc_pages.folio_alloc.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep
      1.27 ±  2%      +1.9        3.17 ±  4%  perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.folio_alloc.do_read_cache_folio.vfs_dedupe_file_range_compare
      0.35 ± 70%      +2.0        2.31 ±  5%  perf-profile.calltrace.cycles-pp.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc
      0.00            +2.0        2.00 ±  4%  perf-profile.calltrace.cycles-pp.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages
      1.72 ±  2%      +2.1        3.78        perf-profile.calltrace.cycles-pp.btrfs_clone.btrfs_clone_files.btrfs_remap_file_range.do_clone_file_range.vfs_clone_file_range
      0.00            +2.1        2.09 ±  2%  perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_commit.free_unref_page.btrfs_clone.btrfs_clone_files
      0.00            +2.1        2.12 ±  2%  perf-profile.calltrace.cycles-pp.free_unref_page_commit.free_unref_page.btrfs_clone.btrfs_clone_files.btrfs_remap_file_range
      0.00            +2.1        2.14 ±  2%  perf-profile.calltrace.cycles-pp.free_unref_page.btrfs_clone.btrfs_clone_files.btrfs_remap_file_range.do_clone_file_range
     15.81            -0.7       15.15        perf-profile.children.cycles-pp.filemap_read_folio
     15.47            -0.6       14.86        perf-profile.children.cycles-pp.btrfs_read_folio
      9.89            -0.5        9.38        perf-profile.children.cycles-pp.memcmp
     11.98            -0.4       11.54        perf-profile.children.cycles-pp.btrfs_do_readpage
      8.74 ±  2%      -0.4        8.30        perf-profile.children.cycles-pp.filemap_add_folio
      9.73 ±  3%      -0.4        9.35        perf-profile.children.cycles-pp.__clear_extent_bit
      5.66 ±  2%      -0.4        5.30        perf-profile.children.cycles-pp.__filemap_add_folio
      2.45 ±  6%      -0.3        2.17 ±  2%  perf-profile.children.cycles-pp.btrfs_fallocate
      2.57 ±  5%      -0.3        2.29 ±  2%  perf-profile.children.cycles-pp.ioctl_preallocate
      2.53 ±  5%      -0.3        2.26 ±  2%  perf-profile.children.cycles-pp.vfs_fallocate
      4.67 ±  2%      -0.3        4.41 ±  3%  perf-profile.children.cycles-pp.__set_extent_bit
      4.83 ±  2%      -0.3        4.58 ±  3%  perf-profile.children.cycles-pp.lock_extent
      5.06 ±  2%      -0.2        4.82 ±  2%  perf-profile.children.cycles-pp.alloc_extent_state
      4.11 ±  2%      -0.2        3.94 ±  2%  perf-profile.children.cycles-pp.kmem_cache_alloc
      1.37 ±  4%      -0.1        1.25 ±  2%  perf-profile.children.cycles-pp.__mod_lruvec_page_state
      0.66 ±  9%      -0.1        0.54 ±  6%  perf-profile.children.cycles-pp.btrfs_reserve_extent
      0.64 ± 10%      -0.1        0.53 ±  6%  perf-profile.children.cycles-pp.find_free_extent
      0.96 ±  4%      -0.1        0.87 ±  6%  perf-profile.children.cycles-pp.__wake_up
      0.62 ±  4%      -0.1        0.54 ±  6%  perf-profile.children.cycles-pp.__cond_resched
      1.20 ±  4%      -0.1        1.12 ±  3%  perf-profile.children.cycles-pp.free_extent_state
      0.99 ±  3%      -0.1        0.92 ±  4%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.89 ±  3%      -0.1        0.81 ±  5%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.69 ±  4%      -0.1        0.64 ±  4%  perf-profile.children.cycles-pp.btrfs_punch_hole
      0.12 ± 10%      -0.0        0.09 ± 10%  perf-profile.children.cycles-pp.__fget_light
      0.02 ±141%      +0.0        0.06 ± 13%  perf-profile.children.cycles-pp.calc_available_free_space
      0.29 ±  8%      +0.1        0.39 ±  6%  perf-profile.children.cycles-pp.__mod_zone_page_state
      0.09 ± 17%      +0.2        0.25 ±  6%  perf-profile.children.cycles-pp.__kmalloc_node
      0.09 ± 15%      +0.2        0.25 ±  4%  perf-profile.children.cycles-pp.kvmalloc_node
      0.08 ± 11%      +0.2        0.24 ±  4%  perf-profile.children.cycles-pp.__kmalloc_large_node
      0.24 ± 13%      +0.2        0.41 ±  4%  perf-profile.children.cycles-pp.__list_add_valid_or_report
      0.32 ± 15%      +0.6        0.91 ±  4%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     30.03            +0.6       30.64        perf-profile.children.cycles-pp.do_read_cache_folio
      1.10 ±  4%      +0.6        1.72 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.58 ±  6%      +0.9        1.50 ±  5%  perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
     67.36            +1.2       68.60        perf-profile.children.cycles-pp.__x64_sys_ioctl
     21.52 ±  3%      +1.5       23.00        perf-profile.children.cycles-pp.do_clone_file_range
     21.54 ±  3%      +1.5       23.02        perf-profile.children.cycles-pp.ioctl_file_clone
     21.53 ±  3%      +1.5       23.01        perf-profile.children.cycles-pp.vfs_clone_file_range
     21.16 ±  3%      +1.5       22.66        perf-profile.children.cycles-pp.btrfs_clone_files
      0.00            +1.5        1.52 ±  3%  perf-profile.children.cycles-pp.__free_one_page
     64.61            +1.5       66.16        perf-profile.children.cycles-pp.do_vfs_ioctl
     64.16            +1.5       65.71        perf-profile.children.cycles-pp.btrfs_remap_file_range
      2.68 ±  3%      +1.8        4.52 ±  3%  perf-profile.children.cycles-pp.folio_alloc
      0.54 ±  6%      +2.0        2.51 ±  5%  perf-profile.children.cycles-pp.__rmqueue_pcplist
      1.03 ±  3%      +2.0        3.04 ±  5%  perf-profile.children.cycles-pp.rmqueue
      2.16 ±  3%      +2.0        4.19 ±  4%  perf-profile.children.cycles-pp.__alloc_pages
      1.32 ±  2%      +2.1        3.42 ±  4%  perf-profile.children.cycles-pp.get_page_from_freelist
      0.00            +2.1        2.10 ±  2%  perf-profile.children.cycles-pp.free_pcppages_bulk
      2.66 ±  2%      +2.1        4.77        perf-profile.children.cycles-pp.btrfs_clone
      0.03 ±100%      +2.1        2.17 ±  2%  perf-profile.children.cycles-pp.free_unref_page
      0.40 ±  6%      +2.2        2.55 ±  2%  perf-profile.children.cycles-pp.free_unref_page_commit
      0.00            +2.2        2.21 ±  4%  perf-profile.children.cycles-pp.rmqueue_bulk
      9.82            -0.5        9.32        perf-profile.self.cycles-pp.memcmp
      0.84 ±  5%      -0.1        0.76 ±  6%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.13 ±  4%      -0.1        1.05 ±  2%  perf-profile.self.cycles-pp.free_extent_state
      0.99 ±  3%      -0.1        0.92 ±  4%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.22 ±  8%      -0.1        0.16 ± 13%  perf-profile.self.cycles-pp.find_free_extent
      0.38 ±  4%      -0.1        0.32 ±  8%  perf-profile.self.cycles-pp.__cond_resched
      0.12 ± 10%      -0.0        0.08 ± 11%  perf-profile.self.cycles-pp.__fget_light
      0.06 ±  7%      -0.0        0.04 ± 45%  perf-profile.self.cycles-pp.__x64_sys_ioctl
      0.07 ± 15%      +0.0        0.10 ±  9%  perf-profile.self.cycles-pp.folio_alloc
      0.28 ± 10%      +0.1        0.36 ±  7%  perf-profile.self.cycles-pp.get_page_from_freelist
      0.26 ±  8%      +0.1        0.36 ±  4%  perf-profile.self.cycles-pp.__mod_zone_page_state
      0.22 ± 14%      +0.2        0.38 ±  5%  perf-profile.self.cycles-pp.__list_add_valid_or_report
      0.00            +0.2        0.24 ±  6%  perf-profile.self.cycles-pp.free_pcppages_bulk
      0.32 ± 15%      +0.6        0.91 ±  4%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.00            +0.6        0.62 ± 10%  perf-profile.self.cycles-pp.rmqueue_bulk
      0.55 ±  6%      +0.9        1.46 ±  5%  perf-profile.self.cycles-pp.__list_del_entry_valid_or_report
      0.00            +1.3        1.32 ±  4%  perf-profile.self.cycles-pp.__free_one_page
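
The comparison tables in this report share one row layout: base mean, optional "± N%" stddev, change, new mean, optional stddev, metric name. For post-processing, a minimal Python sketch of a row parser follows. The regular expression is inferred from the rows in this message only and is an assumption, not the format definition used by lkp-tests. `ROW` and `parse_row` are hypothetical names.

```python
import re

# One table row: base [± sd%]  change[%]  new [± sd%]  metric
ROW = re.compile(
    r"""^\s*
    (?P<base>\d[\d.]*(?:e[+-]?\d+)?)          # base mean, e.g. 4.57 or 1.715e+08
    (?:\s*±\s*(?P<base_sd>\d+)%)?             # optional %stddev of the base
    \s+(?P<change>[+-]\d[\d.]*)(?P<unit>%?)   # change; '%' marks a relative change
    \s+(?P<new>\d[\d.]*(?:e[+-]?\d+)?)        # new mean
    (?:\s*±\s*(?P<new_sd>\d+)%)?              # optional %stddev of the new value
    \s+(?P<metric>\S+)\s*$""",
    re.VERBOSE,
)


def parse_row(line: str):
    """Return a dict for a table row, or None for any other line."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "metric": m["metric"],
        "base": float(m["base"]),
        "new": float(m["new"]),
        "change": float(m["change"]),
        "relative": m["unit"] == "%",
        "base_stddev_pct": int(m["base_sd"]) if m["base_sd"] else None,
        "new_stddev_pct": int(m["new_sd"]) if m["new_sd"] else None,
    }


# Two rows and one non-row line copied from the report above.
sample = [
    "      5665            -6.6%       5293        stress-ng.file-ioctl.ops_per_sec",
    "      0.00            +2.2        2.21 ±  4%  perf-profile.children.cycles-pp.rmqueue_bulk",
    "commit: ",
]
parsed = [parse_row(line) for line in sample]
for row in parsed:
    print(row)
```

Rows with a high stddev on either side (for example the perf-sched entries with ± 100% or more) are probably best filtered out before drawing conclusions from the change column.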





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
