Message-ID: <202306041623.cdc379d-oliver.sang@intel.com>
Date:   Sun, 4 Jun 2023 17:25:40 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        <linux-kernel@...r.kernel.org>, Eric Dumazet <edumazet@...gle.com>,
        <ying.huang@...el.com>, <feng.tang@...el.com>,
        <fengwei.yin@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [x86]  47ee3f1dd9:
 phoronix-test-suite.ior.2MB.DefaultTestDirectory.mb_s -21.7% regression


Hi Linus,

We previously reported "[linus:master] [x86] adfcf4231b: blogbench.read_score -10.9% regression"
at
https://lore.kernel.org/lkml/202305041446.71d46724-yujie.liu@intel.com/
However, as you pointed out, blogbench is a "*horrifically* bad benchmark
for this case", and Feng Tang also made a debug patch that confirmed this.

We have now noticed that 47ee3f1dd9 is a fix for adfcf4231b. As we found with
adfcf4231b, this commit can also cause a performance regression or improvement,
depending on the case. In fact, only this case is a regression; all the others
are improvements (we normally pick a regression as the title of the report).

The detailed data is below. We hope it is useful as information about the
possible performance impact of this change.


Hello,

kernel test robot noticed a -21.7% regression of phoronix-test-suite.ior.2MB.DefaultTestDirectory.mb_s on:


commit: 47ee3f1dd93bcbe031539b1ecdaafb44b661c772 ("x86: re-introduce support for ERMS copies for user space accesses")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: phoronix-test-suite
test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory
parameters:

	test: ior-1.1.1
	cpufreq_governor: performance


In addition to that, the commit also has significant impact on the following tests:

+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 11.8% improvement                                            |
| test machine     | 104 threads 2 sockets (Skylake) with 192G memory                                                          |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | mode=process                                                                                              |
|                  | nr_task=16                                                                                                |
|                  | test=pread3                                                                                               |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | nepim: nepim.tcp.avg.kbps_out 9.5% improvement                                                            |
| test machine     | 8 threads 1 sockets Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Haswell) with 16G memory                     |
| test parameters  | cluster=cs-localhost                                                                                      |
|                  | cpufreq_governor=performance                                                                              |
|                  | nr_threads=40%                                                                                            |
|                  | protocol=tcp6                                                                                             |
|                  | runtime=300s                                                                                              |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.intel-mpi.IMB-MPI1Exchange.average_mbytes_sec 21.6% improvement  |
| test machine     | 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory                 |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | option_a=IMB-MPI1 Exchange                                                                                |
|                  | test=intel-mpi-1.0.1                                                                                      |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 4.6% improvement                                             |
| test machine     | 104 threads 2 sockets (Skylake) with 192G memory                                                          |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | mode=process                                                                                              |
|                  | nr_task=50%                                                                                               |
|                  | test=readseek1                                                                                            |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.tiobench.Write.64MB.8.mb_s 43.5% improvement                     |
| test machine     | 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory                 |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | option_a=Write                                                                                            |
|                  | option_b=64MB                                                                                             |
|                  | option_c=8                                                                                                |
|                  | test=tiobench-1.3.1                                                                                       |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.tiobench.RandomWrite.32MB.32.mb_s 49.5% improvement              |
| test machine     | 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory                 |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | option_a=Random Write                                                                                     |
|                  | option_b=32MB                                                                                             |
|                  | option_c=4                                                                                                |
|                  | test=tiobench-1.3.1                                                                                       |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | lmbench3: lmbench3.AF_UNIX.sock.stream.bandwidth.MB/sec 9.2% improvement                                  |
| test machine     | 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory           |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | mode=development                                                                                          |
|                  | nr_threads=1                                                                                              |
|                  | test=UNIX                                                                                                 |
|                  | test_memory_size=50%                                                                                      |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.tiobench.RandomWrite.64MB.32.mb_s 51.1% improvement              |
| test machine     | 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory                 |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | option_a=Random Write                                                                                     |
|                  | option_b=64MB                                                                                             |
|                  | option_c=4                                                                                                |
|                  | test=tiobench-1.3.1                                                                                       |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.tiobench.Write.64MB.32.mb_s 58.5% improvement                    |
| test machine     | 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory                 |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | option_a=Write                                                                                            |
|                  | option_b=64MB                                                                                             |
|                  | option_c=4                                                                                                |
|                  | test=tiobench-1.3.1                                                                                       |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | nepim: nepim.tcp.avg.kbps_out 10.3% improvement                                                           |
| test machine     | 8 threads 1 sockets Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz (Haswell) with 8G memory                     |
| test parameters  | cluster=cs-localhost                                                                                      |
|                  | cpufreq_governor=performance                                                                              |
|                  | nr_threads=25%                                                                                            |
|                  | protocol=tcp                                                                                              |
|                  | runtime=300s                                                                                              |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.tiobench.Write.32MB.32.mb_s 57.3% improvement                    |
| test machine     | 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory                 |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | option_a=Write                                                                                            |
|                  | option_b=32MB                                                                                             |
|                  | option_c=4                                                                                                |
|                  | test=tiobench-1.3.1                                                                                       |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.tiobench.RandomWrite.64MB.8.mb_s 45.1% improvement               |
| test machine     | 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory                 |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | option_a=Random Write                                                                                     |
|                  | option_b=64MB                                                                                             |
|                  | option_c=8                                                                                                |
|                  | test=tiobench-1.3.1                                                                                       |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.x11perf.500pxPutImageSquare.operations___second 8.4% improvement |
| test machine     | 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory                |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | need_x=true                                                                                               |
|                  | option_a=500px PutImage Square                                                                            |
|                  | test=x11perf-1.1.1                                                                                        |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 14.1% improvement                                             |
| test machine     | 104 threads 2 sockets (Skylake) with 192G memory                                                          |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | mode=thread                                                                                               |
|                  | nr_task=50%                                                                                               |
|                  | test=pread1                                                                                               |
+------------------+-----------------------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 7.8% improvement                                              |
| test machine     | 104 threads 2 sockets (Skylake) with 192G memory                                                          |
| test parameters  | cpufreq_governor=performance                                                                              |
|                  | mode=thread                                                                                               |
|                  | nr_task=100%                                                                                              |
|                  | test=readseek1                                                                                            |
+------------------+-----------------------------------------------------------------------------------------------------------+


If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202306041623.cdc379d-oliver.sang@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # remove the ~/.lkp and /lkp directories to run from a clean state.

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/debian-x86_64-phoronix/lkp-csl-2sp7/ior-1.1.1/phoronix-test-suite

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      3626           -22.0%       2828        phoronix-test-suite.ior.2MB./opt/rootfs.mb_s
      3586           -21.7%       2808        phoronix-test-suite.ior.2MB.DefaultTestDirectory.mb_s
   4487378            -3.5%    4330977        perf-stat.i.cache-misses
  76327545            -1.2%   75402170        perf-stat.i.cache-references
    292483 ±  3%     -16.0%     245543 ±  5%  perf-stat.i.node-stores
      5.88            -0.1        5.74        perf-stat.overall.cache-miss-rate%
      1462            +3.5%       1513        perf-stat.overall.cycles-between-cache-misses
     35.88            +4.9       40.77 ±  5%  perf-stat.overall.node-store-miss-rate%
   4387788            -3.6%    4229799        perf-stat.ps.cache-misses
    285917 ±  4%     -16.2%     239711 ±  5%  perf-stat.ps.node-stores
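
The %change column in the table above appears to be a plain relative difference between the per-commit means, while rows whose metric name ends in "%" (for example cache-miss-rate%) seem to show an absolute difference in percentage points instead. A minimal sketch of that arithmetic, assuming this reading of the table; the helper name `pct_change` is hypothetical and is not part of lkp-tests:

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# Means copied from the table above: parent commit 0d85b27b0c vs. 47ee3f1dd9.
rows = {
    "phoronix-test-suite.ior.2MB./opt/rootfs.mb_s": (3626, 2828),
    "phoronix-test-suite.ior.2MB.DefaultTestDirectory.mb_s": (3586, 2808),
    "perf-stat.i.node-stores": (292483, 245543),
}

for name, (base, new) in rows.items():
    # Print the recomputed change next to the metric name for comparison
    # with the %change column of the report.
    print(f"{pct_change(base, new):+6.1f}%  {name}")
```

Recomputing a few rows this way is a quick sanity check when copying numbers out of a report like this one; the reported values were presumably computed from unrounded means, so small differences in the last digit are possible for other rows.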


***************************************************************************************************
lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/process/16/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/pread3/will-it-scale

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      6.01            +0.7        6.71 ±  2%  mpstat.cpu.all.usr%
      0.22           -42.4%       0.13 ±  3%  turbostat.IPC
  13556369           +11.8%   15162679 ±  2%  will-it-scale.16.processes
    847272           +11.8%     947666 ±  2%  will-it-scale.per_process_ops
  13556369           +11.8%   15162679 ±  2%  will-it-scale.workload
      1.30 ±  2%     +75.8%       2.28 ±  3%  perf-stat.i.MPKI
 4.457e+09           -10.9%  3.969e+09 ±  2%  perf-stat.i.branch-instructions
      1.15            +0.3        1.41        perf-stat.i.branch-miss-rate%
  51357767            +9.3%   56148787        perf-stat.i.branch-misses
      1718            -0.8%       1704        perf-stat.i.context-switches
      1.27           +76.6%       2.25 ±  2%  perf-stat.i.cpi
      0.12            +0.1        0.27        perf-stat.i.dTLB-load-miss-rate%
  14889205           +10.7%   16475643 ±  2%  perf-stat.i.dTLB-load-misses
 1.248e+10           -50.3%  6.202e+09 ±  2%  perf-stat.i.dTLB-loads
      0.00            +0.0        0.00 ±  2%  perf-stat.i.dTLB-store-miss-rate%
 1.009e+10           -65.0%  3.534e+09 ±  2%  perf-stat.i.dTLB-stores
  14680606           +14.5%   16815386 ±  5%  perf-stat.i.iTLB-load-misses
  33983782 ±  5%     +14.8%   39004142 ±  4%  perf-stat.i.iTLB-loads
 3.708e+10           -43.3%  2.101e+10 ±  2%  perf-stat.i.instructions
      2528           -50.4%       1254 ±  3%  perf-stat.i.instructions-per-iTLB-miss
      0.78           -43.3%       0.44 ±  2%  perf-stat.i.ipc
    790.06 ±  3%      +5.7%     835.02 ±  2%  perf-stat.i.metric.K/sec
    259.91           -49.3%     131.77 ±  2%  perf-stat.i.metric.M/sec
      1.30 ±  2%     +75.3%       2.27 ±  3%  perf-stat.overall.MPKI
      1.15            +0.3        1.41        perf-stat.overall.branch-miss-rate%
      1.27           +76.4%       2.25 ±  2%  perf-stat.overall.cpi
      0.12            +0.1        0.26        perf-stat.overall.dTLB-load-miss-rate%
      0.00            +0.0        0.00 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
      2525           -50.4%       1251 ±  3%  perf-stat.overall.instructions-per-iTLB-miss
      0.78           -43.3%       0.44 ±  2%  perf-stat.overall.ipc
    822348           -49.3%     416945        perf-stat.overall.path-length
 4.442e+09           -10.9%  3.956e+09 ±  2%  perf-stat.ps.branch-instructions
  51175390            +9.3%   55945707        perf-stat.ps.branch-misses
  14839395           +10.7%   16420381 ±  2%  perf-stat.ps.dTLB-load-misses
 1.244e+10           -50.3%  6.181e+09 ±  2%  perf-stat.ps.dTLB-loads
 1.006e+10           -65.0%  3.522e+09 ±  2%  perf-stat.ps.dTLB-stores
  14630964           +14.5%   16757456 ±  5%  perf-stat.ps.iTLB-load-misses
  33871661 ±  5%     +14.8%   38879673 ±  4%  perf-stat.ps.iTLB-loads
 3.695e+10           -43.3%  2.094e+10 ±  2%  perf-stat.ps.instructions
 1.115e+13           -43.3%  6.322e+12 ±  2%  perf-stat.total.instructions
     11.76 ±  4%      -7.2        4.59 ± 36%  perf-profile.calltrace.cycles-pp.rep_movs_alternative.copyout._copy_to_iter.copy_page_to_iter.shmem_file_read_iter
     12.22 ±  4%      -7.1        5.13 ± 31%  perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.vfs_read
     12.65 ±  4%      -7.0        5.61 ± 28%  perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.vfs_read.__x64_sys_pread64
     12.94 ±  4%      -7.0        5.94 ± 26%  perf-profile.calltrace.cycles-pp.copy_page_to_iter.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64
     20.53 ±  3%      -6.3       14.28 ±  9%  perf-profile.calltrace.cycles-pp.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
     25.48 ±  4%      -5.7       19.80 ±  6%  perf-profile.calltrace.cycles-pp.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
     27.64 ±  3%      -5.5       22.16 ±  5%  perf-profile.calltrace.cycles-pp.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
     41.36 ±  4%      -4.0       37.37 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
      1.09 ±  3%      +0.1        1.22 ±  4%  perf-profile.calltrace.cycles-pp.__fget_light.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
      1.60 ±  2%      +0.1        1.74 ±  3%  perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.__x64_sys_pread64
      0.95 ±  3%      +0.2        1.11 ± 10%  perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.62 ±  4%      +0.2        1.78 ±  3%  perf-profile.calltrace.cycles-pp.touch_atime.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64
      1.97 ±  4%      +0.2        2.18 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__libc_pread
      0.35 ± 70%      +0.2        0.58 ±  4%  perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
      2.85 ±  2%      +0.3        3.18 ±  3%  perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64
      6.65 ±  4%      +0.7        7.35 ±  4%  perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_pread
     12.30 ±  5%      +1.3       13.62 ±  4%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
     13.67 ±  5%      +1.5       15.13 ±  4%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_pread
     11.77 ±  4%      -7.1        4.63 ± 35%  perf-profile.children.cycles-pp.rep_movs_alternative
     12.47 ±  4%      -7.0        5.42 ± 29%  perf-profile.children.cycles-pp.copyout
     12.67 ±  4%      -7.0        5.64 ± 28%  perf-profile.children.cycles-pp._copy_to_iter
     12.96 ±  4%      -7.0        5.96 ± 26%  perf-profile.children.cycles-pp.copy_page_to_iter
     20.61 ±  3%      -6.2       14.38 ±  9%  perf-profile.children.cycles-pp.shmem_file_read_iter
     25.58 ±  4%      -5.7       19.90 ±  5%  perf-profile.children.cycles-pp.vfs_read
     27.65 ±  3%      -5.5       22.18 ±  5%  perf-profile.children.cycles-pp.__x64_sys_pread64
     41.52 ±  4%      -4.0       37.53 ±  2%  perf-profile.children.cycles-pp.do_syscall_64
      0.54 ±  5%      +0.1        0.61 ±  4%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      1.01 ±  4%      +0.1        1.12 ±  3%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.09 ±  4%      +0.1        1.22 ±  4%  perf-profile.children.cycles-pp.__fget_light
      1.62 ±  2%      +0.1        1.76 ±  3%  perf-profile.children.cycles-pp.filemap_get_entry
      0.95 ±  3%      +0.2        1.12 ± 10%  perf-profile.children.cycles-pp.__fsnotify_parent
      1.64 ±  4%      +0.2        1.82 ±  3%  perf-profile.children.cycles-pp.touch_atime
      2.89 ±  2%      +0.3        3.23 ±  3%  perf-profile.children.cycles-pp.shmem_get_folio_gfp
      6.55 ±  4%      +0.7        7.24 ±  4%  perf-profile.children.cycles-pp.__entry_text_start
     12.44 ±  5%      +1.3       13.76 ±  4%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
     13.81 ±  5%      +1.5       15.28 ±  4%  perf-profile.children.cycles-pp.syscall_return_via_sysret
     11.60 ±  4%      -7.1        4.46 ± 37%  perf-profile.self.cycles-pp.rep_movs_alternative
      0.48 ±  5%      +0.1        0.55 ±  4%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      0.45 ±  7%      +0.1        0.52 ±  2%  perf-profile.self.cycles-pp.current_time
      0.26 ±  5%      +0.1        0.34 ± 11%  perf-profile.self.cycles-pp.xas_load
      0.87 ±  4%      +0.1        0.97 ±  3%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.11 ±  4%      +0.1        1.21 ±  3%  perf-profile.self.cycles-pp.__libc_pread
      0.82 ±  4%      +0.1        0.92 ±  5%  perf-profile.self.cycles-pp.copyout
      1.09 ±  3%      +0.1        1.21 ±  4%  perf-profile.self.cycles-pp.__fget_light
      1.13 ±  3%      +0.2        1.29 ±  4%  perf-profile.self.cycles-pp.shmem_get_folio_gfp
      0.91 ±  4%      +0.2        1.07 ± 10%  perf-profile.self.cycles-pp.__fsnotify_parent
      5.71 ±  4%      +0.6        6.31 ±  4%  perf-profile.self.cycles-pp.__entry_text_start
      7.81 ±  5%      +0.9        8.68 ±  5%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
     11.95 ±  5%      +1.3       13.21 ±  4%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
     13.79 ±  5%      +1.5       15.26 ±  4%  perf-profile.self.cycles-pp.syscall_return_via_sysret
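
In the pread3 table above, throughput rises by 11.8% while retired instructions fall by 43.3% and IPC drops. One way to read these together is through perf-stat.overall.path-length, which looks like instructions retired per benchmark operation. The sketch below checks that reading against the reported numbers; treating path-length as total instructions divided by will-it-scale.workload is an inference from the data here, not a documented definition, and the function name is hypothetical:

```python
def path_length(total_instructions: float, workload_ops: float) -> float:
    """Instructions retired per benchmark operation (assumed definition)."""
    return total_instructions / workload_ops


# perf-stat.total.instructions and will-it-scale.workload from the table above.
base = path_length(1.115e13, 13556369)   # parent commit 0d85b27b0c
new = path_length(6.322e12, 15162679)    # 47ee3f1dd9

change = (new - base) / base * 100.0

print(f"base path-length ~ {base:,.0f} (report: 822348)")
print(f"new  path-length ~ {new:,.0f} (report: 416945)")
print(f"change           ~ {change:+.1f}% (report: -49.3%)")
```

If that reading is right, each pread() needs roughly half as many instructions after the commit, which would fit a copy done by a single long-running string instruction rather than a loop of many short ones. Under that assumption the lower IPC reflects fewer, longer instructions and is not by itself a sign of a slowdown.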



***************************************************************************************************
lkp-hsw-d03: 8 threads 1 sockets Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Haswell) with 16G memory
=========================================================================================
cluster/compiler/cpufreq_governor/kconfig/nr_threads/protocol/rootfs/runtime/tbox_group/testcase:
  cs-localhost/gcc-11/performance/x86_64-rhel-8.3/40%/tcp6/debian-11.1-x86_64-20220510.cgz/300s/lkp-hsw-d03/nepim

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     40382            +5.8%      42719        vmstat.system.cs
     29.19            +1.1%      29.51        turbostat.CorWatt
      0.28           -45.7%       0.15        turbostat.IPC
  51340235            +9.5%   56202298        proc-vmstat.numa_hit
  51317600            +9.5%   56198357        proc-vmstat.numa_local
 4.088e+08            +9.5%  4.478e+08        proc-vmstat.pgalloc_normal
 4.088e+08            +9.5%  4.478e+08        proc-vmstat.pgfree
   7501256            +9.5%    8214645        nepim.tcp.avg.kbps_in
   7501474            +9.5%    8214867        nepim.tcp.avg.kbps_out
     28758            +9.4%      31464        nepim.tcp.avg.rcv_s
     28615            +9.5%      31337        nepim.tcp.avg.snd_s
      1895 ± 21%     +21.9%       2309 ± 13%  nepim.time.involuntary_context_switches
     14.45 ±  5%      -2.3       12.20 ±  5%  perf-profile.calltrace.cycles-pp.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter.vfs_write
     14.05 ±  4%      -2.2       11.86 ±  5%  perf-profile.calltrace.cycles-pp._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter
     13.88 ±  5%      -2.1       11.75 ±  5%  perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg
     13.84 ±  4%      -2.1       11.71 ±  5%  perf-profile.calltrace.cycles-pp.rep_movs_alternative.copyin._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked
     34.14 ±  3%      -3.4       30.78 ±  3%  perf-profile.children.cycles-pp.rep_movs_alternative
     14.46 ±  5%      -2.3       12.20 ±  5%  perf-profile.children.cycles-pp.skb_do_copy_data_nocache
     14.06 ±  4%      -2.2       11.86 ±  5%  perf-profile.children.cycles-pp._copy_from_iter
     13.92 ±  4%      -2.2       11.76 ±  5%  perf-profile.children.cycles-pp.copyin
      0.07 ± 21%      +0.0        0.11 ± 17%  perf-profile.children.cycles-pp._raw_spin_lock_irq
      0.03 ±127%      +0.1        0.09 ± 36%  perf-profile.children.cycles-pp.rcu_all_qs
      0.31 ±  6%      +0.1        0.38 ±  8%  perf-profile.children.cycles-pp.aa_sk_perm
     33.97 ±  3%      -3.5       30.51 ±  4%  perf-profile.self.cycles-pp.rep_movs_alternative
      0.15 ± 12%      -0.0        0.10 ± 22%  perf-profile.self.cycles-pp._copy_from_iter
      0.10 ± 13%      +0.0        0.14 ± 18%  perf-profile.self.cycles-pp.process_backlog
      0.07 ± 18%      +0.0        0.11 ± 16%  perf-profile.self.cycles-pp._raw_spin_lock_irq
      0.03 ±124%      +0.1        0.09 ± 20%  perf-profile.self.cycles-pp.__release_sock
      0.32 ± 11%      +0.1        0.38 ± 10%  perf-profile.self.cycles-pp._raw_spin_lock_bh
      0.28 ±  6%      +0.1        0.35 ±  9%  perf-profile.self.cycles-pp.aa_sk_perm
      0.77 ± 12%      +0.4        1.19 ±  8%  perf-profile.self.cycles-pp.tcp_sendmsg_locked
     25.41           +57.6%      40.05        perf-stat.i.MPKI
 8.761e+08           -13.6%  7.568e+08        perf-stat.i.branch-instructions
      1.71            +0.3        2.03        perf-stat.i.branch-miss-rate%
     13.05 ±  2%      +3.7       16.74        perf-stat.i.cache-miss-rate%
  23909839            +5.5%   25223669 ±  2%  perf-stat.i.cache-misses
 1.834e+08           -17.8%  1.507e+08        perf-stat.i.cache-references
     40658            +5.9%      43055        perf-stat.i.context-switches
      1.02           +91.3%       1.95        perf-stat.i.cpi
    315.87            -5.2%     299.30        perf-stat.i.cycles-between-cache-misses
      0.10 ±  3%      +0.1        0.17        perf-stat.i.dTLB-load-miss-rate%
 2.492e+09 ±  4%     -39.8%    1.5e+09        perf-stat.i.dTLB-loads
      0.04            +0.0        0.07        perf-stat.i.dTLB-store-miss-rate%
    717270            +9.2%     783320        perf-stat.i.dTLB-store-misses
 2.022e+09           -47.7%  1.057e+09        perf-stat.i.dTLB-stores
    897381            +7.0%     959782        perf-stat.i.iTLB-loads
 7.305e+09           -46.8%  3.888e+09        perf-stat.i.instructions
      0.98           -47.3%       0.52        perf-stat.i.ipc
    806.16 ±  2%     -58.0%     338.57 ±  2%  perf-stat.i.metric.K/sec
    699.04 ±  2%     -37.6%     435.98        perf-stat.i.metric.M/sec
  18383434           +28.2%   23575415        perf-stat.i.node-loads
   4791695 ±  3%     -81.1%     904896 ±  6%  perf-stat.i.node-stores
     25.11           +54.4%      38.77        perf-stat.overall.MPKI
      1.83            +0.3        2.17        perf-stat.overall.branch-miss-rate%
     13.04 ±  2%      +3.7       16.74        perf-stat.overall.cache-miss-rate%
      1.02           +88.1%       1.91        perf-stat.overall.cpi
    310.68            -5.1%     294.84        perf-stat.overall.cycles-between-cache-misses
      0.10 ±  3%      +0.1        0.17        perf-stat.overall.dTLB-load-miss-rate%
      0.04            +0.0        0.07        perf-stat.overall.dTLB-store-miss-rate%
      0.98           -46.8%       0.52        perf-stat.overall.ipc
 8.732e+08           -13.6%  7.543e+08        perf-stat.ps.branch-instructions
  23830434            +5.5%   25139918 ±  2%  perf-stat.ps.cache-misses
 1.828e+08           -17.8%  1.502e+08        perf-stat.ps.cache-references
     40523            +5.9%      42913        perf-stat.ps.context-switches
 2.484e+09 ±  4%     -39.8%  1.495e+09        perf-stat.ps.dTLB-loads
    714887            +9.2%     780718        perf-stat.ps.dTLB-store-misses
 2.016e+09           -47.7%  1.053e+09        perf-stat.ps.dTLB-stores
    894402            +7.0%     956596        perf-stat.ps.iTLB-loads
 7.281e+09           -46.8%  3.875e+09        perf-stat.ps.instructions
  18322375           +28.2%   23497100        perf-stat.ps.node-loads
   4775792 ±  3%     -81.1%     901896 ±  6%  perf-stat.ps.node-stores
 2.194e+12           -46.8%  1.167e+12        perf-stat.total.instructions
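
The derived perf-stat.overall rows in the nepim table above can be cross-checked against the raw perf-stat.ps counters. The sketch below is an inference from the reported figures, not something taken from the lkp-tests sources: it assumes MPKI here is cache-references per thousand instructions and that cache-miss-rate% is cache-misses over cache-references. The helper names are hypothetical.

```python
# Figures copied from the perf-stat.ps.* rows of the nepim table above
# (base 0d85b27b0c vs. patched 47ee3f1dd9). The formulas are an assumption
# inferred from those figures, not taken from lkp-tests documentation.

def per_kilo_instruction(events: float, instructions: float) -> float:
    """Events per 1000 retired instructions."""
    return events / instructions * 1000.0

def percent_of(part: float, whole: float) -> float:
    """`part` as a percentage of `whole`."""
    return part / whole * 100.0

base = {"cache_refs": 1.828e8, "cache_misses": 23830434, "instructions": 7.281e9}
patched = {"cache_refs": 1.502e8, "cache_misses": 25139918, "instructions": 3.875e9}

mpki_base = per_kilo_instruction(base["cache_refs"], base["instructions"])
mpki_patched = per_kilo_instruction(patched["cache_refs"], patched["instructions"])
miss_rate_base = percent_of(base["cache_misses"], base["cache_refs"])
miss_rate_patched = percent_of(patched["cache_misses"], patched["cache_refs"])

print(f"MPKI:             {mpki_base:.2f} -> {mpki_patched:.2f}")
print(f"cache-miss-rate%: {miss_rate_base:.2f} -> {miss_rate_patched:.2f}")
```

Under these assumptions the results land within rounding of the reported 25.11 -> 38.77 and 13.04 -> 16.74. Most of the MPKI increase comes from the instruction count dropping by about 47%, not from extra cache traffic.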



***************************************************************************************************
lkp-cfl-e1: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/IMB-MPI1 Exchange/debian-x86_64-phoronix/lkp-cfl-e1/intel-mpi-1.0.1/phoronix-test-suite

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    678.50 ± 17%     -45.7%     368.50 ± 62%  turbostat.C10
      9064           +21.6%      11018        phoronix-test-suite.intel-mpi.IMB-MPI1Exchange.average_mbytes_sec
    177.33            -7.6%     163.84        phoronix-test-suite.intel-mpi.IMB-MPI1Exchange.average_usec
      8.86 ± 97%      -7.8        1.00 ±142%  perf-profile.calltrace.cycles-pp.begin_new_exec.load_elf_binary.search_binary_handler.exec_binprm.bprm_execve
      6.77 ± 79%      -6.2        0.56 ±223%  perf-profile.calltrace.cycles-pp.__mmput.exec_mmap.begin_new_exec.load_elf_binary.search_binary_handler
      6.77 ± 79%      -6.2        0.56 ±223%  perf-profile.calltrace.cycles-pp.exec_mmap.begin_new_exec.load_elf_binary.search_binary_handler.exec_binprm
      7.08 ± 81%      -4.1        2.96 ±158%  perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap.__mmput
      7.08 ± 81%      -4.1        2.96 ±158%  perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap
      5.80 ± 77%      -2.8        2.96 ±158%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.__mmput.exit_mm
      6.14 ± 66%      -2.3        3.84 ±142%  perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.__mmput.exit_mm.do_exit
      8.86 ± 97%      -7.8        1.00 ±142%  perf-profile.children.cycles-pp.begin_new_exec
      6.77 ± 79%      -6.2        0.56 ±223%  perf-profile.children.cycles-pp.exec_mmap
      9.06 ± 80%      -5.2        3.84 ±142%  perf-profile.children.cycles-pp.unmap_vmas
      7.08 ± 81%      -4.1        2.96 ±158%  perf-profile.children.cycles-pp.unmap_page_range
      7.08 ± 81%      -4.1        2.96 ±158%  perf-profile.children.cycles-pp.zap_pmd_range
      7.08 ± 81%      -4.1        2.96 ±158%  perf-profile.children.cycles-pp.zap_pte_range
      4.10 ± 62%      -2.5        1.59 ±157%  perf-profile.children.cycles-pp.release_pages
      3.63 ± 85%      -1.0        2.62 ±149%  perf-profile.children.cycles-pp.tlb_finish_mmu
      3.63 ± 85%      -1.0        2.62 ±149%  perf-profile.children.cycles-pp.tlb_batch_pages_flush
   8776475            -7.0%    8158830        perf-stat.i.cache-misses
 2.749e+08            -3.7%  2.647e+08        perf-stat.i.cache-references
      1.55            +5.2%       1.63 ±  2%  perf-stat.i.cpi
  9.08e+09            -1.5%  8.944e+09        perf-stat.i.dTLB-loads
      0.03 ±  3%      +0.0        0.04 ±  4%  perf-stat.i.dTLB-store-miss-rate%
 4.758e+09            -2.5%   4.64e+09        perf-stat.i.dTLB-stores
    216085            -2.3%     211119        perf-stat.i.iTLB-loads
 2.562e+10            -1.5%  2.524e+10        perf-stat.i.instructions
      1.13            -2.8%       1.10        perf-stat.i.ipc
    983.28            -2.5%     958.80        perf-stat.i.metric.M/sec
    750185          +203.4%    2276128        perf-stat.i.node-loads
   1661980           +28.8%    2141116        perf-stat.i.node-stores
     10.73            -2.3%      10.49        perf-stat.overall.MPKI
      3.20            -0.1        3.08        perf-stat.overall.cache-miss-rate%
      2088            +6.7%       2227        perf-stat.overall.cycles-between-cache-misses
      0.00 ±  9%      -0.0        0.00 ± 10%  perf-stat.overall.node-load-miss-rate%
   8684309            -7.1%    8069768        perf-stat.ps.cache-misses
 2.718e+08            -3.7%  2.617e+08        perf-stat.ps.cache-references
 8.975e+09            -1.5%   8.84e+09        perf-stat.ps.dTLB-loads
 4.703e+09            -2.5%  4.587e+09        perf-stat.ps.dTLB-stores
    213618            -2.3%     208713        perf-stat.ps.iTLB-loads
 2.532e+10            -1.5%  2.495e+10        perf-stat.ps.instructions
    742356          +203.3%    2251351        perf-stat.ps.node-loads
   1645341           +28.7%    2117834        perf-stat.ps.node-stores
 2.261e+12            -1.3%  2.232e+12        perf-stat.total.instructions
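
For reading the middle column of these tables: rows shown with a trailing "%" give the relative change between the two commits, while rows whose metric is itself a percentage (perf-profile.*, *-miss-rate%) give the absolute difference in percentage points. The sketch below recomputes two rows from the IMB-MPI1 Exchange table above. It works from the rounded values as printed, so small deviations from the robot's own figures are possible, and the function names are hypothetical.

```python
# Recompute the middle ("%change") column from the two printed values.
# Inputs are the rounded figures shown in the table above, so results can
# differ from the reported change in the last digit.

def relative_change_pct(base: float, patched: float) -> float:
    """Relative change in percent, as used for plain counters and scores."""
    return (patched - base) / base * 100.0

def absolute_change(base: float, patched: float) -> float:
    """Difference in percentage points, as used for metrics that are
    already percentages (perf-profile.*, *-miss-rate%)."""
    return patched - base

# phoronix-test-suite.intel-mpi.IMB-MPI1Exchange.average_mbytes_sec
mbytes_change = relative_change_pct(9064, 11018)

# perf-stat.overall.cache-miss-rate%
miss_rate_change = absolute_change(3.20, 3.08)

print(f"average_mbytes_sec: {mbytes_change:+.1f}%")
print(f"cache-miss-rate%:   {miss_rate_change:+.1f}")
```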



***************************************************************************************************
lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/readseek1/will-it-scale

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   5492754 ± 14%     -17.6%    4524182 ± 14%  sched_debug.cfs_rq:/.spread0.max
      0.16           -43.8%       0.09        turbostat.IPC
     14210 ±  2%      -9.0%      12931        turbostat.POLL
  26574376            +4.6%   27809208        will-it-scale.52.processes
    511045            +4.6%     534792        will-it-scale.per_process_ops
  26574376            +4.6%   27809208        will-it-scale.workload
      0.06           +79.4%       0.11 ± 12%  perf-stat.i.MPKI
 1.024e+10           -13.3%  8.877e+09        perf-stat.i.branch-instructions
      1.34            +0.3        1.62        perf-stat.i.branch-miss-rate%
 1.374e+08            +4.5%  1.436e+08        perf-stat.i.branch-misses
      1.80           +71.0%       3.07        perf-stat.i.cpi
      0.20            +0.2        0.40        perf-stat.i.dTLB-load-miss-rate%
  53209330            +4.6%   55668630        perf-stat.i.dTLB-load-misses
 2.683e+10           -48.3%  1.388e+10        perf-stat.i.dTLB-loads
      0.00            +0.0        0.00        perf-stat.i.dTLB-store-miss-rate%
     37320            +2.7%      38319        perf-stat.i.dTLB-store-misses
 2.138e+10           -61.8%  8.175e+09        perf-stat.i.dTLB-stores
  58111654            +3.6%   60212884        perf-stat.i.iTLB-load-misses
   8.1e+10           -41.5%  4.738e+10        perf-stat.i.instructions
      1395           -43.6%     787.58        perf-stat.i.instructions-per-iTLB-miss
      0.56           -41.5%       0.33        perf-stat.i.ipc
    563.05           -47.0%     298.51        perf-stat.i.metric.M/sec
    178317            +3.2%     183996 ±  2%  perf-stat.i.node-load-misses
      0.06           +79.8%       0.11 ± 11%  perf-stat.overall.MPKI
      1.34            +0.3        1.62        perf-stat.overall.branch-miss-rate%
      1.80           +71.0%       3.07        perf-stat.overall.cpi
      0.20            +0.2        0.40        perf-stat.overall.dTLB-load-miss-rate%
      0.00            +0.0        0.00        perf-stat.overall.dTLB-store-miss-rate%
      1394           -43.6%     786.84        perf-stat.overall.instructions-per-iTLB-miss
      0.56           -41.5%       0.33        perf-stat.overall.ipc
    916671           -44.1%     512426        perf-stat.overall.path-length
  1.02e+10           -13.3%  8.847e+09        perf-stat.ps.branch-instructions
  1.37e+08            +4.5%  1.431e+08        perf-stat.ps.branch-misses
  53029639            +4.6%   55478925        perf-stat.ps.dTLB-load-misses
 2.674e+10           -48.3%  1.383e+10        perf-stat.ps.dTLB-loads
     37232            +2.6%      38204        perf-stat.ps.dTLB-store-misses
 2.131e+10           -61.8%  8.147e+09        perf-stat.ps.dTLB-stores
  57916551            +3.6%   60006718        perf-stat.ps.iTLB-load-misses
 8.073e+10           -41.5%  4.721e+10        perf-stat.ps.instructions
    177742            +3.2%     183382 ±  2%  perf-stat.ps.node-load-misses
 2.436e+13           -41.5%  1.425e+13        perf-stat.total.instructions
      7.47            -3.1        4.40        perf-profile.calltrace.cycles-pp.rep_movs_alternative.copyout._copy_to_iter.copy_page_to_iter.shmem_file_read_iter
      7.73            -3.0        4.68        perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.vfs_read
      8.07            -3.0        5.03        perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.vfs_read.ksys_read
      8.25            -3.0        5.22        perf-profile.calltrace.cycles-pp.copy_page_to_iter.shmem_file_read_iter.vfs_read.ksys_read.do_syscall_64
     12.67            -2.8        9.90        perf-profile.calltrace.cycles-pp.shmem_file_read_iter.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
     15.82            -2.6       13.24        perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
     16.94            -2.5       14.43        perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
     26.01            -2.1       23.93        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
     30.63            -1.8       28.83        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
     45.06            -1.1       43.92        perf-profile.calltrace.cycles-pp.read
      0.55 ±  2%      +0.0        0.59 ±  2%  perf-profile.calltrace.cycles-pp.shmem_file_llseek.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      0.67 ±  2%      +0.0        0.71        perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.ksys_read
      0.59 ±  2%      +0.0        0.63 ±  2%  perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.67 ±  3%      +0.0        0.72        perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.76            +0.0        0.81        perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.shmem_file_read_iter.vfs_read.ksys_read
      0.82 ±  3%      +0.1        0.88        perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
      0.97 ±  2%      +0.1        1.03        perf-profile.calltrace.cycles-pp.touch_atime.shmem_file_read_iter.vfs_read.ksys_read.do_syscall_64
      1.60            +0.1        1.70        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.ksys_read.do_syscall_64
      1.62            +0.1        1.72        perf-profile.calltrace.cycles-pp.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      0.78            +0.1        0.90 ± 22%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.read
      0.76 ±  2%      +0.1        0.90 ± 23%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.llseek
      4.14            +0.2        4.33        perf-profile.calltrace.cycles-pp.__entry_text_start.read
      4.08            +0.3        4.37        perf-profile.calltrace.cycles-pp.__entry_text_start.llseek
      7.80            +0.4        8.15        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
      8.57            +0.4        8.97        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.read
      7.60            +0.5        8.05        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
     10.43            +0.6       11.04        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      8.37            +0.6        9.00        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.llseek
     15.05            +0.8       15.85        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.llseek
     28.78            +2.2       30.95        perf-profile.calltrace.cycles-pp.llseek
      7.47            -3.1        4.42        perf-profile.children.cycles-pp.rep_movs_alternative
      8.08            -3.0        5.05        perf-profile.children.cycles-pp._copy_to_iter
      7.90            -3.0        4.88        perf-profile.children.cycles-pp.copyout
      8.26            -3.0        5.24        perf-profile.children.cycles-pp.copy_page_to_iter
     12.73            -2.8        9.96        perf-profile.children.cycles-pp.shmem_file_read_iter
     15.89            -2.6       13.31        perf-profile.children.cycles-pp.vfs_read
     16.99            -2.5       14.49        perf-profile.children.cycles-pp.ksys_read
     36.59            -1.5       35.12        perf-profile.children.cycles-pp.do_syscall_64
     45.17            -1.1       44.06        perf-profile.children.cycles-pp.read
     46.04            -1.0       45.07        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.10 ±  4%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.__cond_resched
      0.20 ±  3%      +0.0        0.22        perf-profile.children.cycles-pp.folio_test_hugetlb
      0.32            +0.0        0.34        perf-profile.children.cycles-pp.__x64_sys_read
      0.31            +0.0        0.33 ±  2%  perf-profile.children.cycles-pp.__x64_sys_lseek
      0.26 ±  3%      +0.0        0.29 ±  3%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.35 ±  4%      +0.0        0.39        perf-profile.children.cycles-pp.current_time
      0.68            +0.0        0.72 ±  2%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.55 ±  2%      +0.0        0.59 ±  2%  perf-profile.children.cycles-pp.shmem_file_llseek
      0.59            +0.0        0.63 ±  2%  perf-profile.children.cycles-pp.__fsnotify_parent
      0.79            +0.0        0.84        perf-profile.children.cycles-pp.atime_needs_update
      0.68 ±  2%      +0.0        0.73        perf-profile.children.cycles-pp.filemap_get_entry
      1.26            +0.1        1.32        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.99            +0.1        1.05        perf-profile.children.cycles-pp.touch_atime
      1.37 ±  3%      +0.1        1.44        perf-profile.children.cycles-pp.__fget_light
      1.66 ±  3%      +0.1        1.76        perf-profile.children.cycles-pp.__fdget_pos
      1.64            +0.1        1.73        perf-profile.children.cycles-pp.ksys_lseek
      1.54            +0.1        1.63        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      1.62            +0.1        1.72        perf-profile.children.cycles-pp.shmem_get_folio_gfp
      8.09            +0.5        8.58        perf-profile.children.cycles-pp.__entry_text_start
     15.54            +0.8       16.34        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
     17.10            +1.0       18.13        perf-profile.children.cycles-pp.syscall_return_via_sysret
     29.24            +1.8       31.07        perf-profile.children.cycles-pp.llseek
      7.34            -3.0        4.29        perf-profile.self.cycles-pp.rep_movs_alternative
      0.21            +0.0        0.23 ±  3%  perf-profile.self.cycles-pp.touch_atime
      0.30            +0.0        0.32        perf-profile.self.cycles-pp.__x64_sys_lseek
      0.66            +0.0        0.68        perf-profile.self.cycles-pp.llseek
      0.20 ±  3%      +0.0        0.22        perf-profile.self.cycles-pp.folio_test_hugetlb
      0.30 ±  3%      +0.0        0.33 ±  3%  perf-profile.self.cycles-pp.__fdget_pos
      0.17 ±  4%      +0.0        0.20 ±  5%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      0.57 ±  2%      +0.0        0.60        perf-profile.self.cycles-pp.filemap_get_entry
      0.60            +0.0        0.63 ±  2%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      0.84            +0.0        0.88        perf-profile.self.cycles-pp.do_syscall_64
      0.54 ±  4%      +0.0        0.57        perf-profile.self.cycles-pp.copyout
      0.56            +0.0        0.60 ±  2%  perf-profile.self.cycles-pp.__fsnotify_parent
      0.83 ±  2%      +0.0        0.86        perf-profile.self.cycles-pp.shmem_get_folio_gfp
      0.53 ±  2%      +0.0        0.57 ±  2%  perf-profile.self.cycles-pp.shmem_file_llseek
      0.62            +0.0        0.66        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.71            +0.1        0.76 ±  2%  perf-profile.self.cycles-pp.read
      1.10            +0.1        1.15        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.29            +0.1        1.35 ±  2%  perf-profile.self.cycles-pp.shmem_file_read_iter
      1.32 ±  3%      +0.1        1.39        perf-profile.self.cycles-pp.__fget_light
      1.05            +0.1        1.14        perf-profile.self.cycles-pp.vfs_read
      7.04            +0.4        7.47        perf-profile.self.cycles-pp.__entry_text_start
      9.74            +0.5       10.26        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
     14.94            +0.8       15.70        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
     17.08            +1.0       18.10        perf-profile.self.cycles-pp.syscall_return_via_sysret
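
The will-it-scale figures above show throughput rising by 4.6% while IPC falls by 41.5%. The sketch below shows how the two fit together, assuming that perf-stat.overall.path-length is total instructions divided by will-it-scale.workload and that ipc is the reciprocal of cpi. Both relations are inferred from the reported values and are not confirmed against the lkp-tests sources.

```python
# Values copied from the will-it-scale readseek1 table above
# (base 0d85b27b0c vs. patched 47ee3f1dd9). The path-length formula is an
# assumption that happens to reproduce the reported figures.

def path_length(total_instructions: float, workload_ops: float) -> float:
    """Average retired instructions per benchmark operation."""
    return total_instructions / workload_ops

base_path = path_length(2.436e13, 26574376)     # reported: 916671
patched_path = path_length(1.425e13, 27809208)  # reported: 512426
path_change_pct = (patched_path - base_path) / base_path * 100.0

# ipc as the reciprocal of perf-stat.overall.cpi
base_ipc = 1 / 1.80
patched_ipc = 1 / 3.07

print(f"instructions per op: {base_path:.0f} -> {patched_path:.0f} "
      f"({path_change_pct:+.1f}%)")
print(f"ipc: {base_ipc:.2f} -> {patched_ipc:.2f}")
```

Read this way, the lower IPC is not a slowdown in itself. Each operation retires roughly 44% fewer instructions, which is what one would expect if a long copy loop is replaced by a single "rep movs", and each of the remaining instructions takes more cycles on average.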



***************************************************************************************************
lkp-cfl-e1: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/option_c/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/Write/64MB/8/debian-x86_64-phoronix/lkp-cfl-e1/tiobench-1.3.1/phoronix-test-suite

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     12535 ±  2%     +43.5%      17982 ±  2%  phoronix-test-suite.tiobench.Write.64MB.8.mb_s
   2132097 ± 11%     -26.0%    1578355 ±  9%  perf-stat.i.node-stores
    313.06 ±  7%     +19.9%     375.47 ±  2%  perf-stat.overall.cycles-between-cache-misses
   2086550 ± 12%     -26.6%    1531685 ± 11%  perf-stat.ps.node-stores
      3.75 ±102%      -3.7        0.00        perf-profile.calltrace.cycles-pp.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      5.83 ± 78%      -3.2        2.68 ±171%  perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.__mmput.exit_mm.do_exit
      5.83 ± 78%      -3.2        2.68 ±171%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.__mmput.exit_mm
      8.58 ±107%      -8.0        0.60 ±223%  perf-profile.children.cycles-pp.tlb_finish_mmu
      8.58 ±107%      -8.0        0.60 ±223%  perf-profile.children.cycles-pp.tlb_batch_pages_flush
      8.58 ±107%      -6.5        2.11 ±160%  perf-profile.children.cycles-pp.release_pages
      3.75 ±102%      -3.7        0.00        perf-profile.children.cycles-pp.do_cow_fault



***************************************************************************************************
lkp-cfl-e1: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/option_c/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/Random Write/32MB/4/debian-x86_64-phoronix/lkp-cfl-e1/tiobench-1.3.1/phoronix-test-suite

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1.70 ±  2%      -0.3        1.39 ±  3%  mpstat.cpu.all.sys%
     32.33 ±  4%     -18.6%      26.33 ±  4%  phoronix-test-suite.time.percent_of_cpu_this_job_got
    107378           +49.5%     160566 ±  2%  phoronix-test-suite.tiobench.RandomWrite.32MB.4.mb_s
  10052508 ±  3%     -16.7%    8372597 ±  4%  perf-stat.i.cache-misses
 4.595e+09 ±  2%      -6.3%  4.304e+09 ±  2%  perf-stat.i.cpu-cycles
 3.647e+08 ±  2%      -5.6%  3.444e+08 ±  2%  perf-stat.i.dTLB-stores
      0.29 ±  2%      -6.3%       0.27 ±  2%  perf-stat.i.metric.GHz
      0.00 ± 22%      +0.0        0.01 ± 11%  perf-stat.i.node-load-miss-rate%
      5.67 ± 19%    +282.5%      21.70 ±  7%  perf-stat.i.node-load-misses
      6.24 ± 31%    +232.9%      20.76 ±  9%  perf-stat.i.node-store-misses
   3545056 ±  3%     -20.5%    2818290 ±  7%  perf-stat.i.node-stores
    457.85 ±  5%     +12.5%     514.94 ±  3%  perf-stat.overall.cycles-between-cache-misses
      0.00 ± 18%      +0.0        0.01 ± 11%  perf-stat.overall.node-load-miss-rate%
      0.00 ± 29%      +0.0        0.00 ± 12%  perf-stat.overall.node-store-miss-rate%
   9684141 ±  3%     -16.7%    8068153 ±  4%  perf-stat.ps.cache-misses
 4.427e+09 ±  2%      -6.3%   4.15e+09        perf-stat.ps.cpu-cycles
 3.514e+08 ±  3%      -5.5%   3.32e+08 ±  2%  perf-stat.ps.dTLB-stores
      5.47 ± 19%    +282.5%      20.91 ±  7%  perf-stat.ps.node-load-misses
      6.00 ± 31%    +233.3%      20.01 ±  9%  perf-stat.ps.node-store-misses
   3414551 ±  3%     -20.4%    2718058 ±  8%  perf-stat.ps.node-stores
      8.14 ± 97%      -8.1        0.00        perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      7.55 ± 83%      -7.5        0.00        perf-profile.calltrace.cycles-pp.free_swap_cache.free_pages_and_swap_cache.tlb_batch_pages_flush.tlb_finish_mmu.exit_mmap
      6.48 ±137%      -6.5        0.00        perf-profile.calltrace.cycles-pp.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      9.33 ± 62%      -6.3        3.02 ±173%  perf-profile.calltrace.cycles-pp.tlb_finish_mmu.exit_mmap.__mmput.exit_mm.do_exit
      9.33 ± 62%      -6.3        3.02 ±173%  perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.tlb_finish_mmu.exit_mmap.__mmput.exit_mm
      6.82 ± 73%      -4.3        2.49 ±164%  perf-profile.calltrace.cycles-pp.load_elf_binary.search_binary_handler.exec_binprm.bprm_execve.do_execveat_common
      8.14 ± 97%      -2.7        5.45 ±168%  perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      5.48 ±114%      -2.5        3.02 ±173%  perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.exit_mmap.__mmput
      9.34 ± 91%      -2.3        7.04 ±141%  perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      5.44 ± 81%      +0.0        5.45 ±168%  perf-profile.calltrace.cycles-pp.asm_exc_page_fault
      5.44 ± 81%      +0.0        5.45 ±168%  perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault
      5.44 ± 81%      +0.0        5.45 ±168%  perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      7.65 ± 65%      +0.1        7.78 ±202%  perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.65 ± 65%      +0.1        7.78 ±202%  perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.65 ± 65%      +0.1        7.78 ±202%  perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
      7.65 ± 65%      +0.1        7.78 ±202%  perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
      7.65 ± 65%      +0.1        7.78 ±202%  perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
      4.67 ± 91%      +0.9        5.56 ±223%  perf-profile.calltrace.cycles-pp.rep_movs_alternative.copyin.copy_page_from_iter_atomic.generic_perform_write.__generic_file_write_iter
      4.67 ± 91%      +0.9        5.56 ±223%  perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.vfs_write
      4.67 ± 91%      +0.9        5.56 ±223%  perf-profile.calltrace.cycles-pp.copyin.copy_page_from_iter_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      4.40 ±110%      +1.0        5.45 ±168%  perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      8.14 ± 97%      -8.1        0.00        perf-profile.children.cycles-pp.do_fault
     10.72 ± 74%      -7.7        3.02 ±173%  perf-profile.children.cycles-pp.tlb_finish_mmu
     10.72 ± 74%      -7.7        3.02 ±173%  perf-profile.children.cycles-pp.tlb_batch_pages_flush
      6.48 ±137%      -6.5        0.00        perf-profile.children.cycles-pp.do_read_fault
      5.23 ± 84%      -5.2        0.00        perf-profile.children.cycles-pp.free_pages_and_swap_cache
      5.23 ± 84%      -5.2        0.00        perf-profile.children.cycles-pp.free_swap_cache
      6.82 ± 73%      -4.3        2.49 ±164%  perf-profile.children.cycles-pp.load_elf_binary
      3.97 ± 78%      -4.0        0.00        perf-profile.children.cycles-pp.page_cache_ra_unbounded
      5.49 ±114%      -2.5        3.02 ±173%  perf-profile.children.cycles-pp.release_pages
      9.34 ± 91%      -2.3        7.04 ±141%  perf-profile.children.cycles-pp.poll_idle
      8.14 ± 97%      -2.0        6.09 ±155%  perf-profile.children.cycles-pp.__handle_mm_fault
      7.65 ± 65%      +0.1        7.78 ±202%  perf-profile.children.cycles-pp.__x64_sys_exit_group
      4.67 ± 91%      +0.9        5.56 ±223%  perf-profile.children.cycles-pp.rep_movs_alternative
      4.67 ± 91%      +0.9        5.56 ±223%  perf-profile.children.cycles-pp.copy_page_from_iter_atomic
      4.67 ± 91%      +0.9        5.56 ±223%  perf-profile.children.cycles-pp.copyin
      9.34 ± 91%      -2.3        7.04 ±141%  perf-profile.self.cycles-pp.poll_idle
      4.67 ± 91%      +0.9        5.56 ±223%  perf-profile.self.cycles-pp.rep_movs_alternative



***************************************************************************************************
lkp-ivb-2ep1: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_threads/rootfs/tbox_group/test/test_memory_size/testcase:
  gcc-11/performance/x86_64-rhel-8.3/development/1/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/UNIX/50%/lmbench3

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      6375            +9.2%       6959        lmbench3.AF_UNIX.sock.stream.bandwidth.MB/sec
    597101 ±  9%     +65.9%     990600 ± 37%  sched_debug.cpu.max_idle_balance_cost.max
     16099 ± 65%    +411.0%      82275 ± 79%  sched_debug.cpu.max_idle_balance_cost.stddev
    178220 ± 10%     -65.7%      61065 ±  6%  turbostat.C1
      0.14 ±  4%      -0.1        0.06 ±  7%  turbostat.C1%
  10200875            +9.2%   11144066 ±  2%  proc-vmstat.numa_hit
  10145413            +8.8%   11033665 ±  2%  proc-vmstat.numa_local
  46796092 ±  2%      +6.2%   49719663        proc-vmstat.pgalloc_normal
  46721572 ±  2%      +6.3%   49645401        proc-vmstat.pgfree
      0.24 ±223%      +2.4        2.62 ± 73%  perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.74 ± 65%      -1.9        0.82 ± 86%  perf-profile.children.cycles-pp.__schedule
      1.44 ± 58%      -1.0        0.42 ±109%  perf-profile.children.cycles-pp.newidle_balance
      0.47 ±115%      +1.0        1.44 ± 62%  perf-profile.children.cycles-pp._raw_spin_lock
      0.84 ± 75%      +1.3        2.14 ± 19%  perf-profile.children.cycles-pp.vsnprintf
      0.72 ± 75%      +1.4        2.08 ± 18%  perf-profile.children.cycles-pp.seq_printf
      1.32 ± 96%      +1.6        2.96 ± 23%  perf-profile.children.cycles-pp.link_path_walk
      0.47 ±115%      +1.0        1.44 ± 62%  perf-profile.self.cycles-pp._raw_spin_lock
     25.20 ± 16%    +123.4%      56.28 ±  7%  perf-stat.i.MPKI
 1.081e+08 ±  4%     +35.4%  1.463e+08 ±  6%  perf-stat.i.cache-references
      0.63 ±  5%      +0.1        0.68 ±  2%  perf-stat.i.dTLB-load-miss-rate%
 1.648e+09 ±  4%     -14.7%  1.406e+09 ±  6%  perf-stat.i.dTLB-loads
   1571133 ±  4%      -8.6%    1435719 ±  5%  perf-stat.i.dTLB-store-misses
 1.187e+09 ±  4%     -20.9%  9.395e+08 ±  6%  perf-stat.i.dTLB-stores
 6.106e+09 ±  4%     -22.7%  4.719e+09 ±  5%  perf-stat.i.instructions
      8412 ±  4%     -26.8%       6159 ±  3%  perf-stat.i.instructions-per-iTLB-miss
      0.66 ±  3%     -23.5%       0.51 ±  4%  perf-stat.i.ipc
     17.70 ±  2%     +75.1%      30.98 ±  2%  perf-stat.overall.MPKI
      4.73            +0.2        4.93        perf-stat.overall.branch-miss-rate%
     20.30 ±  3%      -5.6       14.73 ±  5%  perf-stat.overall.cache-miss-rate%
      1.46 ±  2%     +30.4%       1.90 ±  2%  perf-stat.overall.cpi
      0.51 ±  8%      +0.1        0.59 ±  5%  perf-stat.overall.dTLB-load-miss-rate%
      0.13 ±  2%      +0.0        0.15 ±  3%  perf-stat.overall.dTLB-store-miss-rate%
      7715 ±  3%     -23.8%       5877 ±  3%  perf-stat.overall.instructions-per-iTLB-miss
      0.69 ±  2%     -23.3%       0.53 ±  2%  perf-stat.overall.ipc
 1.064e+08 ±  4%     +35.4%  1.441e+08 ±  6%  perf-stat.ps.cache-references
 1.623e+09 ±  4%     -14.6%  1.386e+09 ±  6%  perf-stat.ps.dTLB-loads
   1547538 ±  4%      -8.6%    1414466 ±  5%  perf-stat.ps.dTLB-store-misses
 1.169e+09 ±  4%     -20.8%  9.255e+08 ±  6%  perf-stat.ps.dTLB-stores
 6.015e+09 ±  4%     -22.7%  4.651e+09 ±  5%  perf-stat.ps.instructions
 4.099e+11 ±  2%     -21.3%  3.225e+11 ±  4%  perf-stat.total.instructions
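
For readers recomputing the middle column: a minimal sketch of how the %change values in these tables appear to be derived, judging only from the rows above. Rows whose metric is already a percentage (names ending in "%", and the perf-profile rows) seem to show an absolute difference in percentage points, while all other rows show a relative change against the base commit. The helper name `change_column` and its `is_percentage_metric` flag are hypothetical and are not part of the lkp tooling. The table prints rounded values, so recomputing from the displayed numbers can differ in the last digit (for example, perf-stat.overall.ipc 0.69 -> 0.53 gives about -23.2% from the rounded values, against the reported -23.3%).

```python
def change_column(base, new, is_percentage_metric):
    """Format the change between two commits the way the table appears to."""
    if is_percentage_metric:
        # Assumption: metrics already expressed in percent are reported as an
        # absolute difference in percentage points, without a '%' suffix.
        return f"{new - base:+.1f}"
    # Assumption: everything else is a relative change against the base commit.
    return f"{(new - base) / base * 100:+.1f}%"


# Values copied from the lmbench3 table above.
print(change_column(6375, 6959, False))   # lmbench3.AF_UNIX.sock.stream.bandwidth.MB/sec
print(change_column(0.14, 0.06, True))    # turbostat.C1%
print(change_column(20.30, 14.73, True))  # perf-stat.overall.cache-miss-rate%
```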



***************************************************************************************************
lkp-cfl-e1: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/option_c/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/Random Write/64MB/4/debian-x86_64-phoronix/lkp-cfl-e1/tiobench-1.3.1/phoronix-test-suite

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      2.40            -0.5        1.90 ±  5%  mpstat.cpu.all.sys%
    207080           +51.1%     312861        phoronix-test-suite.tiobench.RandomWrite.64MB.32.mb_s
     15.34 ±126%     -14.1        1.28 ±223%  perf-profile.calltrace.cycles-pp.wp_page_copy.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
     15.34 ±126%     -14.1        1.28 ±223%  perf-profile.children.cycles-pp.wp_page_copy
     79033 ±  4%     +15.2%      91012 ± 10%  turbostat.C8
     12.50 ± 29%     -45.3%       6.83 ± 44%  turbostat.C9
    194443 ± 87%     -79.8%      39236 ±  8%  sched_debug.cfs_rq:/.load.stddev
     12884 ±  8%     +34.7%      17354 ± 28%  sched_debug.cfs_rq:/.min_vruntime.max
      2307 ± 10%     +32.4%       3055 ± 11%  sched_debug.cfs_rq:/.min_vruntime.stddev
    518.00 ± 15%     +21.4%     628.86 ±  7%  sched_debug.cfs_rq:/.util_avg.avg
    279473 ± 12%     -31.4%     191767 ± 11%  sched_debug.cpu.avg_idle.stddev
      9.64 ±  4%      -1.4        8.21 ± 11%  perf-stat.i.cache-miss-rate%
  14015078 ±  2%     -21.6%   10993421 ±  4%  perf-stat.i.cache-misses
 5.125e+09 ±  2%      -8.3%  4.699e+09 ±  2%  perf-stat.i.cpu-cycles
 4.042e+08 ±  2%      -6.3%  3.788e+08 ±  2%  perf-stat.i.dTLB-stores
   1294820 ±  2%     +12.8%    1460653 ± 12%  perf-stat.i.iTLB-load-misses
   1172786 ±  2%     +16.2%    1362610 ± 15%  perf-stat.i.iTLB-loads
      0.32 ±  2%      -8.3%       0.29 ±  2%  perf-stat.i.metric.GHz
      0.00 ± 31%      +0.0        0.01 ± 15%  perf-stat.i.node-load-miss-rate%
      6.62 ± 16%    +241.5%      22.62 ± 15%  perf-stat.i.node-load-misses
      5.72 ± 18%    +313.5%      23.65 ± 16%  perf-stat.i.node-store-misses
     12.94            -3.2        9.77 ±  8%  perf-stat.overall.cache-miss-rate%
    365.74           +17.0%     427.97 ±  2%  perf-stat.overall.cycles-between-cache-misses
      2561           -15.3%       2169 ± 22%  perf-stat.overall.instructions-per-iTLB-miss
      0.00 ± 16%      +0.0        0.01 ± 25%  perf-stat.overall.node-load-miss-rate%
      0.00 ± 17%      +0.0        0.00 ± 19%  perf-stat.overall.node-store-miss-rate%
  13486705 ±  2%     -20.9%   10664323 ±  3%  perf-stat.ps.cache-misses
 4.932e+09 ±  2%      -7.5%   4.56e+09        perf-stat.ps.cpu-cycles
  3.89e+08 ±  2%      -5.5%  3.675e+08 ±  2%  perf-stat.ps.dTLB-stores
   1245754 ±  2%     +14.0%    1419784 ± 14%  perf-stat.ps.iTLB-load-misses
   1128129 ±  2%     +17.4%    1324815 ± 17%  perf-stat.ps.iTLB-loads
      6.37 ± 16%    +244.6%      21.95 ± 15%  perf-stat.ps.node-load-misses
      5.50 ± 18%    +317.3%      22.96 ± 16%  perf-stat.ps.node-store-misses



***************************************************************************************************
lkp-cfl-e1: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/option_c/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/Write/64MB/4/debian-x86_64-phoronix/lkp-cfl-e1/tiobench-1.3.1/phoronix-test-suite

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      2.29 ±  2%      -0.6        1.73 ±  2%  mpstat.cpu.all.sys%
     36.00 ±  2%     -30.6%      25.00 ±  2%  phoronix-test-suite.time.percent_of_cpu_this_job_got
     13930           +58.5%      22083        phoronix-test-suite.tiobench.Write.64MB.32.mb_s
  15352488 ±  3%     -19.6%   12345834        perf-stat.i.cache-misses
  67514100 ±  2%      -6.2%   63302150        perf-stat.i.cache-references
 3.915e+09 ±  3%     -11.7%  3.455e+09 ±  3%  perf-stat.i.cpu-cycles
 2.867e+08 ±  4%      -9.7%  2.588e+08 ±  3%  perf-stat.i.dTLB-stores
      0.46 ±  2%      +7.4%       0.49 ±  3%  perf-stat.i.ipc
      0.24 ±  3%     -11.8%       0.22 ±  3%  perf-stat.i.metric.GHz
      0.00 ± 15%      +0.0        0.01 ± 16%  perf-stat.i.node-load-miss-rate%
      7.27 ± 16%    +232.0%      24.13 ± 17%  perf-stat.i.node-load-misses
    282123 ±  5%     +16.2%     327849 ±  4%  perf-stat.i.node-loads
      8.60 ± 23%    +195.5%      25.42 ± 17%  perf-stat.i.node-store-misses
   6748445 ±  3%     -20.7%    5351270 ±  4%  perf-stat.i.node-stores
     25.14 ±  3%      -6.4%      23.54 ±  3%  perf-stat.overall.MPKI
     22.72            -3.2       19.49        perf-stat.overall.cache-miss-rate%
      1.46 ±  2%     -11.9%       1.28        perf-stat.overall.cpi
    255.14            +9.7%     279.81 ±  2%  perf-stat.overall.cycles-between-cache-misses
      0.02 ±  3%      +0.0        0.03 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
      0.69 ±  2%     +13.5%       0.78        perf-stat.overall.ipc
      0.00 ± 15%      +0.0        0.01 ± 18%  perf-stat.overall.node-load-miss-rate%
      0.00 ± 24%      +0.0        0.00 ± 20%  perf-stat.overall.node-store-miss-rate%
  14688148 ±  3%     -19.5%   11824921        perf-stat.ps.cache-misses
  64639446 ±  2%      -6.2%   60656902        perf-stat.ps.cache-references
 3.747e+09 ±  3%     -11.7%  3.309e+09 ±  2%  perf-stat.ps.cpu-cycles
 2.743e+08 ±  4%      -9.6%  2.479e+08 ±  2%  perf-stat.ps.dTLB-stores
      6.95 ± 16%    +232.7%      23.11 ± 17%  perf-stat.ps.node-load-misses
    269991 ±  5%     +16.4%     314138 ±  4%  perf-stat.ps.node-loads
      8.22 ± 23%    +196.0%      24.34 ± 16%  perf-stat.ps.node-store-misses
   6457429 ±  3%     -20.6%    5127035 ±  4%  perf-stat.ps.node-stores



***************************************************************************************************
lkp-hsw-d04: 8 threads 1 sockets Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz (Haswell) with 8G memory
=========================================================================================
cluster/compiler/cpufreq_governor/kconfig/nr_threads/protocol/rootfs/runtime/tbox_group/testcase:
  cs-localhost/gcc-11/performance/x86_64-rhel-8.3/25%/tcp/debian-11.1-x86_64-20220510.cgz/300s/lkp-hsw-d04/nepim

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     45715           +10.7%      50623        vmstat.system.cs
    955743            +9.6%    1047078        sched_debug.cpu.nr_switches.avg
   1531197 ±  6%     +15.1%    1762126 ±  5%  sched_debug.cpu.nr_switches.max
   5782803           +12.6%    6513418 ±  2%  turbostat.C1
      2.44 ±  2%      +0.2        2.62 ±  4%  turbostat.C1%
      0.28           -45.7%       0.15        turbostat.IPC
  50588669           +10.4%   55825973        proc-vmstat.numa_hit
  50591236           +10.4%   55832674        proc-vmstat.numa_local
     25784 ±  3%     +13.8%      29331        proc-vmstat.pgactivate
 4.029e+08           +10.3%  4.445e+08        proc-vmstat.pgalloc_normal
 4.028e+08           +10.3%  4.445e+08        proc-vmstat.pgfree
  11101771           +10.3%   12243890        nepim.tcp.avg.kbps_in
  11101986           +10.3%   12244118        nepim.tcp.avg.kbps_out
     42677           +10.4%      47099        nepim.tcp.avg.rcv_s
     42350           +10.3%      46707        nepim.tcp.avg.snd_s
      1924 ± 27%     -44.9%       1061 ± 40%  nepim.time.involuntary_context_switches
     76.00 ±  3%      -6.8%      70.80 ±  3%  nepim.time.percent_of_cpu_this_job_got
    196.47 ±  5%     -11.4%     174.00 ±  5%  nepim.time.system_time
     32.86 ±  7%     +21.7%      40.00 ±  5%  nepim.time.user_time
   3702890 ± 27%     +57.8%    5842771 ± 15%  nepim.time.voluntary_context_switches
     25.11           +59.0%      39.91        perf-stat.i.MPKI
 8.617e+08           -12.8%  7.514e+08        perf-stat.i.branch-instructions
      1.71            +0.2        1.95        perf-stat.i.branch-miss-rate%
     12.34            +3.9       16.28 ±  2%  perf-stat.i.cache-miss-rate%
  21804011 ±  2%     +10.2%   24021171        perf-stat.i.cache-misses
 1.768e+08           -16.5%  1.477e+08        perf-stat.i.cache-references
     46033           +10.7%      50946        perf-stat.i.context-switches
      1.03           +90.8%       1.96        perf-stat.i.cpi
    340.91            -9.0%     310.31        perf-stat.i.cycles-between-cache-misses
      0.10 ±  7%      +0.1        0.19 ±  7%  perf-stat.i.dTLB-load-miss-rate%
 2.379e+09 ±  2%     -37.5%  1.486e+09 ±  3%  perf-stat.i.dTLB-loads
      0.04            +0.0        0.08        perf-stat.i.dTLB-store-miss-rate%
    772508            +9.2%     843638        perf-stat.i.dTLB-store-misses
 2.009e+09           -47.8%  1.049e+09 ±  2%  perf-stat.i.dTLB-stores
   1029493            +7.2%    1104044        perf-stat.i.iTLB-loads
 7.126e+09           -46.3%  3.828e+09        perf-stat.i.instructions
      5394 ± 12%     -50.5%       2671 ± 17%  perf-stat.i.instructions-per-iTLB-miss
      0.97           -47.1%       0.51        perf-stat.i.ipc
    654.57 ±  3%     -47.7%     342.06        perf-stat.i.metric.K/sec
    680.50           -36.5%     432.13        perf-stat.i.metric.M/sec
  18110220           +24.9%   22615284        perf-stat.i.node-loads
   3374408 ±  5%     -78.6%     723088 ±  5%  perf-stat.i.node-stores
     24.81           +55.5%      38.59        perf-stat.overall.MPKI
      1.83            +0.3        2.09        perf-stat.overall.branch-miss-rate%
     12.33            +3.9       16.26 ±  2%  perf-stat.overall.cache-miss-rate%
      1.02           +87.4%       1.92        perf-stat.overall.cpi
    335.05            -8.6%     306.10        perf-stat.overall.cycles-between-cache-misses
      0.10 ±  7%      +0.1        0.18 ±  7%  perf-stat.overall.dTLB-load-miss-rate%
      0.04            +0.0        0.08        perf-stat.overall.dTLB-store-miss-rate%
      5293 ± 12%     -51.6%       2561 ± 18%  perf-stat.overall.instructions-per-iTLB-miss
      0.98           -46.6%       0.52        perf-stat.overall.ipc
 8.588e+08           -12.8%  7.489e+08        perf-stat.ps.branch-instructions
  21731682 ±  2%     +10.2%   23941402        perf-stat.ps.cache-misses
 1.762e+08           -16.5%  1.472e+08        perf-stat.ps.cache-references
     45881           +10.7%      50777        perf-stat.ps.context-switches
 2.371e+09 ±  2%     -37.5%  1.481e+09 ±  3%  perf-stat.ps.dTLB-loads
    769946            +9.2%     840836        perf-stat.ps.dTLB-store-misses
 2.002e+09           -47.8%  1.046e+09 ±  2%  perf-stat.ps.dTLB-stores
   1026078            +7.2%    1100384        perf-stat.ps.iTLB-loads
 7.102e+09           -46.3%  3.815e+09        perf-stat.ps.instructions
  18050146           +24.9%   22540180        perf-stat.ps.node-loads
   3363213 ±  5%     -78.6%     720692 ±  5%  perf-stat.ps.node-stores
 2.143e+12           -46.3%   1.15e+12        perf-stat.total.instructions
     20.63 ±  4%      -2.7       17.94 ±  4%  perf-profile.calltrace.cycles-pp.rep_movs_alternative.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter
     20.76 ±  4%      -2.7       18.08 ±  4%  perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked.tcp_recvmsg
     20.67 ±  4%      -2.7       17.99 ±  4%  perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked
     29.55 ±  3%      -2.3       27.28 ±  4%  perf-profile.calltrace.cycles-pp.sock_recvmsg.sock_read_iter.vfs_read.ksys_read.do_syscall_64
     29.67 ±  3%      -2.3       27.40 ±  4%  perf-profile.calltrace.cycles-pp.sock_read_iter.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
     30.11 ±  3%      -2.2       27.87 ±  4%  perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
     30.29 ±  3%      -2.2       28.10 ±  4%  perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read.oop_sys_run_once
      1.20 ±  3%      +0.2        1.36 ±  6%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.oop_sys_run_once
      1.09 ± 10%      +0.2        1.29 ±  5%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__select
      1.08 ± 10%      +0.2        1.29 ±  9%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.sigprocmask.main
      2.32 ± 10%      +0.4        2.71 ±  5%  perf-profile.calltrace.cycles-pp.sigprocmask.main.__libc_start_main
      2.72 ± 10%      +0.4        3.16 ±  4%  perf-profile.calltrace.cycles-pp.__libc_start_main
      2.69 ± 10%      +0.4        3.13 ±  4%  perf-profile.calltrace.cycles-pp.main.__libc_start_main
      1.70 ± 12%      +0.6        2.26 ±  4%  perf-profile.calltrace.cycles-pp.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish
      6.89 ±  8%      +1.2        8.12 ±  9%  perf-profile.calltrace.cycles-pp.__select
     34.20 ±  2%      -3.7       30.50 ±  4%  perf-profile.children.cycles-pp.rep_movs_alternative
     20.77 ±  4%      -2.7       18.08 ±  4%  perf-profile.children.cycles-pp._copy_to_iter
     20.68 ±  4%      -2.7       17.99 ±  4%  perf-profile.children.cycles-pp.copyout
     22.03 ±  4%      -2.7       19.37 ±  4%  perf-profile.children.cycles-pp.skb_copy_datagram_iter
     22.00 ±  4%      -2.6       19.35 ±  4%  perf-profile.children.cycles-pp.__skb_datagram_iter
     29.17 ±  3%      -2.3       26.83 ±  4%  perf-profile.children.cycles-pp.tcp_recvmsg
     26.95 ±  3%      -2.3       24.62 ±  4%  perf-profile.children.cycles-pp.tcp_recvmsg_locked
     29.55 ±  3%      -2.3       27.29 ±  4%  perf-profile.children.cycles-pp.sock_recvmsg
     29.68 ±  3%      -2.3       27.41 ±  4%  perf-profile.children.cycles-pp.sock_read_iter
     30.14 ±  3%      -2.2       27.90 ±  4%  perf-profile.children.cycles-pp.vfs_read
     30.33 ±  3%      -2.2       28.14 ±  4%  perf-profile.children.cycles-pp.ksys_read
      0.37 ±  9%      -0.1        0.30 ±  8%  perf-profile.children.cycles-pp.tcp_queue_rcv
      0.30 ± 10%      -0.1        0.24 ±  6%  perf-profile.children.cycles-pp.tcp_try_coalesce
      0.09 ± 23%      -0.1        0.04 ± 87%  perf-profile.children.cycles-pp.alloc_pages
      0.21 ±  7%      +0.0        0.25 ±  8%  perf-profile.children.cycles-pp.tcp_current_mss
      0.01 ±200%      +0.1        0.07 ± 15%  perf-profile.children.cycles-pp.perf_rotate_context
      0.11 ± 15%      +0.1        0.17 ± 10%  perf-profile.children.cycles-pp.__fdelt_warn
      0.29 ±  4%      +0.1        0.35 ±  9%  perf-profile.children.cycles-pp.security_socket_recvmsg
      0.64 ±  5%      +0.1        0.73 ±  3%  perf-profile.children.cycles-pp.mem_cgroup_charge_skmem
      0.48 ±  7%      +0.1        0.58 ±  9%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.34 ± 16%      +0.1        0.45 ± 20%  perf-profile.children.cycles-pp.poll_freewait
      1.70 ±  2%      +0.2        1.90 ±  4%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      2.39 ± 10%      +0.4        2.79 ±  5%  perf-profile.children.cycles-pp.sigprocmask
      2.72 ± 10%      +0.4        3.16 ±  4%  perf-profile.children.cycles-pp.__libc_start_main
      2.72 ± 10%      +0.4        3.16 ±  4%  perf-profile.children.cycles-pp.main
      4.71 ±  6%      +0.7        5.38 ±  5%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      6.98 ±  8%      +1.2        8.20 ±  9%  perf-profile.children.cycles-pp.__select
     33.94 ±  2%      -3.7       30.23 ±  4%  perf-profile.self.cycles-pp.rep_movs_alternative
      0.07 ± 14%      +0.0        0.08 ±  9%  perf-profile.self.cycles-pp.tcp_data_queue
      0.06 ±  7%      +0.0        0.08 ± 11%  perf-profile.self.cycles-pp.apparmor_socket_sendmsg
      0.21 ± 15%      +0.0        0.25 ±  6%  perf-profile.self.cycles-pp.vfs_write
      0.29 ±  9%      +0.0        0.33 ±  3%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.04 ± 50%      +0.0        0.09 ± 42%  perf-profile.self.cycles-pp.datagram_poll
      0.04 ± 83%      +0.0        0.09 ± 11%  perf-profile.self.cycles-pp.__fdelt_chk@plt
      0.26 ±  7%      +0.1        0.31 ± 11%  perf-profile.self.cycles-pp.aa_sk_perm
      0.16 ± 20%      +0.1        0.22 ± 12%  perf-profile.self.cycles-pp.tcp_rcv_established
      0.14 ± 27%      +0.1        0.19 ± 13%  perf-profile.self.cycles-pp.skb_page_frag_refill
      0.05 ± 85%      +0.1        0.12 ± 14%  perf-profile.self.cycles-pp.enqueue_entity
      1.70 ±  3%      +0.2        1.89 ±  4%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.72 ±  6%      +0.5        1.23 ±  7%  perf-profile.self.cycles-pp.tcp_sendmsg_locked
      4.59 ±  6%      +0.7        5.26 ±  5%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
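
As a cross-check on the nepim numbers above, the derived perf-stat rows can be recomputed from the raw counters. This is a sketch under two assumptions inferred from the data rather than from the lkp source: that MPKI here is cache-references per thousand instructions, and that IPC is the reciprocal of CPI. The function name `derived_metrics` is hypothetical. The inputs are the rounded values printed in the table, so the results should agree with perf-stat.overall.MPKI and perf-stat.overall.ipc only to within rounding.

```python
def derived_metrics(instructions, cache_references, cpi):
    """Recompute MPKI and IPC from raw perf-stat values (assumed definitions)."""
    mpki = cache_references / instructions * 1000  # assumed: references, not misses
    ipc = 1 / cpi
    return mpki, ipc


# (perf-stat.ps.instructions, perf-stat.ps.cache-references, perf-stat.overall.cpi)
rows = {
    "0d85b27b0c": (7.102e9, 1.762e8, 1.02),
    "47ee3f1dd9": (3.815e9, 1.472e8, 1.92),
}

for commit, (ins, refs, cpi) in rows.items():
    mpki, ipc = derived_metrics(ins, refs, cpi)
    print(f"{commit}: MPKI ~ {mpki:.2f}, IPC ~ {ipc:.2f}")
```

The throughput rows (nepim.tcp.avg.kbps_in and kbps_out, both +10.3%) are measured directly by the benchmark, so the IPC row is best read alongside them rather than on its own.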



***************************************************************************************************
lkp-cfl-e1: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/option_c/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/Write/32MB/4/debian-x86_64-phoronix/lkp-cfl-e1/tiobench-1.3.1/phoronix-test-suite

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1.57 ±  2%      -0.3        1.25 ±  2%  mpstat.cpu.all.sys%
    273236            +1.5%     277363        proc-vmstat.pgfault
     20.83           -26.4%      15.33 ±  3%  phoronix-test-suite.time.percent_of_cpu_this_job_got
     13836           +57.3%      21761        phoronix-test-suite.tiobench.Write.32MB.32.mb_s
      4.99 ± 79%      -3.9        1.04 ±223%  perf-profile.calltrace.cycles-pp.next_uptodate_page.filemap_map_pages.do_read_fault.do_fault.__handle_mm_fault
      4.99 ± 79%      -3.9        1.04 ±223%  perf-profile.children.cycles-pp.next_uptodate_page
      4.71 ± 77%      -3.1        1.62 ±145%  perf-profile.children.cycles-pp.__fput
      4.71 ± 77%      -3.1        1.62 ±145%  perf-profile.children.cycles-pp.task_work_run
      8.88 ± 48%     +12.2       21.06 ± 44%  perf-profile.children.cycles-pp.__mmput
      8.88 ± 48%     +12.2       21.06 ± 44%  perf-profile.children.cycles-pp.exit_mmap
      4.99 ± 79%      -3.9        1.04 ±223%  perf-profile.self.cycles-pp.next_uptodate_page
     12.51 ±  2%      -1.2       11.32        perf-stat.i.cache-miss-rate%
  10967720           -14.1%    9423790 ±  2%  perf-stat.i.cache-misses
  62094269            -4.0%   59580926        perf-stat.i.cache-references
 3.296e+09 ±  2%      -7.4%  3.052e+09 ±  2%  perf-stat.i.cpu-cycles
      0.47 ±  2%      +5.1%       0.49 ±  2%  perf-stat.i.ipc
      0.21 ±  2%      -7.5%       0.19 ±  2%  perf-stat.i.metric.GHz
      0.00 ± 12%      +0.0        0.01 ± 15%  perf-stat.i.node-load-miss-rate%
      6.58 ± 13%    +230.0%      21.71 ± 15%  perf-stat.i.node-load-misses
    268579 ±  3%      +7.6%     288964 ±  3%  perf-stat.i.node-loads
      7.18 ± 35%    +215.4%      22.65 ± 12%  perf-stat.i.node-store-misses
   3675951           -23.7%    2805440 ±  2%  perf-stat.i.node-stores
     24.86            -3.2%      24.06        perf-stat.overall.MPKI
     17.66            -1.9       15.81        perf-stat.overall.cache-miss-rate%
      1.32 ±  2%      -6.7%       1.23        perf-stat.overall.cpi
    300.56 ±  2%      +7.7%     323.83        perf-stat.overall.cycles-between-cache-misses
      0.03 ±  2%      +0.0        0.03 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
      0.76 ±  2%      +7.1%       0.81        perf-stat.overall.ipc
      0.00 ± 13%      +0.0        0.01 ± 13%  perf-stat.overall.node-load-miss-rate%
      0.00 ± 35%      +0.0        0.00 ± 13%  perf-stat.overall.node-store-miss-rate%
  10494435           -14.0%    9020387 ±  2%  perf-stat.ps.cache-misses
  59412983            -4.0%   57045833        perf-stat.ps.cache-references
 3.154e+09 ±  2%      -7.4%  2.921e+09 ±  2%  perf-stat.ps.cpu-cycles
      6.29 ± 13%    +230.1%      20.77 ± 15%  perf-stat.ps.node-load-misses
    256921 ±  3%      +7.7%     276598 ±  3%  perf-stat.ps.node-loads
      6.87 ± 35%    +215.5%      21.68 ± 12%  perf-stat.ps.node-store-misses
   3518377           -23.7%    2685938 ±  2%  perf-stat.ps.node-stores



***************************************************************************************************
lkp-cfl-e1: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/option_c/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/Random Write/64MB/8/debian-x86_64-phoronix/lkp-cfl-e1/tiobench-1.3.1/phoronix-test-suite

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    203084           +45.1%     294676 ±  2%  phoronix-test-suite.tiobench.RandomWrite.64MB.8.mb_s
   1819666 ±  8%     -19.4%    1466212 ±  9%  perf-stat.i.node-stores
    361.37 ±  3%     +13.6%     410.37 ±  2%  perf-stat.overall.cycles-between-cache-misses
   1775294 ±  9%     -19.6%    1427506 ± 10%  perf-stat.ps.node-stores
     36.61 ±  4%     -23.8       12.85 ±143%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
     34.62 ±  9%     -22.8       11.80 ±141%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
     34.08 ±  5%     -22.3       11.80 ±141%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     34.08 ±  5%     -22.3       11.80 ±141%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
     34.87 ±  7%     -22.0       12.85 ±143%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
     34.87 ±  7%     -22.0       12.85 ±143%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     34.87 ±  7%     -22.0       12.85 ±143%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     31.16 ± 14%     -19.4       11.80 ±141%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      5.67 ± 72%      -5.7        0.00        perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      4.14 ± 72%      -4.1        0.00        perf-profile.calltrace.cycles-pp.rep_movs_alternative.copyin.copy_page_from_iter_atomic.generic_perform_write.__generic_file_write_iter
     12.06 ± 49%      -3.2        8.89 ±147%  perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      4.14 ± 72%      -3.1        1.04 ±223%  perf-profile.calltrace.cycles-pp.copyin.copy_page_from_iter_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      4.14 ± 72%      -3.1        1.04 ±223%  perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.vfs_write
      7.45 ± 64%      -3.1        4.38 ±168%  perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
      7.45 ± 64%      -3.1        4.38 ±168%  perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
      7.80 ± 61%      -2.6        5.21 ±175%  perf-profile.calltrace.cycles-pp.load_elf_binary.search_binary_handler.exec_binprm.bprm_execve.do_execveat_common
     11.40 ± 23%      -2.0        9.38 ±195%  perf-profile.calltrace.cycles-pp.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe
     11.40 ± 23%      -2.0        9.38 ±195%  perf-profile.calltrace.cycles-pp.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe
      9.98 ± 54%      -1.1        8.89 ±147%  perf-profile.calltrace.cycles-pp.asm_exc_page_fault
      9.98 ± 54%      -1.1        8.89 ±147%  perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault
      9.98 ± 54%      -1.1        8.89 ±147%  perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
     10.21 ± 27%      -0.8        9.38 ±195%  perf-profile.calltrace.cycles-pp.bprm_execve.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe
     10.21 ± 27%      -0.8        9.38 ±195%  perf-profile.calltrace.cycles-pp.exec_binprm.bprm_execve.do_execveat_common.__x64_sys_execve.do_syscall_64
     10.21 ± 27%      -0.8        9.38 ±195%  perf-profile.calltrace.cycles-pp.search_binary_handler.exec_binprm.bprm_execve.do_execveat_common.__x64_sys_execve
      4.14 ± 72%      -0.8        3.33 ±223%  perf-profile.calltrace.cycles-pp.wp_page_copy.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      9.46 ± 62%      -0.6        8.89 ±147%  perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      5.72 ± 87%      -0.5        5.21 ±175%  perf-profile.calltrace.cycles-pp.begin_new_exec.load_elf_binary.search_binary_handler.exec_binprm.bprm_execve
      7.60 ± 63%      +0.1        7.71 ±189%  perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap
      7.60 ± 63%      +0.1        7.71 ±189%  perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap.__mmput
      7.34 ± 50%      +0.2        7.50 ±142%  perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart
      7.34 ± 50%      +0.2        7.50 ±142%  perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.get_signal
      6.21 ± 70%      +1.5        7.71 ±189%  perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.__mmput.exit_mm.do_exit
      6.21 ± 70%      +1.5        7.71 ±189%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.__mmput.exit_mm
     35.81 ±  5%     -24.0       11.80 ±141%  perf-profile.children.cycles-pp.cpuidle_idle_call
     35.81 ±  5%     -24.0       11.80 ±141%  perf-profile.children.cycles-pp.cpuidle_enter
     35.81 ±  5%     -24.0       11.80 ±141%  perf-profile.children.cycles-pp.cpuidle_enter_state
     36.61 ±  4%     -23.8       12.85 ±143%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
     36.61 ±  4%     -23.8       12.85 ±143%  perf-profile.children.cycles-pp.cpu_startup_entry
     36.61 ±  4%     -23.8       12.85 ±143%  perf-profile.children.cycles-pp.do_idle
     34.87 ±  7%     -22.0       12.85 ±143%  perf-profile.children.cycles-pp.start_secondary
     31.16 ± 14%     -19.4       11.80 ±141%  perf-profile.children.cycles-pp.intel_idle
      5.67 ± 72%      -5.7        0.00        perf-profile.children.cycles-pp.do_fault
      5.18 ± 83%      -5.2        0.00        perf-profile.children.cycles-pp.rep_movs_alternative
     12.58 ± 44%      -3.7        8.89 ±147%  perf-profile.children.cycles-pp.exc_page_fault
     12.58 ± 44%      -3.7        8.89 ±147%  perf-profile.children.cycles-pp.do_user_addr_fault
     12.06 ± 49%      -3.2        8.89 ±147%  perf-profile.children.cycles-pp.__handle_mm_fault
     12.06 ± 49%      -3.2        8.89 ±147%  perf-profile.children.cycles-pp.handle_mm_fault
      4.14 ± 72%      -3.1        1.04 ±223%  perf-profile.children.cycles-pp.copyin
      4.14 ± 72%      -3.1        1.04 ±223%  perf-profile.children.cycles-pp.copy_page_from_iter_atomic
      7.80 ± 61%      -2.6        5.21 ±175%  perf-profile.children.cycles-pp.load_elf_binary
     11.40 ± 23%      -2.0        9.38 ±195%  perf-profile.children.cycles-pp.__x64_sys_execve
     11.40 ± 23%      -2.0        9.38 ±195%  perf-profile.children.cycles-pp.do_execveat_common
     10.21 ± 27%      -0.8        9.38 ±195%  perf-profile.children.cycles-pp.bprm_execve
     10.21 ± 27%      -0.8        9.38 ±195%  perf-profile.children.cycles-pp.exec_binprm
     10.21 ± 27%      -0.8        9.38 ±195%  perf-profile.children.cycles-pp.search_binary_handler
      4.14 ± 72%      -0.8        3.33 ±223%  perf-profile.children.cycles-pp.wp_page_copy
      5.72 ± 87%      -0.5        5.21 ±175%  perf-profile.children.cycles-pp.begin_new_exec
      8.12 ± 54%      -0.4        7.71 ±189%  perf-profile.children.cycles-pp.zap_pte_range
      8.12 ± 54%      -0.4        7.71 ±189%  perf-profile.children.cycles-pp.unmap_vmas
      8.12 ± 54%      -0.4        7.71 ±189%  perf-profile.children.cycles-pp.unmap_page_range
      8.12 ± 54%      -0.4        7.71 ±189%  perf-profile.children.cycles-pp.zap_pmd_range
      4.26 ± 74%      +1.3        5.56 ±223%  perf-profile.children.cycles-pp.task_work_run
     31.16 ± 14%     -19.4       11.80 ±141%  perf-profile.self.cycles-pp.intel_idle
      4.14 ± 72%      -4.1        0.00        perf-profile.self.cycles-pp.rep_movs_alternative



***************************************************************************************************
lkp-cfl-d2: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/true/500px PutImage Square/debian-x86_64-phoronix/lkp-cfl-d2/x11perf-1.1.1/phoronix-test-suite

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     59773            -3.3%      57826        proc-vmstat.pgreuse
      9554 ± 11%     -26.3%       7038 ± 20%  sched_debug.cfs_rq:/.min_vruntime.min
    237.81 ± 14%     -15.9%     199.90 ±  2%  uptime.boot
      2378 ± 12%     -14.6%       2031 ±  2%  uptime.idle
      1992            +6.1%       2114        vmstat.io.bi
     18164            +2.3%      18590        vmstat.system.in
    150.06            -5.7%     141.56        phoronix-test-suite.time.elapsed_time
    150.06            -5.7%     141.56        phoronix-test-suite.time.elapsed_time.max
    108.51            -9.2%      98.54        phoronix-test-suite.time.system_time
      5626            +8.4%       6101        phoronix-test-suite.x11perf.500pxPutImageSquare.operations___second
      1.90 ±  2%      -0.2        1.66 ±  3%  turbostat.C1%
      2.85            +0.2        3.00        turbostat.C1E%
      0.07            +0.0        0.08        turbostat.CPUGFX%
      4.27 ±  4%     -22.1%       3.33        turbostat.Pkg%pc2
      1.72            +1.6%       1.75        turbostat.RAMWatt
     40.01 ±  3%      -3.4       36.60 ±  4%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
      1.76 ±  4%      -0.7        1.04 ± 21%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      1.71 ±  3%      -0.7        1.00 ± 21%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      1.40 ±  2%      -0.6        0.80 ± 19%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
     42.14 ±  3%      -3.6       38.57 ±  4%  perf-profile.children.cycles-pp.cpuidle_idle_call
     40.22 ±  3%      -3.5       36.72 ±  4%  perf-profile.children.cycles-pp.cpuidle_enter_state
     40.25 ±  3%      -3.5       36.76 ±  4%  perf-profile.children.cycles-pp.cpuidle_enter
      2.08 ±  3%      -0.8        1.30 ± 18%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      2.01 ±  4%      -0.8        1.24 ± 18%  perf-profile.children.cycles-pp.hrtimer_interrupt
      1.66 ±  4%      -0.6        1.01 ± 17%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      1.05 ±  7%      -0.4        0.60 ± 20%  perf-profile.children.cycles-pp.tick_sched_timer
      0.90 ±  9%      -0.4        0.54 ± 19%  perf-profile.children.cycles-pp.tick_sched_handle
      0.78 ± 11%      -0.3        0.50 ± 18%  perf-profile.children.cycles-pp.update_process_times
      0.44 ± 11%      -0.2        0.27 ± 21%  perf-profile.children.cycles-pp.scheduler_tick
      0.26 ± 18%      -0.1        0.14 ± 30%  perf-profile.children.cycles-pp.tick_irq_enter
      0.26 ± 18%      -0.1        0.14 ± 28%  perf-profile.children.cycles-pp.irq_enter_rcu
      0.14 ± 13%      -0.1        0.06 ± 54%  perf-profile.children.cycles-pp.rcu_sched_clock_irq
      0.11 ± 18%      -0.1        0.04 ± 71%  perf-profile.children.cycles-pp.update_irq_load_avg
      0.23 ± 18%      -0.1        0.16 ± 13%  perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
      0.11 ± 21%      -0.1        0.04 ± 72%  perf-profile.children.cycles-pp.rcu_pending
      0.18 ± 16%      -0.1        0.11 ± 31%  perf-profile.children.cycles-pp.update_rq_clock_task
      0.09 ± 17%      -0.1        0.03 ±101%  perf-profile.children.cycles-pp.tick_check_oneshot_broadcast_this_cpu
      0.16 ± 17%      -0.0        0.12 ± 17%  perf-profile.children.cycles-pp.perf_rotate_context
      0.10 ± 10%      -0.0        0.06 ± 53%  perf-profile.children.cycles-pp.tick_nohz_stop_idle
      0.64 ± 17%      +0.2        0.81 ±  8%  perf-profile.children.cycles-pp._raw_spin_lock
      0.34 ± 18%      -0.1        0.24 ± 32%  perf-profile.self.cycles-pp.cpuidle_enter_state
      0.11 ± 18%      -0.1        0.04 ± 71%  perf-profile.self.cycles-pp.update_irq_load_avg
      0.12 ± 33%      -0.1        0.06 ± 52%  perf-profile.self.cycles-pp.__hrtimer_run_queues
      0.08 ± 14%      -0.1        0.03 ±101%  perf-profile.self.cycles-pp.tick_check_oneshot_broadcast_this_cpu
      0.62 ± 18%      +0.2        0.80 ±  7%  perf-profile.self.cycles-pp._raw_spin_lock
    184.41           +40.9%     259.89 ±  2%  perf-stat.i.MPKI
 8.998e+08            -9.4%  8.148e+08        perf-stat.i.branch-instructions
      1.83 ±  2%      +0.1        1.94 ±  2%  perf-stat.i.branch-miss-rate%
  15153824 ±  5%      +6.2%   16088855 ±  3%  perf-stat.i.branch-misses
 1.497e+09            -7.3%  1.388e+09        perf-stat.i.cache-references
      1.29 ±  3%     +33.7%       1.73        perf-stat.i.cpi
 7.394e+09            +1.3%  7.488e+09        perf-stat.i.cpu-cycles
      8076 ±  3%      +6.3%       8588        perf-stat.i.cycles-between-cache-misses
 2.321e+09           -42.5%  1.334e+09        perf-stat.i.dTLB-loads
 1.992e+09           -50.7%  9.825e+08        perf-stat.i.dTLB-stores
 7.709e+09           -32.2%  5.227e+09        perf-stat.i.instructions
     15364 ± 14%     -28.2%      11033 ± 11%  perf-stat.i.instructions-per-iTLB-miss
      0.99           -32.0%       0.68        perf-stat.i.ipc
      2.37 ±  2%      +6.3%       2.53 ±  2%  perf-stat.i.major-faults
      0.62            +1.3%       0.62        perf-stat.i.metric.GHz
    559.06           -32.6%     376.59        perf-stat.i.metric.M/sec
      2727            +3.0%       2810        perf-stat.i.minor-faults
    176938 ±  4%      +8.6%     192161        perf-stat.i.node-stores
      2730            +3.0%       2813        perf-stat.i.page-faults
    194.17           +36.8%     265.59 ±  2%  perf-stat.overall.MPKI
      1.68 ±  4%      +0.3        1.97 ±  2%  perf-stat.overall.branch-miss-rate%
      0.17 ±  4%      +0.0        0.19 ±  4%  perf-stat.overall.cache-miss-rate%
      0.96           +49.4%       1.43        perf-stat.overall.cpi
      0.03 ±  5%      +0.0        0.05 ±  5%  perf-stat.overall.dTLB-load-miss-rate%
      0.00 ± 17%      +0.0        0.01 ± 28%  perf-stat.overall.dTLB-store-miss-rate%
     11594 ± 19%     -31.3%       7967 ± 11%  perf-stat.overall.instructions-per-iTLB-miss
      1.04           -33.0%       0.70        perf-stat.overall.ipc
 8.938e+08            -9.5%   8.09e+08        perf-stat.ps.branch-instructions
  15056160 ±  5%      +6.1%   15976861 ±  3%  perf-stat.ps.branch-misses
 1.487e+09            -7.3%  1.378e+09        perf-stat.ps.cache-references
 7.345e+09            +1.2%  7.435e+09        perf-stat.ps.cpu-cycles
 2.305e+09           -42.5%  1.325e+09        perf-stat.ps.dTLB-loads
 1.978e+09           -50.7%  9.756e+08        perf-stat.ps.dTLB-stores
 7.657e+09           -32.2%   5.19e+09        perf-stat.ps.instructions
      2.36 ±  2%      +6.2%       2.51 ±  2%  perf-stat.ps.major-faults
      2710            +3.0%       2791        perf-stat.ps.minor-faults
    175787 ±  4%      +8.6%     190826        perf-stat.ps.node-stores
      2712            +3.0%       2793        perf-stat.ps.page-faults
 1.158e+12           -36.1%  7.395e+11        perf-stat.total.instructions
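
For reading the table above: the %change column appears to be the relative change of the second commit's mean against the first commit's mean, while rows whose metric is already a percentage (names ending in "%") appear to show an absolute difference in percentage points. The sketch below rechecks a few rows under that assumption. The helper names pct_change and point_change are hypothetical and are not part of the lkp tooling, and because the inputs are the rounded means printed in the table, recomputed values can differ in the last digit.

```python
# Sketch: recompute the %change column from the two displayed means.
# Assumption (inferred from the numbers, not from lkp source code):
#   - ordinary metrics show a relative change in percent
#   - metrics that are already percentages show a percentage-point delta

def pct_change(base: float, patched: float) -> float:
    """Relative change of the patched mean against the base mean, in percent."""
    return (patched - base) / base * 100.0


def point_change(base: float, patched: float) -> float:
    """Absolute difference, used for metrics that are already percentages."""
    return patched - base


# (metric, mean on 0d85b27b0c, mean on 47ee3f1dd9), copied from the table
relative_rows = [
    ("phoronix-test-suite.x11perf.500pxPutImageSquare.operations___second", 5626, 6101),
    ("phoronix-test-suite.time.elapsed_time", 150.06, 141.56),
    ("phoronix-test-suite.time.system_time", 108.51, 98.54),
]
point_rows = [
    ("turbostat.C1%", 1.90, 1.66),
]

for name, base, patched in relative_rows:
    print(f"{name}: {pct_change(base, patched):+.1f}%")
for name, base, patched in point_rows:
    print(f"{name}: {point_change(base, patched):+.1f}")
```

Under this reading, the x11perf row is an improvement of about 8.4% in operations per second, consistent with the drop in elapsed and system time.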



***************************************************************************************************
lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/thread/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/pread1/will-it-scale

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     19.54            +2.8       22.30        mpstat.cpu.all.usr%
    117438 ± 69%     -66.1%      39831 ±  4%  numa-meminfo.node1.FilePages
     29350 ± 69%     -66.1%       9943 ±  4%  numa-vmstat.node1.nr_file_pages
      0.23           -43.5%       0.13        turbostat.IPC
  43809356           +14.1%   49998501        will-it-scale.52.threads
    842487           +14.1%     961509        will-it-scale.per_thread_ops
  43809356           +14.1%   49998501        will-it-scale.workload
    264681 ± 21%     -23.5%     202416 ±  2%  sched_debug.cpu.clock.avg
    264689 ± 21%     -23.5%     202424 ±  2%  sched_debug.cpu.clock.max
    264673 ± 21%     -23.5%     202407 ±  2%  sched_debug.cpu.clock.min
    260519 ± 21%     -23.0%     200690 ±  3%  sched_debug.cpu.clock_task.avg
    262320 ± 21%     -23.3%     201218 ±  2%  sched_debug.cpu.clock_task.max
    566960 ± 12%     -11.5%     501642        sched_debug.cpu.max_idle_balance_cost.max
      8195 ±120%     -98.0%     160.33 ±186%  sched_debug.cpu.max_idle_balance_cost.stddev
    264673 ± 21%     -23.5%     202407 ±  2%  sched_debug.cpu_clk
    264068 ± 21%     -23.6%     201802 ±  2%  sched_debug.ktime
      0.04           +68.4%       0.07 ±  2%  perf-stat.i.MPKI
 1.447e+10            -8.8%  1.321e+10        perf-stat.i.branch-instructions
      0.94            +0.2        1.17        perf-stat.i.branch-miss-rate%
  1.36e+08           +13.6%  1.544e+08        perf-stat.i.branch-misses
      1.22           +73.6%       2.12        perf-stat.i.cpi
      0.11            +0.1        0.24        perf-stat.i.dTLB-load-miss-rate%
  43870070           +14.1%   50047738        perf-stat.i.dTLB-load-misses
 4.026e+10           -49.3%   2.04e+10        perf-stat.i.dTLB-loads
      0.00            +0.0        0.00        perf-stat.i.dTLB-store-miss-rate%
     32603            +8.4%      35341        perf-stat.i.dTLB-store-misses
 3.274e+10           -63.9%  1.183e+10        perf-stat.i.dTLB-stores
  63088748 ±  2%     +13.6%   71686582        perf-stat.i.iTLB-load-misses
  1.19e+11           -42.4%  6.854e+10        perf-stat.i.instructions
      1900 ±  2%     -49.3%     963.97        perf-stat.i.instructions-per-iTLB-miss
      0.82           -42.4%       0.47        perf-stat.i.ipc
    841.01           -48.0%     436.92        perf-stat.i.metric.M/sec
      0.04           +68.7%       0.07        perf-stat.overall.MPKI
      0.94            +0.2        1.17        perf-stat.overall.branch-miss-rate%
      1.22           +73.7%       2.12        perf-stat.overall.cpi
      0.11            +0.1        0.24        perf-stat.overall.dTLB-load-miss-rate%
      0.00            +0.0        0.00        perf-stat.overall.dTLB-store-miss-rate%
      1888 ±  2%     -49.3%     957.20        perf-stat.overall.instructions-per-iTLB-miss
      0.82           -42.4%       0.47        perf-stat.overall.ipc
    817740           -49.6%     412119        perf-stat.overall.path-length
 1.442e+10            -8.8%  1.316e+10        perf-stat.ps.branch-instructions
 1.355e+08           +13.6%  1.539e+08        perf-stat.ps.branch-misses
  43722670           +14.1%   49880640        perf-stat.ps.dTLB-load-misses
 4.013e+10           -49.3%  2.034e+10        perf-stat.ps.dTLB-loads
     32528            +8.4%      35259        perf-stat.ps.dTLB-store-misses
 3.263e+10           -63.9%  1.179e+10        perf-stat.ps.dTLB-stores
  62833990 ±  2%     +13.6%   71389669        perf-stat.ps.iTLB-load-misses
 1.186e+11           -42.4%  6.832e+10        perf-stat.ps.instructions
 3.582e+13           -42.5%  2.061e+13        perf-stat.total.instructions
     12.31            -8.9        3.41        perf-profile.calltrace.cycles-pp.rep_movs_alternative.copyout._copy_to_iter.copy_page_to_iter.shmem_file_read_iter
     12.74            -8.8        3.94        perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.vfs_read
     13.30            -8.7        4.58        perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.vfs_read.__x64_sys_pread64
     13.59            -8.7        4.90        perf-profile.calltrace.cycles-pp.copy_page_to_iter.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64
     20.90            -7.8       13.12        perf-profile.calltrace.cycles-pp.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
     25.72            -7.2       18.51        perf-profile.calltrace.cycles-pp.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread64
     28.82            -6.8       22.05        perf-profile.calltrace.cycles-pp.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread64
     43.04            -5.0       38.08        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread64
     50.67            -3.9       46.78        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pread64
      0.54 ±  2%      +0.1        0.61 ±  2%  perf-profile.calltrace.cycles-pp.folio_unlock.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64
      0.54            +0.1        0.62 ±  2%  perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread64
      1.13 ±  2%      +0.1        1.26        perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.__x64_sys_pread64
      0.98            +0.1        1.12        perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.24            +0.1        1.38 ±  2%  perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.shmem_file_read_iter.vfs_read.__x64_sys_pread64
      1.59            +0.2        1.77 ±  2%  perf-profile.calltrace.cycles-pp.touch_atime.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64
      0.35 ± 70%      +0.2        0.58 ±  3%  perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.shmem_file_read_iter.vfs_read
      1.60            +0.3        1.87 ±  5%  perf-profile.calltrace.cycles-pp.__fget_light.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread64
      1.99 ±  2%      +0.3        2.27        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__libc_pread64
      2.67            +0.3        3.00        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64
      0.00            +0.5        0.53 ±  2%  perf-profile.calltrace.cycles-pp.fput.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread64
      6.78            +0.9        7.70        perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_pread64
     12.88            +1.7       14.56        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread64
     14.10            +1.9       15.96        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_pread64
     12.31            -8.9        3.44        perf-profile.children.cycles-pp.rep_movs_alternative
     13.04            -8.8        4.28        perf-profile.children.cycles-pp.copyout
     13.32            -8.7        4.61        perf-profile.children.cycles-pp._copy_to_iter
     13.61            -8.7        4.92        perf-profile.children.cycles-pp.copy_page_to_iter
     21.00            -7.8       13.23        perf-profile.children.cycles-pp.shmem_file_read_iter
     25.82            -7.2       18.62        perf-profile.children.cycles-pp.vfs_read
     28.83            -6.8       22.07        perf-profile.children.cycles-pp.__x64_sys_pread64
     43.17            -4.9       38.24        perf-profile.children.cycles-pp.do_syscall_64
     50.98            -3.9       47.12        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.11 ±  3%      +0.0        0.13 ±  3%  perf-profile.children.cycles-pp.folio_mark_accessed
      0.11            +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.__pthread_enable_asynccancel
      0.09 ±  4%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.rw_verify_area
      0.15 ±  3%      +0.0        0.18 ±  2%  perf-profile.children.cycles-pp.__cond_resched
      0.34 ±  2%      +0.0        0.38 ±  2%  perf-profile.children.cycles-pp.folio_test_hugetlb
      0.24 ± 12%      +0.0        0.29 ±  2%  perf-profile.children.cycles-pp.aa_file_perm
      0.17 ± 10%      +0.1        0.23 ± 21%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.47            +0.1        0.53 ±  2%  perf-profile.children.cycles-pp.fput
      0.54 ±  2%      +0.1        0.61 ±  2%  perf-profile.children.cycles-pp.folio_unlock
      0.57            +0.1        0.65 ±  2%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.58 ±  2%      +0.1        0.66 ±  4%  perf-profile.children.cycles-pp.current_time
      1.16 ±  2%      +0.1        1.29        perf-profile.children.cycles-pp.filemap_get_entry
      0.99            +0.1        1.13        perf-profile.children.cycles-pp.__fsnotify_parent
      1.03            +0.1        1.18        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.28            +0.2        1.44 ±  2%  perf-profile.children.cycles-pp.atime_needs_update
      1.24 ±  2%      +0.2        1.41 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      1.61            +0.2        1.80 ±  2%  perf-profile.children.cycles-pp.touch_atime
      1.60            +0.3        1.87 ±  5%  perf-profile.children.cycles-pp.__fget_light
      2.70            +0.3        3.04        perf-profile.children.cycles-pp.shmem_get_folio_gfp
      6.68            +0.9        7.58        perf-profile.children.cycles-pp.__entry_text_start
     12.97            +1.7       14.63        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
     14.24            +1.9       16.11        perf-profile.children.cycles-pp.syscall_return_via_sysret
     12.09            -8.9        3.23        perf-profile.self.cycles-pp.rep_movs_alternative
      0.11            +0.0        0.12 ±  3%  perf-profile.self.cycles-pp.__pthread_enable_asynccancel
      0.09 ±  4%      +0.0        0.10 ±  3%  perf-profile.self.cycles-pp.rw_verify_area
      0.14 ±  2%      +0.0        0.16 ±  4%  perf-profile.self.cycles-pp.testcase
      0.36 ±  4%      +0.0        0.39 ±  2%  perf-profile.self.cycles-pp.touch_atime
      0.34 ±  2%      +0.0        0.38 ±  2%  perf-profile.self.cycles-pp.folio_test_hugetlb
      0.22 ± 14%      +0.0        0.26 ±  3%  perf-profile.self.cycles-pp.aa_file_perm
      0.30 ±  4%      +0.0        0.35 ±  2%  perf-profile.self.cycles-pp._copy_to_iter
      0.46 ±  2%      +0.0        0.52        perf-profile.self.cycles-pp.current_time
      0.47            +0.1        0.52        perf-profile.self.cycles-pp.fput
      0.57 ±  3%      +0.1        0.63 ±  2%  perf-profile.self.cycles-pp.do_syscall_64
      0.48            +0.1        0.55 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.50 ±  2%      +0.1        0.57 ±  2%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      0.58 ±  2%      +0.1        0.65 ±  2%  perf-profile.self.cycles-pp.atime_needs_update
      0.54 ±  2%      +0.1        0.61 ±  2%  perf-profile.self.cycles-pp.folio_unlock
      0.13 ± 12%      +0.1        0.20 ± 24%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      0.91            +0.1        1.01        perf-profile.self.cycles-pp.__x64_sys_pread64
      0.95 ±  2%      +0.1        1.06        perf-profile.self.cycles-pp.filemap_get_entry
      0.90            +0.1        1.03        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.89            +0.1        1.02 ±  2%  perf-profile.self.cycles-pp.copyout
      0.94            +0.1        1.07        perf-profile.self.cycles-pp.__fsnotify_parent
      1.24            +0.2        1.42        perf-profile.self.cycles-pp.__libc_pread64
      1.37            +0.2        1.56        perf-profile.self.cycles-pp.shmem_get_folio_gfp
      1.76            +0.2        2.00 ±  2%  perf-profile.self.cycles-pp.vfs_read
      2.13 ±  2%      +0.3        2.39        perf-profile.self.cycles-pp.shmem_file_read_iter
      1.59            +0.3        1.86 ±  5%  perf-profile.self.cycles-pp.__fget_light
      5.81            +0.8        6.60        perf-profile.self.cycles-pp.__entry_text_start
      8.05            +1.1        9.16        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
     12.47            +1.6       14.06        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
     14.21            +1.9       16.08        perf-profile.self.cycles-pp.syscall_return_via_sysret
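
A note on perf-stat.overall.path-length in the pread1 table above: the values are consistent with total instructions divided by the will-it-scale.workload operation count, i.e. instructions retired per operation. The sketch below rechecks that relationship from the printed numbers. The function name path_length is hypothetical, the definition is inferred from the data rather than taken from the lkp source, and the inputs are rounded table values, so only approximate agreement is expected.

```python
# Sketch: check that path-length ~= total instructions / workload operations.
# Assumption: this definition is inferred from the table values; it is not
# quoted from the lkp tooling.

def path_length(total_instructions: float, workload_ops: float) -> float:
    """Instructions retired per benchmark operation."""
    return total_instructions / workload_ops


# Values copied from the pread1 table (base commit, patched commit)
base = path_length(3.582e13, 43809356)     # table reports 817740
patched = path_length(2.061e13, 49998501)  # table reports 412119

change = (patched - base) / base * 100.0

print(f"base path-length:    {base:,.0f}")
print(f"patched path-length: {patched:,.0f}")
print(f"change:              {change:+.1f}%")
```

Read this way, the patched kernel retires roughly half as many instructions per pread() call, which fits the profile: rep_movs_alternative drops from 12.31 to 3.44 in the children view while throughput rises 14.1%. The lower IPC is then a side effect of fewer, longer-running instructions rather than a sign of slower execution.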



***************************************************************************************************
lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/thread/100%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/readseek1/will-it-scale

commit: 
  0d85b27b0c ("Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6")
  47ee3f1dd9 ("x86: re-introduce support for ERMS copies for user space accesses")

0d85b27b0cc6b5cf 47ee3f1dd93bcbe031539b1ecda 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.01 ±  3%      +0.0        0.01 ±  4%  mpstat.cpu.all.soft%
     53591            -2.8%      52083        proc-vmstat.pgactivate
      0.10           -40.0%       0.06        turbostat.IPC
      5.12 ± 38%   +1410.7%      77.33 ±138%  sched_debug.cfs_rq:/.removed.load_avg.avg
     28.32 ± 18%   +2025.5%     601.98 ±136%  sched_debug.cfs_rq:/.removed.load_avg.stddev
  30834962            +7.8%   33231698        will-it-scale.104.threads
    296489            +7.8%     319535        will-it-scale.per_thread_ops
  30834962            +7.8%   33231698        will-it-scale.workload
      0.04 ±  5%    +295.1%       0.17 ±129%  perf-stat.i.MPKI
 1.331e+10            -8.9%  1.212e+10        perf-stat.i.branch-instructions
      1.38            +0.3        1.71        perf-stat.i.branch-miss-rate%
 1.834e+08           +12.5%  2.063e+08        perf-stat.i.branch-misses
      2.90           +59.9%       4.64        perf-stat.i.cpi
    165.59            +2.2%     169.19        perf-stat.i.cpu-migrations
      0.19            +0.2        0.37        perf-stat.i.dTLB-load-miss-rate%
  61714012            +7.5%   66344685        perf-stat.i.dTLB-load-misses
  3.26e+10           -44.4%  1.812e+10        perf-stat.i.dTLB-loads
      0.00            +0.0        0.00 ± 50%  perf-stat.i.dTLB-store-miss-rate%
     60456            +2.3%      61857        perf-stat.i.dTLB-store-misses
 2.576e+10           -58.1%  1.079e+10        perf-stat.i.dTLB-stores
  92796911 ±  4%     +14.6%  1.063e+08        perf-stat.i.iTLB-load-misses
  79032012 ±  3%     +16.7%   92191583 ±  2%  perf-stat.i.iTLB-loads
 9.913e+10           -37.5%  6.199e+10        perf-stat.i.instructions
      1073 ±  4%     -45.4%     585.88        perf-stat.i.instructions-per-iTLB-miss
      0.34           -37.4%       0.22        perf-stat.i.ipc
    784.59 ±  4%      +9.7%     860.49 ±  4%  perf-stat.i.metric.K/sec
    689.14           -42.7%     394.59        perf-stat.i.metric.M/sec
     10907 ±  6%      -8.6%       9974 ±  6%  perf-stat.i.node-stores
      0.04 ±  5%     +68.8%       0.07 ±  7%  perf-stat.overall.MPKI
      1.38            +0.3        1.70        perf-stat.overall.branch-miss-rate%
      2.90           +59.8%       4.64        perf-stat.overall.cpi
      0.19            +0.2        0.36        perf-stat.overall.dTLB-load-miss-rate%
      0.00            +0.0        0.00        perf-stat.overall.dTLB-store-miss-rate%
      1070 ±  4%     -45.5%     583.23        perf-stat.overall.instructions-per-iTLB-miss
      0.34           -37.4%       0.22        perf-stat.overall.ipc
    967334           -41.9%     562134        perf-stat.overall.path-length
 1.326e+10            -8.9%  1.208e+10        perf-stat.ps.branch-instructions
 1.828e+08           +12.5%  2.056e+08        perf-stat.ps.branch-misses
    165.01            +2.1%     168.48        perf-stat.ps.cpu-migrations
  61509195            +7.5%   66124018        perf-stat.ps.dTLB-load-misses
 3.249e+10           -44.4%  1.806e+10        perf-stat.ps.dTLB-loads
     60319            +2.3%      61694        perf-stat.ps.dTLB-store-misses
 2.568e+10           -58.1%  1.075e+10        perf-stat.ps.dTLB-stores
  92483996 ±  4%     +14.6%  1.059e+08        perf-stat.ps.iTLB-load-misses
  78775497 ±  3%     +16.6%   91888236 ±  2%  perf-stat.ps.iTLB-loads
  9.88e+10           -37.5%  6.179e+10        perf-stat.ps.instructions
 2.983e+13           -37.4%  1.868e+13        perf-stat.total.instructions
     32.98            -7.8       25.20        perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
     43.20            -7.4       35.79        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
     26.83            -6.8       20.03        perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
     48.41            -6.1       42.34        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read
      9.66            -5.8        3.90 ±  2%  perf-profile.calltrace.cycles-pp.rep_movs_alternative.copyout._copy_to_iter.copy_page_to_iter.shmem_file_read_iter
     20.63            -5.5       15.14        perf-profile.calltrace.cycles-pp.shmem_file_read_iter.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
     10.11            -5.1        4.96 ±  5%  perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.vfs_read
     10.70            -5.0        5.69 ±  2%  perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.vfs_read.ksys_read
     11.03            -5.0        6.07 ±  2%  perf-profile.calltrace.cycles-pp.copy_page_to_iter.shmem_file_read_iter.vfs_read.ksys_read.do_syscall_64
     64.33            -4.1       60.23        perf-profile.calltrace.cycles-pp.__libc_read
      3.50            -0.6        2.92 ±  2%  perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.ksys_read.do_syscall_64
      1.24 ±  4%      -0.6        0.67 ±  3%  perf-profile.calltrace.cycles-pp.mutex_unlock.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
      1.34 ±  2%      -0.5        0.79 ±  3%  perf-profile.calltrace.cycles-pp.fput.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
      1.30 ±  2%      -0.5        0.84        perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.19 ±  2%      -0.3        0.86        perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.ksys_read
     12.58            -0.3       12.25        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_lseek64
      1.60 ±  4%      -0.2        1.44        perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.41 ±  4%      -0.1        1.26        perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
      2.35            +0.1        2.42        perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.88            +0.1        3.01        perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
      4.08            +0.3        4.34        perf-profile.calltrace.cycles-pp.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
      9.05            +0.3        9.32        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
      0.26 ±100%      +0.3        0.56 ±  2%  perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.shmem_file_read_iter.vfs_read
      3.28            +0.3        3.62        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__libc_read
      7.39            +0.7        8.06        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
      3.86            +0.8        4.68        perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_lseek64
      3.09            +0.9        4.03        perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_read
     12.47            +1.0       13.48        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
      9.73            +1.3       11.00        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_read
     17.06            +1.4       18.50        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_lseek64
      2.23            +4.1        6.30        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__libc_lseek64
     35.19            +4.1       39.27        perf-profile.calltrace.cycles-pp.__libc_lseek64
     33.06            -7.8       25.28        perf-profile.children.cycles-pp.ksys_read
     26.95            -6.8       20.14        perf-profile.children.cycles-pp.vfs_read
     55.83            -6.4       49.44        perf-profile.children.cycles-pp.do_syscall_64
     20.76            -5.5       15.26        perf-profile.children.cycles-pp.shmem_file_read_iter
      9.68            -5.3        4.36 ±  2%  perf-profile.children.cycles-pp.rep_movs_alternative
     10.30            -5.2        5.09 ±  2%  perf-profile.children.cycles-pp.copyout
     10.71            -5.0        5.72 ±  2%  perf-profile.children.cycles-pp._copy_to_iter
     11.05            -5.0        6.10 ±  2%  perf-profile.children.cycles-pp.copy_page_to_iter
     65.89            -4.6       61.25        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     64.93            -4.1       60.84        perf-profile.children.cycles-pp.__libc_read
      1.02 ±  5%      -0.6        0.42 ±  3%  perf-profile.children.cycles-pp.fsnotify_perm
      3.54            -0.6        2.95 ±  2%  perf-profile.children.cycles-pp.shmem_get_folio_gfp
      1.77 ±  3%      -0.5        1.25 ±  3%  perf-profile.children.cycles-pp.fput
      1.82 ±  4%      -0.5        1.32 ±  4%  perf-profile.children.cycles-pp.mutex_unlock
      1.34            -0.5        0.87        perf-profile.children.cycles-pp.__fsnotify_parent
      1.22 ±  2%      -0.3        0.88        perf-profile.children.cycles-pp.filemap_get_entry
      1.62 ±  3%      -0.2        1.45        perf-profile.children.cycles-pp.security_file_permission
      1.42 ±  4%      -0.1        1.28        perf-profile.children.cycles-pp.apparmor_file_permission
      0.24 ±  2%      -0.1        0.16 ±  4%  perf-profile.children.cycles-pp.aa_file_perm
      0.22 ± 15%      -0.1        0.16 ± 13%  perf-profile.children.cycles-pp.make_vfsuid
      0.26            -0.1        0.21 ±  3%  perf-profile.children.cycles-pp.xas_load
      0.16 ±  3%      -0.0        0.12 ±  4%  perf-profile.children.cycles-pp.xas_start
      0.32            -0.0        0.30        perf-profile.children.cycles-pp.__cond_resched
      0.19            +0.0        0.20        perf-profile.children.cycles-pp.folio_test_hugetlb
      0.28 ±  2%      +0.0        0.30        perf-profile.children.cycles-pp.folio_unlock
      0.09 ±  6%      +0.0        0.11 ±  5%  perf-profile.children.cycles-pp.make_vfsgid
      0.29            +0.0        0.31 ±  2%  perf-profile.children.cycles-pp.testcase
      0.19 ±  3%      +0.0        0.21 ±  6%  perf-profile.children.cycles-pp.generic_file_llseek_size
      0.08 ±  4%      +0.0        0.11 ±  4%  perf-profile.children.cycles-pp.read@plt
      0.41            +0.0        0.45 ±  2%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      0.54 ±  3%      +0.1        0.59 ±  2%  perf-profile.children.cycles-pp.current_time
      1.08            +0.1        1.17        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.64 ±  2%      +0.2        0.80        perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      4.14            +0.3        4.40        perf-profile.children.cycles-pp.ksys_lseek
     16.52            +1.0       17.47        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
     22.43            +1.0       23.39        perf-profile.children.cycles-pp.syscall_return_via_sysret
      3.03            +2.3        5.31        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      8.91            +3.3       12.23        perf-profile.children.cycles-pp.__entry_text_start
     35.78            +4.2       39.95        perf-profile.children.cycles-pp.__libc_lseek64
      9.53            -5.3        4.24 ±  2%  perf-profile.self.cycles-pp.rep_movs_alternative
      1.01 ±  5%      -0.6        0.42 ±  3%  perf-profile.self.cycles-pp.fsnotify_perm
      1.73 ±  3%      -0.5        1.22 ±  2%  perf-profile.self.cycles-pp.fput
      1.75 ±  4%      -0.5        1.25 ±  4%  perf-profile.self.cycles-pp.mutex_unlock
      1.30            -0.5        0.83 ±  2%  perf-profile.self.cycles-pp.__fsnotify_parent
      0.96 ±  2%      -0.3        0.67        perf-profile.self.cycles-pp.filemap_get_entry
      2.13 ±  2%      -0.2        1.90 ±  3%  perf-profile.self.cycles-pp.shmem_get_folio_gfp
      0.22 ±  4%      -0.1        0.14 ±  3%  perf-profile.self.cycles-pp.aa_file_perm
      0.20 ± 14%      -0.1        0.14 ± 11%  perf-profile.self.cycles-pp.make_vfsuid
      0.14 ±  2%      -0.0        0.12 ±  4%  perf-profile.self.cycles-pp.xas_start
      0.20 ±  4%      -0.0        0.18 ±  3%  perf-profile.self.cycles-pp.security_file_permission
      0.10 ±  3%      -0.0        0.08        perf-profile.self.cycles-pp.xas_load
      0.22 ±  2%      -0.0        0.20 ±  2%  perf-profile.self.cycles-pp.__cond_resched
      0.08 ±  5%      +0.0        0.10 ±  3%  perf-profile.self.cycles-pp.make_vfsgid
      0.26 ±  3%      +0.0        0.29        perf-profile.self.cycles-pp.folio_unlock
      0.19 ±  3%      +0.0        0.21 ±  4%  perf-profile.self.cycles-pp.generic_file_llseek_size
      0.33 ±  3%      +0.0        0.36        perf-profile.self.cycles-pp.ksys_lseek
      0.38            +0.0        0.41        perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      0.47            +0.0        0.51 ±  2%  perf-profile.self.cycles-pp.__fdget_pos
      0.42 ±  5%      +0.1        0.48 ±  2%  perf-profile.self.cycles-pp.current_time
      0.29 ±  3%      +0.1        0.35 ±  2%  perf-profile.self.cycles-pp.copy_page_to_iter
      1.03            +0.1        1.08        perf-profile.self.cycles-pp.__libc_read
      0.00            +0.1        0.06 ±  7%  perf-profile.self.cycles-pp.read@plt
      0.96            +0.1        1.03        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.72            +0.1        0.82        perf-profile.self.cycles-pp.copyout
      0.70            +0.1        0.80        perf-profile.self.cycles-pp.__libc_lseek64
      0.52 ±  2%      +0.1        0.64        perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      0.54 ±  2%      +0.2        0.70 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.42            +0.2        0.64 ±  7%  perf-profile.self.cycles-pp._copy_to_iter
     15.71            +0.9       16.60        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
     22.41            +1.0       23.37        perf-profile.self.cycles-pp.syscall_return_via_sysret
     10.40            +1.8       12.15        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      7.90            +3.2       11.13        perf-profile.self.cycles-pp.__entry_text_start





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



View attachment "config-6.4.0-rc3-00191-g47ee3f1dd93b" of type "text/plain" (158675 bytes)

View attachment "job-script" of type "text/plain" (7681 bytes)

View attachment "job.yaml" of type "text/plain" (5208 bytes)

View attachment "reproduce" of type "text/plain" (254 bytes)
