Message-ID: <202306121557.2d17019b-oliver.sang@intel.com>
Date:   Mon, 12 Jun 2023 15:45:09 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     David Howells <dhowells@...hat.com>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        Linux Memory Management List <linux-mm@...ck.org>,
        Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
        Christian Brauner <brauner@...nel.org>,
        Al Viro <viro@...iv.linux.org.uk>,
        David Hildenbrand <david@...hat.com>,
        John Hubbard <jhubbard@...dia.com>,
        <linux-block@...r.kernel.org>, <linux-fsdevel@...r.kernel.org>,
        <linux-afs@...ts.infradead.org>, <linux-btrfs@...r.kernel.org>,
        <ecryptfs@...r.kernel.org>, <linux-erofs@...ts.ozlabs.org>,
        <linux-ext4@...r.kernel.org>, <cluster-devel@...hat.com>,
        <linux-um@...ts.infradead.org>, <linux-mtd@...ts.infradead.org>,
        <jfs-discussion@...ts.sourceforge.net>,
        <linux-nilfs@...r.kernel.org>,
        <linux-ntfs-dev@...ts.sourceforge.net>, <ntfs3@...ts.linux.dev>,
        <ocfs2-devel@....oracle.com>,
        <linux-karma-devel@...ts.sourceforge.net>,
        <reiserfs-devel@...r.kernel.org>, <ying.huang@...el.com>,
        <feng.tang@...el.com>, <fengwei.yin@...el.com>,
        <oliver.sang@...el.com>
Subject: [linux-next:master] [splice]  2cb1e08985:
 stress-ng.sendfile.ops_per_sec 11.6% improvement



Hello,

kernel test robot noticed an 11.6% improvement in stress-ng.sendfile.ops_per_sec on:


commit: 2cb1e08985e3dc59d0a4ebf770a87e3e2410d985 ("splice: Use filemap_splice_read() instead of generic_file_splice_read()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	class: pipe
	test: sendfile
	cpufreq_governor: performance
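
For orientation, the kind of workload this test exercises can be sketched as below. This is only an illustration, not the stress-ng sendfile stressor itself (its implementation is not shown in this report). It assumes Linux and uses Python's os.sendfile() to push a page-cache-backed temporary file to /dev/null, which is the path seen in the profile further down (__x64_sys_sendfile64, do_sendfile, splice_direct_to_actor, pipe_to_null). The function name, file size and round count are made up for the example.

```python
import os
import tempfile


def sendfile_to_devnull(size=1 << 20, rounds=4):
    """Send a temporary file to /dev/null `rounds` times; return bytes sent.

    Hypothetical helper for illustration only; sizes are arbitrary.
    """
    total = 0
    with tempfile.TemporaryFile() as src, open("/dev/null", "wb") as dst:
        # Populate the source file so reads are served from the page cache.
        src.write(b"\0" * size)
        src.flush()
        for _ in range(rounds):
            offset = 0
            while offset < size:
                # os.sendfile(out_fd, in_fd, offset, count) returns bytes sent.
                sent = os.sendfile(dst.fileno(), src.fileno(),
                                   offset, size - offset)
                if sent == 0:  # end of file reached early
                    break
                offset += sent
            total += offset
    return total


if __name__ == "__main__":
    print(sendfile_to_devnull())
```

Each call goes through the splice read path of the source file, so the commit under test changes which function fills the internal pipe on every iteration.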






Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you hit a failure that blocks the test, remove the ~/.lkp
        # and /lkp directories and rerun from a clean state.

=========================================================================================
class/compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  pipe/gcc-12/performance/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp8/sendfile/stress-ng/60s

commit: 
  ab82513126 ("cifs: Use filemap_splice_read()")
  2cb1e08985 ("splice: Use filemap_splice_read() instead of generic_file_splice_read()")

ab82513126f8b426 2cb1e08985e3dc59d0a4ebf770a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    568180            -1.5%     559667        proc-vmstat.pgalloc_normal
    348772            -1.7%     342744        proc-vmstat.pgfault
     39953           +11.7%      44609        stress-ng.sendfile.MB_per_sec_sent_to_/dev/null
  38320456           +11.6%   42768635        stress-ng.sendfile.ops
    638671           +11.6%     712807        stress-ng.sendfile.ops_per_sec
      0.18 ±  6%      -0.1        0.11 ±  8%  perf-stat.i.branch-miss-rate%
  61342100           -61.5%   23631851 ±  3%  perf-stat.i.branch-misses
      0.74            +3.7%       0.77        perf-stat.i.cpi
      0.28 ±222%      -0.3        0.00 ±  4%  perf-stat.i.dTLB-load-miss-rate%
 7.958e+11 ±223%    -100.0%     105622 ±  6%  perf-stat.i.dTLB-load-misses
 8.398e+10           -10.4%  7.528e+10        perf-stat.i.dTLB-loads
 4.702e+10           -17.3%  3.888e+10        perf-stat.i.dTLB-stores
 2.965e+11            -4.7%  2.825e+11        perf-stat.i.instructions
      1.36            -4.0%       1.31        perf-stat.i.ipc
      1632            -7.4%       1511        perf-stat.i.metric.M/sec
      0.11            -0.1        0.04 ±  3%  perf-stat.overall.branch-miss-rate%
      0.73            +4.8%       0.76        perf-stat.overall.cpi
     16.38 ±223%     -16.4        0.00 ±  6%  perf-stat.overall.dTLB-load-miss-rate%
      0.00 ±  3%      +0.0        0.00 ±  3%  perf-stat.overall.dTLB-store-miss-rate%
      1.38            -4.6%       1.32        perf-stat.overall.ipc
  60279316           -61.5%   23221084 ±  3%  perf-stat.ps.branch-misses
  7.58e+11 ±223%    -100.0%     104910 ±  6%  perf-stat.ps.dTLB-load-misses
 8.264e+10           -10.4%  7.408e+10        perf-stat.ps.dTLB-loads
 4.628e+10           -17.3%  3.826e+10        perf-stat.ps.dTLB-stores
 2.918e+11            -4.7%   2.78e+11        perf-stat.ps.instructions
 1.832e+13            -4.8%  1.745e+13        perf-stat.total.instructions
     73.32           -73.3        0.00        perf-profile.calltrace.cycles-pp.generic_file_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
     68.75           -68.7        0.00        perf-profile.calltrace.cycles-pp.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
     24.87 ±  3%     -24.9        0.00        perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct
     23.72 ±  4%     -23.7        0.00        perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.generic_file_splice_read.splice_direct_to_actor
     20.27 ±  3%     -20.3        0.00        perf-profile.calltrace.cycles-pp.copy_page_to_iter_pipe.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct
      0.58            +0.1        0.65        perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_splice_read.splice_direct_to_actor.do_splice_direct
      0.80            +0.1        0.89        perf-profile.calltrace.cycles-pp.security_file_permission.vfs_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      1.78            +0.1        1.88        perf-profile.calltrace.cycles-pp.page_cache_pipe_buf_confirm.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor
      1.33 ±  2%      +0.1        1.47        perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      3.07            +0.3        3.39        perf-profile.calltrace.cycles-pp.vfs_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
      0.00            +0.6        0.58        perf-profile.calltrace.cycles-pp.xas_descend.xas_load.filemap_get_read_batch.filemap_get_pages.filemap_splice_read
      0.00            +0.6        0.61        perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.filemap_splice_read.splice_direct_to_actor
      0.00            +1.3        1.30        perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_splice_read.splice_direct_to_actor.do_splice_direct
      0.00            +1.6        1.58        perf-profile.calltrace.cycles-pp.xas_load.filemap_get_read_batch.filemap_get_pages.filemap_splice_read.splice_direct_to_actor
      0.00            +1.7        1.65        perf-profile.calltrace.cycles-pp.touch_atime.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
     10.13            +1.7       11.83        perf-profile.calltrace.cycles-pp.page_cache_pipe_buf_release.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor
      0.00            +2.2        2.20        perf-profile.calltrace.cycles-pp.folio_mark_accessed.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
     22.17            +2.6       24.73        perf-profile.calltrace.cycles-pp.splice_from_pipe.direct_splice_actor.splice_direct_to_actor.do_splice_direct.do_sendfile
     22.48            +2.6       25.04        perf-profile.calltrace.cycles-pp.direct_splice_actor.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
     20.17            +2.8       22.99        perf-profile.calltrace.cycles-pp.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor.do_splice_direct
      0.00           +13.8       13.80        perf-profile.calltrace.cycles-pp.release_pages.__pagevec_release.filemap_splice_read.splice_direct_to_actor.do_splice_direct
      0.00           +14.4       14.44        perf-profile.calltrace.cycles-pp.__pagevec_release.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00           +18.5       18.54        perf-profile.calltrace.cycles-pp.splice_folio_into_pipe.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00           +25.9       25.92        perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_splice_read.splice_direct_to_actor.do_splice_direct
      0.00           +27.2       27.15        perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00           +69.0       69.03        perf-profile.calltrace.cycles-pp.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
     73.48           -73.5        0.00        perf-profile.children.cycles-pp.generic_file_splice_read
     69.92           -69.9        0.00        perf-profile.children.cycles-pp.filemap_read
     20.75           -20.8        0.00        perf-profile.children.cycles-pp.copy_page_to_iter_pipe
      3.04            -1.3        1.75        perf-profile.children.cycles-pp.touch_atime
      2.54            -1.1        1.47        perf-profile.children.cycles-pp.atime_needs_update
      1.20            -0.5        0.69        perf-profile.children.cycles-pp.current_time
      2.84            -0.2        2.64        perf-profile.children.cycles-pp.folio_mark_accessed
      0.34            -0.1        0.19 ±  2%  perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
      0.26 ±  2%      -0.1        0.15 ±  4%  perf-profile.children.cycles-pp.make_vfsgid
      0.25 ±  2%      -0.1        0.16 ±  3%  perf-profile.children.cycles-pp.make_vfsuid
      0.08            +0.0        0.09        perf-profile.children.cycles-pp.pipe_unlock
      0.26            +0.0        0.28        perf-profile.children.cycles-pp.__get_task_ioprio
      0.26            +0.0        0.29        perf-profile.children.cycles-pp.aa_file_perm
      0.30 ±  3%      +0.0        0.33        perf-profile.children.cycles-pp.fsnotify_perm
      0.18 ±  2%      +0.0        0.20 ±  2%  perf-profile.children.cycles-pp.rw_verify_area
      0.69            +0.0        0.72        perf-profile.children.cycles-pp.xas_descend
      0.42            +0.0        0.45        perf-profile.children.cycles-pp.xas_start
      0.28 ±  2%      +0.0        0.32        perf-profile.children.cycles-pp.splice_from_pipe_next
      0.29            +0.0        0.33 ±  2%  perf-profile.children.cycles-pp.rcu_all_qs
      0.68            +0.1        0.75        perf-profile.children.cycles-pp.apparmor_file_permission
      0.95            +0.1        1.04        perf-profile.children.cycles-pp.pipe_to_null
      0.95            +0.1        1.06        perf-profile.children.cycles-pp.security_file_permission
      1.76            +0.1        1.86        perf-profile.children.cycles-pp.xas_load
      0.00            +0.1        0.11        perf-profile.children.cycles-pp.mlock_drain_local
      0.76 ±  2%      +0.1        0.87        perf-profile.children.cycles-pp.__cond_resched
      2.20            +0.1        2.33        perf-profile.children.cycles-pp.page_cache_pipe_buf_confirm
      1.43 ±  2%      +0.2        1.58        perf-profile.children.cycles-pp.__fsnotify_parent
      0.00            +0.2        0.25        perf-profile.children.cycles-pp.free_unref_page_list
      0.00            +0.3        0.27 ±  3%  perf-profile.children.cycles-pp.lru_add_drain_cpu
      3.13            +0.3        3.46        perf-profile.children.cycles-pp.vfs_splice_read
      0.00            +0.5        0.47        perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
     10.17            +1.7       11.83        perf-profile.children.cycles-pp.page_cache_pipe_buf_release
     23.91 ±  4%      +2.2       26.14        perf-profile.children.cycles-pp.filemap_get_read_batch
     24.98 ±  3%      +2.3       27.29        perf-profile.children.cycles-pp.filemap_get_pages
     21.14            +2.4       23.56        perf-profile.children.cycles-pp.__splice_from_pipe
     22.34            +2.6       24.91        perf-profile.children.cycles-pp.splice_from_pipe
     22.54            +2.6       25.11        perf-profile.children.cycles-pp.direct_splice_actor
      0.00           +14.0       13.99        perf-profile.children.cycles-pp.release_pages
      0.00           +14.6       14.62        perf-profile.children.cycles-pp.__pagevec_release
      0.00           +18.3       18.30        perf-profile.children.cycles-pp.splice_folio_into_pipe
      0.00           +70.5       70.52        perf-profile.children.cycles-pp.filemap_splice_read
     16.46           -16.5        0.00        perf-profile.self.cycles-pp.filemap_read
     16.16           -16.2        0.00        perf-profile.self.cycles-pp.copy_page_to_iter_pipe
      0.95 ±  2%      -0.4        0.54        perf-profile.self.cycles-pp.atime_needs_update
      0.86            -0.4        0.50        perf-profile.self.cycles-pp.current_time
      0.50            -0.2        0.25 ±  2%  perf-profile.self.cycles-pp.touch_atime
      2.32 ±  2%      -0.1        2.19        perf-profile.self.cycles-pp.folio_mark_accessed
      0.27            -0.1        0.15 ±  3%  perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
      0.20 ±  3%      -0.1        0.11 ±  4%  perf-profile.self.cycles-pp.make_vfsgid
      0.19 ±  3%      -0.1        0.12 ±  3%  perf-profile.self.cycles-pp.make_vfsuid
      0.23 ±  2%      +0.0        0.25        perf-profile.self.cycles-pp.__get_task_ioprio
      0.23 ±  2%      +0.0        0.25 ±  2%  perf-profile.self.cycles-pp.aa_file_perm
      0.27 ±  3%      +0.0        0.29 ±  2%  perf-profile.self.cycles-pp.fsnotify_perm
      0.56            +0.0        0.58        perf-profile.self.cycles-pp.xas_descend
      0.14 ±  2%      +0.0        0.16 ±  2%  perf-profile.self.cycles-pp.rw_verify_area
      0.35            +0.0        0.38 ±  2%  perf-profile.self.cycles-pp.xas_start
      0.19 ±  2%      +0.0        0.22 ±  2%  perf-profile.self.cycles-pp.rcu_all_qs
      0.33            +0.0        0.35 ±  2%  perf-profile.self.cycles-pp.splice_direct_to_actor
      0.25 ±  2%      +0.0        0.28        perf-profile.self.cycles-pp.splice_from_pipe_next
      0.37            +0.0        0.41        perf-profile.self.cycles-pp.apparmor_file_permission
      0.48            +0.0        0.52        perf-profile.self.cycles-pp.pipe_to_null
      0.72            +0.0        0.76        perf-profile.self.cycles-pp.xas_load
      0.31 ±  2%      +0.0        0.35 ±  2%  perf-profile.self.cycles-pp.security_file_permission
      0.50 ±  2%      +0.0        0.54        perf-profile.self.cycles-pp.vfs_splice_read
      0.48 ±  2%      +0.1        0.54        perf-profile.self.cycles-pp.__cond_resched
      0.00            +0.1        0.07 ±  5%  perf-profile.self.cycles-pp.mlock_drain_local
      1.75            +0.1        1.85        perf-profile.self.cycles-pp.page_cache_pipe_buf_confirm
      1.02            +0.1        1.14        perf-profile.self.cycles-pp.filemap_get_pages
      1.39 ±  2%      +0.1        1.54        perf-profile.self.cycles-pp.__fsnotify_parent
      1.10 ±  2%      +0.2        1.25        perf-profile.self.cycles-pp.splice_from_pipe
      0.00            +0.2        0.19 ±  2%  perf-profile.self.cycles-pp.free_unref_page_list
      0.00            +0.2        0.24 ±  2%  perf-profile.self.cycles-pp.lru_add_drain_cpu
      0.00            +0.3        0.32 ±  2%  perf-profile.self.cycles-pp.__pagevec_release
      0.00            +0.4        0.40        perf-profile.self.cycles-pp.__mem_cgroup_uncharge_list
      8.70            +0.6        9.31        perf-profile.self.cycles-pp.__splice_from_pipe
      9.60            +1.6       11.20        perf-profile.self.cycles-pp.page_cache_pipe_buf_release
     22.00 ±  4%      +2.1       24.10        perf-profile.self.cycles-pp.filemap_get_read_batch
      0.00            +6.5        6.50        perf-profile.self.cycles-pp.filemap_splice_read
      0.00           +13.3       13.29        perf-profile.self.cycles-pp.release_pages
      0.00           +17.5       17.53        perf-profile.self.cycles-pp.splice_folio_into_pipe
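
As a reading aid for the table above, the %change column for the counter rows can be re-derived from the two value columns. The sketch below assumes the usual relative-change formula, (new - base) / base; the helper name is hypothetical and the numbers are copied from the table. The perf-profile rows appear to report an absolute difference in percentage points instead (for example 73.32 to 0.00 is shown as -73.3), so the formula does not apply to them.

```python
def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (base commit ab82513126, tested commit 2cb1e08985), values from the table
rows = {
    "stress-ng.sendfile.ops_per_sec": (638671, 712807),
    "stress-ng.sendfile.ops": (38320456, 42768635),
    "perf-stat.i.branch-misses": (61342100, 23631851),
}

for name, (base, new) in rows.items():
    print(f"{pct_change(base, new):+.1f}%  {name}")
# +11.6%  stress-ng.sendfile.ops_per_sec
# +11.6%  stress-ng.sendfile.ops
# -61.5%  perf-stat.i.branch-misses
```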




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



View attachment "config-6.4.0-rc2-00028-g2cb1e08985e3" of type "text/plain" (158731 bytes)

View attachment "job-script" of type "text/plain" (9244 bytes)

View attachment "job.yaml" of type "text/plain" (6456 bytes)

View attachment "reproduce" of type "text/plain" (341 bytes)
