Message-ID: <202306121557.2d17019b-oliver.sang@intel.com>
Date: Mon, 12 Jun 2023 15:45:09 +0800
From: kernel test robot <oliver.sang@...el.com>
To: David Howells <dhowells@...hat.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
Linux Memory Management List <linux-mm@...ck.org>,
Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
Christian Brauner <brauner@...nel.org>,
Al Viro <viro@...iv.linux.org.uk>,
David Hildenbrand <david@...hat.com>,
John Hubbard <jhubbard@...dia.com>,
<linux-block@...r.kernel.org>, <linux-fsdevel@...r.kernel.org>,
<linux-afs@...ts.infradead.org>, <linux-btrfs@...r.kernel.org>,
<ecryptfs@...r.kernel.org>, <linux-erofs@...ts.ozlabs.org>,
<linux-ext4@...r.kernel.org>, <cluster-devel@...hat.com>,
<linux-um@...ts.infradead.org>, <linux-mtd@...ts.infradead.org>,
<jfs-discussion@...ts.sourceforge.net>,
<linux-nilfs@...r.kernel.org>,
<linux-ntfs-dev@...ts.sourceforge.net>, <ntfs3@...ts.linux.dev>,
<ocfs2-devel@....oracle.com>,
<linux-karma-devel@...ts.sourceforge.net>,
<reiserfs-devel@...r.kernel.org>, <ying.huang@...el.com>,
<feng.tang@...el.com>, <fengwei.yin@...el.com>,
<oliver.sang@...el.com>
Subject: [linux-next:master] [splice] 2cb1e08985:
stress-ng.sendfile.ops_per_sec 11.6% improvement
Hello,
kernel test robot noticed an 11.6% improvement of stress-ng.sendfile.ops_per_sec on:
commit: 2cb1e08985e3dc59d0a4ebf770a87e3e2410d985 ("splice: Use filemap_splice_read() instead of generic_file_splice_read()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
class: pipe
test: sendfile
cpufreq_governor: performance
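For readers unfamiliar with the workload, the following is a minimal sketch of the kind of operation the sendfile test exercises: copying a regular file to /dev/null with sendfile(2), which on the kernel side goes through the splice read path changed by this commit. It is not stress-ng's implementation. The function name sendfile_to_devnull and the 64 KiB chunk size are assumptions made for illustration, and the sketch assumes a Linux host.

```python
import os
import tempfile


def sendfile_to_devnull(path, chunk=64 * 1024):
    """Send the whole file at `path` to /dev/null via sendfile(2).

    Returns the number of bytes sent. Hypothetical helper, Linux only.
    """
    total = 0
    with open(path, "rb") as src, open("/dev/null", "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        while total < size:
            # os.sendfile(out_fd, in_fd, offset, count) returns bytes sent.
            sent = os.sendfile(dst.fileno(), src.fileno(),
                               total, min(chunk, size - total))
            if sent == 0:  # unexpected end of file
                break
            total += sent
    return total


if __name__ == "__main__":
    # Create a 1 MiB scratch file and push it through sendfile once.
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"\0" * (1 << 20))
        tmp.flush()
        print(sendfile_to_devnull(tmp.name))
```

stress-ng repeats a similar transfer in a tight loop on every CPU (nr_threads: 100%), so per-call overhead in the read side of the splice dominates the result.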
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
=========================================================================================
class/compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
pipe/gcc-12/performance/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp8/sendfile/stress-ng/60s
commit:
ab82513126 ("cifs: Use filemap_splice_read()")
2cb1e08985 ("splice: Use filemap_splice_read() instead of generic_file_splice_read()")
ab82513126f8b426 2cb1e08985e3dc59d0a4ebf770a
---------------- ---------------------------
value [± %stddev] | %change | value [± %stddev] | metric
568180 -1.5% 559667 proc-vmstat.pgalloc_normal
348772 -1.7% 342744 proc-vmstat.pgfault
39953 +11.7% 44609 stress-ng.sendfile.MB_per_sec_sent_to_/dev/null
38320456 +11.6% 42768635 stress-ng.sendfile.ops
638671 +11.6% 712807 stress-ng.sendfile.ops_per_sec
0.18 ± 6% -0.1 0.11 ± 8% perf-stat.i.branch-miss-rate%
61342100 -61.5% 23631851 ± 3% perf-stat.i.branch-misses
0.74 +3.7% 0.77 perf-stat.i.cpi
0.28 ±222% -0.3 0.00 ± 4% perf-stat.i.dTLB-load-miss-rate%
7.958e+11 ±223% -100.0% 105622 ± 6% perf-stat.i.dTLB-load-misses
8.398e+10 -10.4% 7.528e+10 perf-stat.i.dTLB-loads
4.702e+10 -17.3% 3.888e+10 perf-stat.i.dTLB-stores
2.965e+11 -4.7% 2.825e+11 perf-stat.i.instructions
1.36 -4.0% 1.31 perf-stat.i.ipc
1632 -7.4% 1511 perf-stat.i.metric.M/sec
0.11 -0.1 0.04 ± 3% perf-stat.overall.branch-miss-rate%
0.73 +4.8% 0.76 perf-stat.overall.cpi
16.38 ±223% -16.4 0.00 ± 6% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 3% +0.0 0.00 ± 3% perf-stat.overall.dTLB-store-miss-rate%
1.38 -4.6% 1.32 perf-stat.overall.ipc
60279316 -61.5% 23221084 ± 3% perf-stat.ps.branch-misses
7.58e+11 ±223% -100.0% 104910 ± 6% perf-stat.ps.dTLB-load-misses
8.264e+10 -10.4% 7.408e+10 perf-stat.ps.dTLB-loads
4.628e+10 -17.3% 3.826e+10 perf-stat.ps.dTLB-stores
2.918e+11 -4.7% 2.78e+11 perf-stat.ps.instructions
1.832e+13 -4.8% 1.745e+13 perf-stat.total.instructions
73.32 -73.3 0.00 perf-profile.calltrace.cycles-pp.generic_file_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
68.75 -68.7 0.00 perf-profile.calltrace.cycles-pp.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
24.87 ± 3% -24.9 0.00 perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct
23.72 ± 4% -23.7 0.00 perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.generic_file_splice_read.splice_direct_to_actor
20.27 ± 3% -20.3 0.00 perf-profile.calltrace.cycles-pp.copy_page_to_iter_pipe.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct
0.58 +0.1 0.65 perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_splice_read.splice_direct_to_actor.do_splice_direct
0.80 +0.1 0.89 perf-profile.calltrace.cycles-pp.security_file_permission.vfs_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
1.78 +0.1 1.88 perf-profile.calltrace.cycles-pp.page_cache_pipe_buf_confirm.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor
1.33 ± 2% +0.1 1.47 perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
3.07 +0.3 3.39 perf-profile.calltrace.cycles-pp.vfs_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
0.00 +0.6 0.58 perf-profile.calltrace.cycles-pp.xas_descend.xas_load.filemap_get_read_batch.filemap_get_pages.filemap_splice_read
0.00 +0.6 0.61 perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.filemap_splice_read.splice_direct_to_actor
0.00 +1.3 1.30 perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_splice_read.splice_direct_to_actor.do_splice_direct
0.00 +1.6 1.58 perf-profile.calltrace.cycles-pp.xas_load.filemap_get_read_batch.filemap_get_pages.filemap_splice_read.splice_direct_to_actor
0.00 +1.7 1.65 perf-profile.calltrace.cycles-pp.touch_atime.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
10.13 +1.7 11.83 perf-profile.calltrace.cycles-pp.page_cache_pipe_buf_release.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor
0.00 +2.2 2.20 perf-profile.calltrace.cycles-pp.folio_mark_accessed.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
22.17 +2.6 24.73 perf-profile.calltrace.cycles-pp.splice_from_pipe.direct_splice_actor.splice_direct_to_actor.do_splice_direct.do_sendfile
22.48 +2.6 25.04 perf-profile.calltrace.cycles-pp.direct_splice_actor.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
20.17 +2.8 22.99 perf-profile.calltrace.cycles-pp.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor.do_splice_direct
0.00 +13.8 13.80 perf-profile.calltrace.cycles-pp.release_pages.__pagevec_release.filemap_splice_read.splice_direct_to_actor.do_splice_direct
0.00 +14.4 14.44 perf-profile.calltrace.cycles-pp.__pagevec_release.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
0.00 +18.5 18.54 perf-profile.calltrace.cycles-pp.splice_folio_into_pipe.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
0.00 +25.9 25.92 perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_splice_read.splice_direct_to_actor.do_splice_direct
0.00 +27.2 27.15 perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
0.00 +69.0 69.03 perf-profile.calltrace.cycles-pp.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
73.48 -73.5 0.00 perf-profile.children.cycles-pp.generic_file_splice_read
69.92 -69.9 0.00 perf-profile.children.cycles-pp.filemap_read
20.75 -20.8 0.00 perf-profile.children.cycles-pp.copy_page_to_iter_pipe
3.04 -1.3 1.75 perf-profile.children.cycles-pp.touch_atime
2.54 -1.1 1.47 perf-profile.children.cycles-pp.atime_needs_update
1.20 -0.5 0.69 perf-profile.children.cycles-pp.current_time
2.84 -0.2 2.64 perf-profile.children.cycles-pp.folio_mark_accessed
0.34 -0.1 0.19 ± 2% perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
0.26 ± 2% -0.1 0.15 ± 4% perf-profile.children.cycles-pp.make_vfsgid
0.25 ± 2% -0.1 0.16 ± 3% perf-profile.children.cycles-pp.make_vfsuid
0.08 +0.0 0.09 perf-profile.children.cycles-pp.pipe_unlock
0.26 +0.0 0.28 perf-profile.children.cycles-pp.__get_task_ioprio
0.26 +0.0 0.29 perf-profile.children.cycles-pp.aa_file_perm
0.30 ± 3% +0.0 0.33 perf-profile.children.cycles-pp.fsnotify_perm
0.18 ± 2% +0.0 0.20 ± 2% perf-profile.children.cycles-pp.rw_verify_area
0.69 +0.0 0.72 perf-profile.children.cycles-pp.xas_descend
0.42 +0.0 0.45 perf-profile.children.cycles-pp.xas_start
0.28 ± 2% +0.0 0.32 perf-profile.children.cycles-pp.splice_from_pipe_next
0.29 +0.0 0.33 ± 2% perf-profile.children.cycles-pp.rcu_all_qs
0.68 +0.1 0.75 perf-profile.children.cycles-pp.apparmor_file_permission
0.95 +0.1 1.04 perf-profile.children.cycles-pp.pipe_to_null
0.95 +0.1 1.06 perf-profile.children.cycles-pp.security_file_permission
1.76 +0.1 1.86 perf-profile.children.cycles-pp.xas_load
0.00 +0.1 0.11 perf-profile.children.cycles-pp.mlock_drain_local
0.76 ± 2% +0.1 0.87 perf-profile.children.cycles-pp.__cond_resched
2.20 +0.1 2.33 perf-profile.children.cycles-pp.page_cache_pipe_buf_confirm
1.43 ± 2% +0.2 1.58 perf-profile.children.cycles-pp.__fsnotify_parent
0.00 +0.2 0.25 perf-profile.children.cycles-pp.free_unref_page_list
0.00 +0.3 0.27 ± 3% perf-profile.children.cycles-pp.lru_add_drain_cpu
3.13 +0.3 3.46 perf-profile.children.cycles-pp.vfs_splice_read
0.00 +0.5 0.47 perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
10.17 +1.7 11.83 perf-profile.children.cycles-pp.page_cache_pipe_buf_release
23.91 ± 4% +2.2 26.14 perf-profile.children.cycles-pp.filemap_get_read_batch
24.98 ± 3% +2.3 27.29 perf-profile.children.cycles-pp.filemap_get_pages
21.14 +2.4 23.56 perf-profile.children.cycles-pp.__splice_from_pipe
22.34 +2.6 24.91 perf-profile.children.cycles-pp.splice_from_pipe
22.54 +2.6 25.11 perf-profile.children.cycles-pp.direct_splice_actor
0.00 +14.0 13.99 perf-profile.children.cycles-pp.release_pages
0.00 +14.6 14.62 perf-profile.children.cycles-pp.__pagevec_release
0.00 +18.3 18.30 perf-profile.children.cycles-pp.splice_folio_into_pipe
0.00 +70.5 70.52 perf-profile.children.cycles-pp.filemap_splice_read
16.46 -16.5 0.00 perf-profile.self.cycles-pp.filemap_read
16.16 -16.2 0.00 perf-profile.self.cycles-pp.copy_page_to_iter_pipe
0.95 ± 2% -0.4 0.54 perf-profile.self.cycles-pp.atime_needs_update
0.86 -0.4 0.50 perf-profile.self.cycles-pp.current_time
0.50 -0.2 0.25 ± 2% perf-profile.self.cycles-pp.touch_atime
2.32 ± 2% -0.1 2.19 perf-profile.self.cycles-pp.folio_mark_accessed
0.27 -0.1 0.15 ± 3% perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
0.20 ± 3% -0.1 0.11 ± 4% perf-profile.self.cycles-pp.make_vfsgid
0.19 ± 3% -0.1 0.12 ± 3% perf-profile.self.cycles-pp.make_vfsuid
0.23 ± 2% +0.0 0.25 perf-profile.self.cycles-pp.__get_task_ioprio
0.23 ± 2% +0.0 0.25 ± 2% perf-profile.self.cycles-pp.aa_file_perm
0.27 ± 3% +0.0 0.29 ± 2% perf-profile.self.cycles-pp.fsnotify_perm
0.56 +0.0 0.58 perf-profile.self.cycles-pp.xas_descend
0.14 ± 2% +0.0 0.16 ± 2% perf-profile.self.cycles-pp.rw_verify_area
0.35 +0.0 0.38 ± 2% perf-profile.self.cycles-pp.xas_start
0.19 ± 2% +0.0 0.22 ± 2% perf-profile.self.cycles-pp.rcu_all_qs
0.33 +0.0 0.35 ± 2% perf-profile.self.cycles-pp.splice_direct_to_actor
0.25 ± 2% +0.0 0.28 perf-profile.self.cycles-pp.splice_from_pipe_next
0.37 +0.0 0.41 perf-profile.self.cycles-pp.apparmor_file_permission
0.48 +0.0 0.52 perf-profile.self.cycles-pp.pipe_to_null
0.72 +0.0 0.76 perf-profile.self.cycles-pp.xas_load
0.31 ± 2% +0.0 0.35 ± 2% perf-profile.self.cycles-pp.security_file_permission
0.50 ± 2% +0.0 0.54 perf-profile.self.cycles-pp.vfs_splice_read
0.48 ± 2% +0.1 0.54 perf-profile.self.cycles-pp.__cond_resched
0.00 +0.1 0.07 ± 5% perf-profile.self.cycles-pp.mlock_drain_local
1.75 +0.1 1.85 perf-profile.self.cycles-pp.page_cache_pipe_buf_confirm
1.02 +0.1 1.14 perf-profile.self.cycles-pp.filemap_get_pages
1.39 ± 2% +0.1 1.54 perf-profile.self.cycles-pp.__fsnotify_parent
1.10 ± 2% +0.2 1.25 perf-profile.self.cycles-pp.splice_from_pipe
0.00 +0.2 0.19 ± 2% perf-profile.self.cycles-pp.free_unref_page_list
0.00 +0.2 0.24 ± 2% perf-profile.self.cycles-pp.lru_add_drain_cpu
0.00 +0.3 0.32 ± 2% perf-profile.self.cycles-pp.__pagevec_release
0.00 +0.4 0.40 perf-profile.self.cycles-pp.__mem_cgroup_uncharge_list
8.70 +0.6 9.31 perf-profile.self.cycles-pp.__splice_from_pipe
9.60 +1.6 11.20 perf-profile.self.cycles-pp.page_cache_pipe_buf_release
22.00 ± 4% +2.1 24.10 perf-profile.self.cycles-pp.filemap_get_read_batch
0.00 +6.5 6.50 perf-profile.self.cycles-pp.filemap_splice_read
0.00 +13.3 13.29 perf-profile.self.cycles-pp.release_pages
0.00 +17.5 17.53 perf-profile.self.cycles-pp.splice_folio_into_pipe
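As a reading aid for the comparison table above, here is a small sketch that recomputes a few of the %change values from the reported numbers. Two conventions appear in the table: counter-style metrics show a relative change in percent, while rate and perf-profile rows (already percentages) show an absolute difference in percentage points. The helper names pct_change and point_change are illustrative, not part of lkp-tests. The last figure, instructions per sendfile op, is a rough derived estimate that assumes perf-stat.total.instructions is dominated by the workload.

```python
def pct_change(base, new):
    """Relative change in percent, as used for counter-style rows."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Absolute difference, as used for rows that are already percentages."""
    return new - base


# (base, new) pairs copied from the table above.
relative_rows = {
    "stress-ng.sendfile.ops_per_sec": (638671, 712807),
    "stress-ng.sendfile.ops": (38320456, 42768635),
    "stress-ng.sendfile.MB_per_sec_sent_to_/dev/null": (39953, 44609),
    "perf-stat.i.branch-misses": (61342100, 23631851),
}
point_rows = {
    "calltrace generic_file_splice_read": (73.32, 0.00),
    "calltrace vfs_splice_read": (3.07, 3.39),
}

if __name__ == "__main__":
    for name, (base, new) in relative_rows.items():
        print(f"{name}: {pct_change(base, new):+.1f}%")
    for name, (base, new) in point_rows.items():
        print(f"{name}: {point_change(base, new):+.1f}")

    # Rough estimate: total instructions divided by total ops, before and after.
    per_op_base = 1.832e13 / 38320456
    per_op_new = 1.745e13 / 42768635
    print(f"instructions per op: {pct_change(per_op_base, per_op_new):+.1f}%")
```

Under these assumptions the relative rows reproduce the reported +11.6%, +11.6%, +11.7% and -61.5%, which is consistent with the throughput gain coming from less work per call rather than from higher IPC (IPC actually drops slightly).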
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
View attachment "config-6.4.0-rc2-00028-g2cb1e08985e3" of type "text/plain" (158731 bytes)
View attachment "job-script" of type "text/plain" (9244 bytes)
View attachment "job.yaml" of type "text/plain" (6456 bytes)
View attachment "reproduce" of type "text/plain" (341 bytes)