Message-ID: <202502051330.4d2f403b-lkp@intel.com>
Date: Wed, 5 Feb 2025 14:13:48 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Vadim Fedorenko <vadfed@...a.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Jakub Kicinski <kuba@...nel.org>, Willem de Bruijn <willemb@...gle.com>,
Jason Xing <kerneljasonxing@...il.com>, <netdev@...r.kernel.org>,
<linux-alpha@...r.kernel.org>, <linux-mips@...r.kernel.org>,
<linux-parisc@...r.kernel.org>, <sparclinux@...r.kernel.org>,
<linux-arch@...r.kernel.org>, <oliver.sang@...el.com>
Subject: [linus:master] [net_tstamp] 4aecca4c76: redis.get_total_throughput_rps 1.5% improvement
Hello,
kernel test robot noticed a 1.5% improvement of redis.get_total_throughput_rps on:
commit: 4aecca4c76808f3736056d18ff510df80424bc9f ("net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: redis
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
	all: 1
	sc_overcommit_memory: 1
	sc_somaxconn: 65535
	thp_enabled: never
	thp_defrag: never
	cluster: cs-localhost
	cpu_node_bind: even
	nr_processes: 4
	test: set,get
	data_size: 1024
	n_client: 5
	requests: 68000000
	n_pipeline: 3
	key_len: 68000000
	cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250205/202502051330.4d2f403b-lkp@intel.com
=========================================================================================
all/cluster/compiler/cpu_node_bind/cpufreq_governor/data_size/kconfig/key_len/n_client/n_pipeline/nr_processes/requests/rootfs/sc_overcommit_memory/sc_somaxconn/tbox_group/test/testcase/thp_defrag/thp_enabled:
1/cs-localhost/gcc-12/even/performance/1024/x86_64-rhel-9.4/68000000/5/3/4/68000000/debian-12-x86_64-20240206.cgz/1/65535/lkp-icl-2sp7/set,get/redis/never/never
commit:
34ea1df802 ("Merge branch 'net-mlx5-hw-counters-refactor'")
4aecca4c76 ("net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message")
34ea1df802f79d44 4aecca4c76808f3736056d18ff5
---------------- ---------------------------
%stddev %change %stddev
\ | \
18491785 +2.1% 18880098 proc-vmstat.numa_hint_faults
18483590 +2.0% 18850441 proc-vmstat.numa_hint_faults_local
8589 ± 97% +255.7% 30553 ± 17% proc-vmstat.numa_pages_migrated
21039386 +2.2% 21505792 proc-vmstat.numa_pte_updates
8589 ± 97% +255.7% 30553 ± 17% proc-vmstat.pgmigrate_success
25696 ± 12% +14.4% 29397 proc-vmstat.pgreuse
252371 +1.5% 256108 redis.get_avg_throughput_rps
67.36 -1.5% 66.38 redis.get_avg_time_sec
1009486 +1.5% 1024432 redis.get_total_throughput_rps
269.45 -1.5% 265.52 redis.get_total_time_sec
257.67 -1.1% 254.83 redis.time.percent_of_cpu_this_job_got
337.27 -2.4% 329.05 redis.time.system_time
3.957e+09 +1.3% 4.008e+09 perf-stat.i.branch-instructions
38469227 +1.6% 39070923 perf-stat.i.branch-misses
32.20 +0.8 33.01 perf-stat.i.cache-miss-rate%
136208 +1.2% 137857 perf-stat.i.context-switches
1.34 -1.0% 1.32 perf-stat.i.cpi
1.948e+10 +1.3% 1.974e+10 perf-stat.i.instructions
9.12 +2.2% 9.32 perf-stat.i.metric.K/sec
224090 +2.5% 229667 perf-stat.i.minor-faults
224090 +2.5% 229667 perf-stat.i.page-faults
1.33 -34.1% 0.88 ± 70% perf-stat.overall.cpi
714.76 -33.9% 472.47 ± 70% perf-stat.overall.cycles-between-cache-misses
1.095e+08 -34.2% 72001076 ± 70% perf-stat.ps.cache-references
15.93 -0.8 15.15 perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter
15.95 -0.7 15.22 perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter.vfs_write
14.40 -0.7 13.72 perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto
14.43 -0.6 13.79 perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
21.35 -0.6 20.74 perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter.vfs_write.ksys_write
21.50 -0.5 20.96 perf-profile.calltrace.cycles-pp.tcp_sendmsg.sock_write_iter.vfs_write.ksys_write.do_syscall_64
16.98 -0.5 16.44 perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
17.15 -0.5 16.66 perf-profile.calltrace.cycles-pp.tcp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
21.61 -0.5 21.14 perf-profile.calltrace.cycles-pp.sock_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
22.26 -0.4 21.84 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
21.76 -0.4 21.34 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
22.24 -0.4 21.82 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
21.95 -0.4 21.53 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
22.65 -0.4 22.24 perf-profile.calltrace.cycles-pp.write
17.28 -0.4 16.87 perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
17.30 -0.4 16.92 perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
17.49 -0.4 17.12 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
17.51 -0.4 17.14 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__send
17.92 -0.4 17.57 perf-profile.calltrace.cycles-pp.__send
0.57 +0.0 0.62 ± 3% perf-profile.calltrace.cycles-pp.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter.vfs_write
0.74 ± 2% +0.1 0.79 ± 3% perf-profile.calltrace.cycles-pp.tcp_stream_alloc_skb.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
1.34 +0.1 1.40 perf-profile.calltrace.cycles-pp.__inet_lookup_skb.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core
3.85 +0.1 3.94 perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
1.87 ± 3% +0.1 1.97 perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
5.14 +0.1 5.24 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
5.14 +0.1 5.25 perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
5.54 +0.1 5.64 perf-profile.calltrace.cycles-pp.common_startup_64
5.13 +0.1 5.24 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
10.08 +0.1 10.20 perf-profile.calltrace.cycles-pp.do_epoll_ctl.__x64_sys_epoll_ctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.epoll_ctl
10.48 +0.1 10.61 perf-profile.calltrace.cycles-pp.__x64_sys_epoll_ctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.epoll_ctl
11.15 +0.1 11.28 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.epoll_ctl
11.03 +0.1 11.18 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.epoll_ctl
6.79 +0.2 6.95 perf-profile.calltrace.cycles-pp.dictFind
12.44 +0.2 12.62 perf-profile.calltrace.cycles-pp.epoll_ctl
15.91 +0.2 16.10 perf-profile.calltrace.cycles-pp.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
16.00 +0.2 16.21 perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
16.03 +0.2 16.24 perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
16.78 +0.2 17.01 perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.handle_softirqs.do_softirq
16.54 +0.2 16.78 perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.handle_softirqs
16.80 +0.2 17.04 perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.handle_softirqs.do_softirq.__local_bh_enable_ip
20.70 +0.3 20.96 perf-profile.calltrace.cycles-pp.net_rx_action.handle_softirqs.do_softirq.__local_bh_enable_ip.__dev_queue_xmit
21.08 +0.3 21.34 perf-profile.calltrace.cycles-pp.handle_softirqs.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2
21.15 +0.3 21.42 perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit
21.23 +0.3 21.50 perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb
22.73 +0.3 23.03 perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
23.44 +0.3 23.77 perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked
22.94 +0.3 23.28 perf-profile.calltrace.cycles-pp.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
26.32 +0.4 26.68 perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg
30.37 -1.5 28.92 perf-profile.children.cycles-pp.tcp_write_xmit
30.39 -1.4 29.03 perf-profile.children.cycles-pp.__tcp_push_pending_frames
38.37 -1.1 37.23 perf-profile.children.cycles-pp.tcp_sendmsg_locked
38.66 -1.0 37.67 perf-profile.children.cycles-pp.tcp_sendmsg
1.32 -0.9 0.38 ± 2% perf-profile.children.cycles-pp.tcp_event_new_data_sent
1.80 ± 2% -0.9 0.88 ± 2% perf-profile.children.cycles-pp.tcp_check_space
1.19 ± 2% -0.9 0.27 ± 4% perf-profile.children.cycles-pp.__mod_timer
1.22 ± 2% -0.9 0.30 ± 3% perf-profile.children.cycles-pp.sk_reset_timer
66.87 -0.5 66.34 perf-profile.children.cycles-pp.do_syscall_64
67.19 -0.5 66.67 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
21.61 -0.5 21.15 perf-profile.children.cycles-pp.sock_write_iter
21.82 -0.4 21.39 perf-profile.children.cycles-pp.vfs_write
22.02 -0.4 21.60 perf-profile.children.cycles-pp.ksys_write
17.29 -0.4 16.88 perf-profile.children.cycles-pp.__sys_sendto
22.78 -0.4 22.37 perf-profile.children.cycles-pp.write
17.31 -0.4 16.94 perf-profile.children.cycles-pp.__x64_sys_sendto
18.00 -0.4 17.65 perf-profile.children.cycles-pp.__send
0.23 ± 5% -0.1 0.16 ± 6% perf-profile.children.cycles-pp.tcp_event_data_recv
0.12 ± 4% +0.0 0.13 ± 3% perf-profile.children.cycles-pp.validate_xmit_skb
0.38 ± 2% +0.0 0.40 ± 3% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.30 ± 4% +0.0 0.32 ± 3% perf-profile.children.cycles-pp.pick_next_task_fair
0.51 ± 2% +0.0 0.54 ± 2% perf-profile.children.cycles-pp._copy_from_iter
0.18 ± 5% +0.0 0.21 ± 2% perf-profile.children.cycles-pp.tcp_schedule_loss_probe
1.35 +0.1 1.40 perf-profile.children.cycles-pp.__inet_lookup_skb
1.49 +0.1 1.56 perf-profile.children.cycles-pp.tcp_stream_alloc_skb
0.76 +0.1 0.83 perf-profile.children.cycles-pp.skb_do_copy_data_nocache
4.02 +0.1 4.09 perf-profile.children.cycles-pp.cpuidle_enter
4.00 +0.1 4.07 perf-profile.children.cycles-pp.cpuidle_enter_state
0.24 ± 5% +0.1 0.32 ± 5% perf-profile.children.cycles-pp.release_sock
4.22 +0.1 4.32 perf-profile.children.cycles-pp.cpuidle_idle_call
1.93 ± 2% +0.1 2.03 perf-profile.children.cycles-pp.intel_idle
5.53 +0.1 5.63 perf-profile.children.cycles-pp.do_idle
5.14 +0.1 5.25 perf-profile.children.cycles-pp.start_secondary
5.54 +0.1 5.64 perf-profile.children.cycles-pp.common_startup_64
5.54 +0.1 5.64 perf-profile.children.cycles-pp.cpu_startup_entry
10.12 +0.1 10.24 perf-profile.children.cycles-pp.do_epoll_ctl
10.50 +0.1 10.63 perf-profile.children.cycles-pp.__x64_sys_epoll_ctl
12.76 +0.2 12.93 perf-profile.children.cycles-pp.epoll_ctl
6.88 +0.2 7.05 perf-profile.children.cycles-pp.dictFind
15.94 +0.2 16.13 perf-profile.children.cycles-pp.tcp_v4_rcv
16.04 +0.2 16.25 perf-profile.children.cycles-pp.ip_local_deliver_finish
16.02 +0.2 16.23 perf-profile.children.cycles-pp.ip_protocol_deliver_rcu
16.55 +0.2 16.78 perf-profile.children.cycles-pp.__netif_receive_skb_one_core
16.81 +0.2 17.05 perf-profile.children.cycles-pp.__napi_poll
16.78 +0.2 17.02 perf-profile.children.cycles-pp.process_backlog
21.54 +0.3 21.79 perf-profile.children.cycles-pp.handle_softirqs
20.72 +0.3 20.98 perf-profile.children.cycles-pp.net_rx_action
21.16 +0.3 21.43 perf-profile.children.cycles-pp.do_softirq
21.31 +0.3 21.60 perf-profile.children.cycles-pp.__local_bh_enable_ip
22.76 +0.3 23.06 perf-profile.children.cycles-pp.__dev_queue_xmit
22.96 +0.3 23.29 perf-profile.children.cycles-pp.ip_finish_output2
23.46 +0.3 23.80 perf-profile.children.cycles-pp.__ip_queue_xmit
26.38 +0.3 26.72 perf-profile.children.cycles-pp.__tcp_transmit_skb
1.13 ± 2% -1.0 0.18 ± 3% perf-profile.self.cycles-pp.__mod_timer
1.79 ± 2% -0.9 0.87 ± 2% perf-profile.self.cycles-pp.tcp_check_space
0.22 ± 6% -0.1 0.15 ± 7% perf-profile.self.cycles-pp.tcp_event_data_recv
0.48 +0.0 0.50 ± 2% perf-profile.self.cycles-pp.mod_objcg_state
0.32 ± 2% +0.0 0.34 perf-profile.self.cycles-pp.call
0.50 ± 2% +0.0 0.52 ± 2% perf-profile.self.cycles-pp._copy_from_iter
0.17 ± 4% +0.0 0.19 ± 3% perf-profile.self.cycles-pp.ip_finish_output2
0.11 ± 9% +0.0 0.14 ± 3% perf-profile.self.cycles-pp.tcp_event_new_data_sent
0.27 ± 4% +0.0 0.30 ± 3% perf-profile.self.cycles-pp.__alloc_skb
0.21 +0.0 0.24 ± 3% perf-profile.self.cycles-pp.kfree_skbmem
0.36 ± 3% +0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_bh
0.56 ± 2% +0.0 0.60 ± 2% perf-profile.self.cycles-pp.kmem_cache_free
0.13 ± 5% +0.0 0.17 ± 5% perf-profile.self.cycles-pp.vfs_write
0.00 +0.1 0.06 ± 9% perf-profile.self.cycles-pp.__x64_sys_sendto
0.08 ± 8% +0.1 0.14 ± 4% perf-profile.self.cycles-pp.sock_write_iter
0.02 ± 99% +0.1 0.09 ± 11% perf-profile.self.cycles-pp.__sys_sendto
3.40 +0.1 3.50 perf-profile.self.cycles-pp.tcp_sendmsg_locked
1.93 ± 2% +0.1 2.03 perf-profile.self.cycles-pp.intel_idle
0.00 +0.1 0.11 ± 5% perf-profile.self.cycles-pp.__tcp_push_pending_frames
6.66 +0.1 6.78 perf-profile.self.cycles-pp.dictFind
0.00 +0.1 0.13 ± 2% perf-profile.self.cycles-pp.tcp_sendmsg
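For readers who want to sanity-check the %change column, the relative changes for the redis.* rows can be recomputed from the two per-commit values in the table. The following is only a minimal sketch, not part of the LKP tooling: the helper name pct_change and the choice of rows are assumptions made for illustration, and the numbers are copied from the table above.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change from the base commit to the patched commit, in percent."""
    return (patched - base) / base * 100.0


# (metric, value on 34ea1df802, value on 4aecca4c76), copied from the table
rows = [
    ("redis.get_avg_throughput_rps", 252371, 256108),
    ("redis.get_total_throughput_rps", 1009486, 1024432),
    ("redis.get_total_time_sec", 269.45, 265.52),
    ("redis.time.system_time", 337.27, 329.05),
]

for name, base, patched in rows:
    # Print with one decimal place, matching the precision used in the report.
    print(f"{name}: {pct_change(base, patched):+.1f}%")
```

Note that the perf-profile.* rows carry no percent sign in the change column, so they appear to be absolute differences in percentage points rather than relative changes; the helper above does not apply to them.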
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki