Message-ID: <202502051330.4d2f403b-lkp@intel.com>
Date: Wed, 5 Feb 2025 14:13:48 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Vadim Fedorenko <vadfed@...a.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Jakub Kicinski <kuba@...nel.org>, Willem de Bruijn <willemb@...gle.com>,
	Jason Xing <kerneljasonxing@...il.com>, <netdev@...r.kernel.org>,
	<linux-alpha@...r.kernel.org>, <linux-mips@...r.kernel.org>,
	<linux-parisc@...r.kernel.org>, <sparclinux@...r.kernel.org>,
	<linux-arch@...r.kernel.org>, <oliver.sang@...el.com>
Subject: [linus:master] [net_tstamp]  4aecca4c76:
 redis.get_total_throughput_rps 1.5% improvement


Hello,

kernel test robot noticed a 1.5% improvement of redis.get_total_throughput_rps on:


commit: 4aecca4c76808f3736056d18ff510df80424bc9f ("net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: redis
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	all: 1
	sc_overcommit_memory: 1
	sc_somaxconn: 65535
	thp_enabled: never
	thp_defrag: never
	cluster: cs-localhost
	cpu_node_bind: even
	nr_processes: 4
	test: set,get
	data_size: 1024
	n_client: 5
	requests: 68000000
	n_pipeline: 3
	key_len: 68000000
	cpufreq_governor: performance






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250205/202502051330.4d2f403b-lkp@intel.com

=========================================================================================
all/cluster/compiler/cpu_node_bind/cpufreq_governor/data_size/kconfig/key_len/n_client/n_pipeline/nr_processes/requests/rootfs/sc_overcommit_memory/sc_somaxconn/tbox_group/test/testcase/thp_defrag/thp_enabled:
  1/cs-localhost/gcc-12/even/performance/1024/x86_64-rhel-9.4/68000000/5/3/4/68000000/debian-12-x86_64-20240206.cgz/1/65535/lkp-icl-2sp7/set,get/redis/never/never
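The two lines above pair up position by position: the first is a slash-separated list of parameter names, the second holds the matching values. A minimal Python sketch for reading them, assuming that neither names nor values contain a '/' (which holds for this report). The variable names are illustrative and not part of lkp-tests:

```python
# Parameter names and values copied verbatim from the report.
keys_line = (
    "all/cluster/compiler/cpu_node_bind/cpufreq_governor/data_size/kconfig/"
    "key_len/n_client/n_pipeline/nr_processes/requests/rootfs/"
    "sc_overcommit_memory/sc_somaxconn/tbox_group/test/testcase/"
    "thp_defrag/thp_enabled:"
)
values_line = (
    "1/cs-localhost/gcc-12/even/performance/1024/x86_64-rhel-9.4/68000000/"
    "5/3/4/68000000/debian-12-x86_64-20240206.cgz/1/65535/lkp-icl-2sp7/"
    "set,get/redis/never/never"
)

# Drop the trailing ':' from the names line, then split both on '/'.
keys = keys_line.rstrip(":").split("/")
values = values_line.strip().split("/")

# A length mismatch would mean the '/'-free assumption does not hold.
if len(keys) != len(values):
    raise ValueError("names and values do not line up")

params = dict(zip(keys, values))

for name, value in params.items():
    print(f"{name:22} {value}")
```

Note that tbox_group, rootfs and kconfig appear only in this path form and not in the parameter list near the top of the report.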

commit: 
  34ea1df802 ("Merge branch 'net-mlx5-hw-counters-refactor'")
  4aecca4c76 ("net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message")

34ea1df802f79d44 4aecca4c76808f3736056d18ff5 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  18491785            +2.1%   18880098        proc-vmstat.numa_hint_faults
  18483590            +2.0%   18850441        proc-vmstat.numa_hint_faults_local
      8589 ± 97%    +255.7%      30553 ± 17%  proc-vmstat.numa_pages_migrated
  21039386            +2.2%   21505792        proc-vmstat.numa_pte_updates
      8589 ± 97%    +255.7%      30553 ± 17%  proc-vmstat.pgmigrate_success
     25696 ± 12%     +14.4%      29397        proc-vmstat.pgreuse
    252371            +1.5%     256108        redis.get_avg_throughput_rps
     67.36            -1.5%      66.38        redis.get_avg_time_sec
   1009486            +1.5%    1024432        redis.get_total_throughput_rps
    269.45            -1.5%     265.52        redis.get_total_time_sec
    257.67            -1.1%     254.83        redis.time.percent_of_cpu_this_job_got
    337.27            -2.4%     329.05        redis.time.system_time
 3.957e+09            +1.3%  4.008e+09        perf-stat.i.branch-instructions
  38469227            +1.6%   39070923        perf-stat.i.branch-misses
     32.20            +0.8       33.01        perf-stat.i.cache-miss-rate%
    136208            +1.2%     137857        perf-stat.i.context-switches
      1.34            -1.0%       1.32        perf-stat.i.cpi
 1.948e+10            +1.3%  1.974e+10        perf-stat.i.instructions
      9.12            +2.2%       9.32        perf-stat.i.metric.K/sec
    224090            +2.5%     229667        perf-stat.i.minor-faults
    224090            +2.5%     229667        perf-stat.i.page-faults
      1.33           -34.1%       0.88 ± 70%  perf-stat.overall.cpi
    714.76           -33.9%     472.47 ± 70%  perf-stat.overall.cycles-between-cache-misses
 1.095e+08           -34.2%   72001076 ± 70%  perf-stat.ps.cache-references
     15.93            -0.8       15.15        perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter
     15.95            -0.7       15.22        perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter.vfs_write
     14.40            -0.7       13.72        perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto
     14.43            -0.6       13.79        perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
     21.35            -0.6       20.74        perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter.vfs_write.ksys_write
     21.50            -0.5       20.96        perf-profile.calltrace.cycles-pp.tcp_sendmsg.sock_write_iter.vfs_write.ksys_write.do_syscall_64
     16.98            -0.5       16.44        perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
     17.15            -0.5       16.66        perf-profile.calltrace.cycles-pp.tcp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
     21.61            -0.5       21.14        perf-profile.calltrace.cycles-pp.sock_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     22.26            -0.4       21.84        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
     21.76            -0.4       21.34        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     22.24            -0.4       21.82        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     21.95            -0.4       21.53        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     22.65            -0.4       22.24        perf-profile.calltrace.cycles-pp.write
     17.28            -0.4       16.87        perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
     17.30            -0.4       16.92        perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
     17.49            -0.4       17.12        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
     17.51            -0.4       17.14        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__send
     17.92            -0.4       17.57        perf-profile.calltrace.cycles-pp.__send
      0.57            +0.0        0.62 ±  3%  perf-profile.calltrace.cycles-pp.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter.vfs_write
      0.74 ±  2%      +0.1        0.79 ±  3%  perf-profile.calltrace.cycles-pp.tcp_stream_alloc_skb.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
      1.34            +0.1        1.40        perf-profile.calltrace.cycles-pp.__inet_lookup_skb.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core
      3.85            +0.1        3.94        perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
      1.87 ±  3%      +0.1        1.97        perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      5.14            +0.1        5.24        perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
      5.14            +0.1        5.25        perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
      5.54            +0.1        5.64        perf-profile.calltrace.cycles-pp.common_startup_64
      5.13            +0.1        5.24        perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
     10.08            +0.1       10.20        perf-profile.calltrace.cycles-pp.do_epoll_ctl.__x64_sys_epoll_ctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.epoll_ctl
     10.48            +0.1       10.61        perf-profile.calltrace.cycles-pp.__x64_sys_epoll_ctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.epoll_ctl
     11.15            +0.1       11.28        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.epoll_ctl
     11.03            +0.1       11.18        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.epoll_ctl
      6.79            +0.2        6.95        perf-profile.calltrace.cycles-pp.dictFind
     12.44            +0.2       12.62        perf-profile.calltrace.cycles-pp.epoll_ctl
     15.91            +0.2       16.10        perf-profile.calltrace.cycles-pp.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
     16.00            +0.2       16.21        perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
     16.03            +0.2       16.24        perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
     16.78            +0.2       17.01        perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.handle_softirqs.do_softirq
     16.54            +0.2       16.78        perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.handle_softirqs
     16.80            +0.2       17.04        perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.handle_softirqs.do_softirq.__local_bh_enable_ip
     20.70            +0.3       20.96        perf-profile.calltrace.cycles-pp.net_rx_action.handle_softirqs.do_softirq.__local_bh_enable_ip.__dev_queue_xmit
     21.08            +0.3       21.34        perf-profile.calltrace.cycles-pp.handle_softirqs.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2
     21.15            +0.3       21.42        perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit
     21.23            +0.3       21.50        perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb
     22.73            +0.3       23.03        perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
     23.44            +0.3       23.77        perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked
     22.94            +0.3       23.28        perf-profile.calltrace.cycles-pp.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
     26.32            +0.4       26.68        perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg
     30.37            -1.5       28.92        perf-profile.children.cycles-pp.tcp_write_xmit
     30.39            -1.4       29.03        perf-profile.children.cycles-pp.__tcp_push_pending_frames
     38.37            -1.1       37.23        perf-profile.children.cycles-pp.tcp_sendmsg_locked
     38.66            -1.0       37.67        perf-profile.children.cycles-pp.tcp_sendmsg
      1.32            -0.9        0.38 ±  2%  perf-profile.children.cycles-pp.tcp_event_new_data_sent
      1.80 ±  2%      -0.9        0.88 ±  2%  perf-profile.children.cycles-pp.tcp_check_space
      1.19 ±  2%      -0.9        0.27 ±  4%  perf-profile.children.cycles-pp.__mod_timer
      1.22 ±  2%      -0.9        0.30 ±  3%  perf-profile.children.cycles-pp.sk_reset_timer
     66.87            -0.5       66.34        perf-profile.children.cycles-pp.do_syscall_64
     67.19            -0.5       66.67        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     21.61            -0.5       21.15        perf-profile.children.cycles-pp.sock_write_iter
     21.82            -0.4       21.39        perf-profile.children.cycles-pp.vfs_write
     22.02            -0.4       21.60        perf-profile.children.cycles-pp.ksys_write
     17.29            -0.4       16.88        perf-profile.children.cycles-pp.__sys_sendto
     22.78            -0.4       22.37        perf-profile.children.cycles-pp.write
     17.31            -0.4       16.94        perf-profile.children.cycles-pp.__x64_sys_sendto
     18.00            -0.4       17.65        perf-profile.children.cycles-pp.__send
      0.23 ±  5%      -0.1        0.16 ±  6%  perf-profile.children.cycles-pp.tcp_event_data_recv
      0.12 ±  4%      +0.0        0.13 ±  3%  perf-profile.children.cycles-pp.validate_xmit_skb
      0.38 ±  2%      +0.0        0.40 ±  3%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.30 ±  4%      +0.0        0.32 ±  3%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.51 ±  2%      +0.0        0.54 ±  2%  perf-profile.children.cycles-pp._copy_from_iter
      0.18 ±  5%      +0.0        0.21 ±  2%  perf-profile.children.cycles-pp.tcp_schedule_loss_probe
      1.35            +0.1        1.40        perf-profile.children.cycles-pp.__inet_lookup_skb
      1.49            +0.1        1.56        perf-profile.children.cycles-pp.tcp_stream_alloc_skb
      0.76            +0.1        0.83        perf-profile.children.cycles-pp.skb_do_copy_data_nocache
      4.02            +0.1        4.09        perf-profile.children.cycles-pp.cpuidle_enter
      4.00            +0.1        4.07        perf-profile.children.cycles-pp.cpuidle_enter_state
      0.24 ±  5%      +0.1        0.32 ±  5%  perf-profile.children.cycles-pp.release_sock
      4.22            +0.1        4.32        perf-profile.children.cycles-pp.cpuidle_idle_call
      1.93 ±  2%      +0.1        2.03        perf-profile.children.cycles-pp.intel_idle
      5.53            +0.1        5.63        perf-profile.children.cycles-pp.do_idle
      5.14            +0.1        5.25        perf-profile.children.cycles-pp.start_secondary
      5.54            +0.1        5.64        perf-profile.children.cycles-pp.common_startup_64
      5.54            +0.1        5.64        perf-profile.children.cycles-pp.cpu_startup_entry
     10.12            +0.1       10.24        perf-profile.children.cycles-pp.do_epoll_ctl
     10.50            +0.1       10.63        perf-profile.children.cycles-pp.__x64_sys_epoll_ctl
     12.76            +0.2       12.93        perf-profile.children.cycles-pp.epoll_ctl
      6.88            +0.2        7.05        perf-profile.children.cycles-pp.dictFind
     15.94            +0.2       16.13        perf-profile.children.cycles-pp.tcp_v4_rcv
     16.04            +0.2       16.25        perf-profile.children.cycles-pp.ip_local_deliver_finish
     16.02            +0.2       16.23        perf-profile.children.cycles-pp.ip_protocol_deliver_rcu
     16.55            +0.2       16.78        perf-profile.children.cycles-pp.__netif_receive_skb_one_core
     16.81            +0.2       17.05        perf-profile.children.cycles-pp.__napi_poll
     16.78            +0.2       17.02        perf-profile.children.cycles-pp.process_backlog
     21.54            +0.3       21.79        perf-profile.children.cycles-pp.handle_softirqs
     20.72            +0.3       20.98        perf-profile.children.cycles-pp.net_rx_action
     21.16            +0.3       21.43        perf-profile.children.cycles-pp.do_softirq
     21.31            +0.3       21.60        perf-profile.children.cycles-pp.__local_bh_enable_ip
     22.76            +0.3       23.06        perf-profile.children.cycles-pp.__dev_queue_xmit
     22.96            +0.3       23.29        perf-profile.children.cycles-pp.ip_finish_output2
     23.46            +0.3       23.80        perf-profile.children.cycles-pp.__ip_queue_xmit
     26.38            +0.3       26.72        perf-profile.children.cycles-pp.__tcp_transmit_skb
      1.13 ±  2%      -1.0        0.18 ±  3%  perf-profile.self.cycles-pp.__mod_timer
      1.79 ±  2%      -0.9        0.87 ±  2%  perf-profile.self.cycles-pp.tcp_check_space
      0.22 ±  6%      -0.1        0.15 ±  7%  perf-profile.self.cycles-pp.tcp_event_data_recv
      0.48            +0.0        0.50 ±  2%  perf-profile.self.cycles-pp.mod_objcg_state
      0.32 ±  2%      +0.0        0.34        perf-profile.self.cycles-pp.call
      0.50 ±  2%      +0.0        0.52 ±  2%  perf-profile.self.cycles-pp._copy_from_iter
      0.17 ±  4%      +0.0        0.19 ±  3%  perf-profile.self.cycles-pp.ip_finish_output2
      0.11 ±  9%      +0.0        0.14 ±  3%  perf-profile.self.cycles-pp.tcp_event_new_data_sent
      0.27 ±  4%      +0.0        0.30 ±  3%  perf-profile.self.cycles-pp.__alloc_skb
      0.21            +0.0        0.24 ±  3%  perf-profile.self.cycles-pp.kfree_skbmem
      0.36 ±  3%      +0.0        0.40        perf-profile.self.cycles-pp._raw_spin_lock_bh
      0.56 ±  2%      +0.0        0.60 ±  2%  perf-profile.self.cycles-pp.kmem_cache_free
      0.13 ±  5%      +0.0        0.17 ±  5%  perf-profile.self.cycles-pp.vfs_write
      0.00            +0.1        0.06 ±  9%  perf-profile.self.cycles-pp.__x64_sys_sendto
      0.08 ±  8%      +0.1        0.14 ±  4%  perf-profile.self.cycles-pp.sock_write_iter
      0.02 ± 99%      +0.1        0.09 ± 11%  perf-profile.self.cycles-pp.__sys_sendto
      3.40            +0.1        3.50        perf-profile.self.cycles-pp.tcp_sendmsg_locked
      1.93 ±  2%      +0.1        2.03        perf-profile.self.cycles-pp.intel_idle
      0.00            +0.1        0.11 ±  5%  perf-profile.self.cycles-pp.__tcp_push_pending_frames
      6.66            +0.1        6.78        perf-profile.self.cycles-pp.dictFind
      0.00            +0.1        0.13 ±  2%  perf-profile.self.cycles-pp.tcp_sendmsg
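
A note on reading the table above: rows whose middle column carries a % sign report the change relative to the base commit, while rows without one (cache-miss-rate% and the perf-profile entries) appear to report the absolute difference in percentage points. A small Python sketch checking a few rows under that reading. The helper names are hypothetical, and the numbers are copied from the table, so they are already rounded and a recomputed delta can differ in the last digit:

```python
def pct_change(base, new):
    """Relative change of new versus base, in percent."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference, for metrics that are already percentages."""
    return new - base


# (base commit 34ea1df802, commit under test 4aecca4c76)
relative_rows = {
    "redis.get_total_throughput_rps": (1009486, 1024432),
    "redis.get_avg_throughput_rps": (252371, 256108),
    "redis.time.system_time": (337.27, 329.05),
}
absolute_rows = {
    "perf-stat.i.cache-miss-rate%": (32.20, 33.01),
    "perf-profile.children.cycles-pp.tcp_check_space": (1.80, 0.88),
}

for name, (base, new) in relative_rows.items():
    print(f"{name}: {pct_change(base, new):+.1f}%")

for name, (base, new) in absolute_rows.items():
    print(f"{name}: {point_delta(base, new):+.1f}")

# Observation, not something the report states: the total GET throughput
# is close to nr_processes (4) times the average per-process throughput.
nr_processes = 4
print(nr_processes * 252371, "vs reported total", 1009486)
```

The recomputed figures should agree with the %change column to the precision shown; the small gap between 4 x 252371 and the reported total is consistent with the averages having been rounded.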




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

