Message-ID: <202303071109.e9d1462d-yujie.liu@intel.com>
Date: Tue, 7 Mar 2023 11:07:25 +0800
From: kernel test robot <yujie.liu@...el.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
<linux-kernel@...r.kernel.org>, Paolo Abeni <pabeni@...hat.com>,
Ido Schimmel <idosch@...dia.com>, <netdev@...r.kernel.org>,
<ying.huang@...el.com>, <feng.tang@...el.com>,
<zhengjun.xing@...ux.intel.com>, <fengwei.yin@...el.com>
Subject: [linus:master] [net] f3412b3879: redis.set_total_throughput_rps -8.6% regression
Greetings,
FYI, we noticed a -8.6% regression of redis.set_total_throughput_rps due to commit:
commit: f3412b3879b4f7c4313b186b03940d4791345534 ("net: make sure net_rx_action() calls skb_defer_free_flush()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: redis
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory
with following parameters:
all: 1
sc_overcommit_memory: 1
sc_somaxconn: 65535
thp_enabled: never
thp_defrag: never
cluster: cs-localhost
cpu_node_bind: even
nr_processes: 4
test: set,get
data_size: 1024
n_client: 5
requests: 68000000
n_pipeline: 3
key_len: 68000000
cpufreq_governor: performance
test-description: Redis benchmark is a utility for checking the performance of Redis by running commands issued by N clients in parallel, sending M total queries (similar to Apache's ab utility).
test-url: https://redis.io/topics/benchmarks
If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot <yujie.liu@...el.com>
| Link: https://lore.kernel.org/oe-lkp/202303071109.e9d1462d-yujie.liu@intel.com
Details are as below:
=========================================================================================
all/cluster/compiler/cpu_node_bind/cpufreq_governor/data_size/kconfig/key_len/n_client/n_pipeline/nr_processes/requests/rootfs/sc_overcommit_memory/sc_somaxconn/tbox_group/test/testcase/thp_defrag/thp_enabled:
1/cs-localhost/gcc-11/even/performance/1024/x86_64-rhel-8.3/68000000/5/3/4/68000000/debian-11.1-x86_64-20220510.cgz/1/65535/lkp-csl-2sp7/set,get/redis/never/never
commit:
be5fd933f8 ("Merge branch 'add-reset-deassertion-for-aspeed-mdio'")
f3412b3879 ("net: make sure net_rx_action() calls skb_defer_free_flush()")
be5fd933f8c15967 f3412b3879b4f7c4313b186b039
---------------- ---------------------------
          %stddev     %change        %stddev
(columns: baseline value, %change, patched value, metric; ± marks run-to-run stddev)
252015 -9.5% 227984 ± 2% redis.get_avg_throughput_rps
67.46 +10.6% 74.62 ± 2% redis.get_avg_time_sec
756045 -9.5% 683953 ± 2% redis.get_total_throughput_rps
202.38 +10.6% 223.86 ± 2% redis.get_total_time_sec
205530 -8.6% 187839 redis.set_avg_throughput_rps
82.71 +9.5% 90.53 redis.set_avg_time_sec
616591 -8.6% 563518 redis.set_total_throughput_rps
248.14 +9.5% 271.59 redis.set_total_time_sec
154.24 +9.6% 169.06 ± 2% redis.time.elapsed_time
154.24 +9.6% 169.06 ± 2% redis.time.elapsed_time.max
42820 ± 3% +18.7% 50810 ± 8% redis.time.involuntary_context_switches
263.43 +5.4% 277.60 redis.time.system_time
1.17 +0.3 1.50 ± 2% mpstat.cpu.all.soft%
3.952e+08 +23.9% 4.898e+08 vmstat.memory.free
0.35 ± 10% -0.1 0.27 ± 14% turbostat.C1%
21655037 ± 24% +41.0% 30533175 ± 2% turbostat.C1E
8586843 -26.1% 6343681 numa-numastat.node0.local_node
8614193 -25.9% 6381593 numa-numastat.node0.numa_hit
11431037 -18.6% 9300164 numa-numastat.node1.local_node
11488598 -18.6% 9350032 numa-numastat.node1.numa_hit
3.939e+08 +23.8% 4.877e+08 meminfo.MemAvailable
3.961e+08 +23.7% 4.898e+08 meminfo.MemFree
1.32e+08 -71.0% 38219657 meminfo.Memused
48034792 -99.6% 186078 meminfo.SUnreclaim
48141413 -99.4% 292816 meminfo.Slab
1.32e+08 -70.8% 38570676 meminfo.max_used_kB
1.968e+08 +26.4% 2.487e+08 numa-meminfo.node0.MemFree
67057835 -77.4% 15180098 numa-meminfo.node0.MemUsed
24036023 ± 2% -99.5% 110016 ± 3% numa-meminfo.node0.SUnreclaim
24124421 ± 2% -99.2% 196729 ± 2% numa-meminfo.node0.Slab
22960 ±127% +211.3% 71480 ± 43% numa-meminfo.node1.Inactive(file)
1.992e+08 +21.0% 2.412e+08 numa-meminfo.node1.MemFree
64969647 -64.5% 23043393 numa-meminfo.node1.MemUsed
24021896 -99.7% 76037 ± 5% numa-meminfo.node1.SUnreclaim
24040119 -99.6% 96063 ± 6% numa-meminfo.node1.Slab
49193940 +26.4% 62165766 numa-vmstat.node0.nr_free_pages
6010453 ± 2% -99.5% 27506 ± 3% numa-vmstat.node0.nr_slab_unreclaimable
8614420 -25.9% 6381821 numa-vmstat.node0.numa_hit
8587070 -26.1% 6343909 numa-vmstat.node0.numa_local
49803307 +21.1% 60287281 numa-vmstat.node1.nr_free_pages
5740 ±127% +211.8% 17896 ± 43% numa-vmstat.node1.nr_inactive_file
6006861 -99.7% 19010 ± 5% numa-vmstat.node1.nr_slab_unreclaimable
5740 ±127% +211.8% 17896 ± 43% numa-vmstat.node1.nr_zone_inactive_file
11488668 -18.6% 9350066 numa-vmstat.node1.numa_hit
11431107 -18.6% 9300198 numa-vmstat.node1.numa_local
520.47 ± 25% +55.0% 806.75 ± 20% sched_debug.cfs_rq:/.load_avg.max
97.63 ± 24% +48.1% 144.55 ± 19% sched_debug.cfs_rq:/.load_avg.stddev
56.18 ± 64% +113.7% 120.03 ± 31% sched_debug.cfs_rq:/.removed.load_avg.stddev
3.31 ± 78% +193.3% 9.71 ± 39% sched_debug.cfs_rq:/.removed.runnable_avg.avg
21.14 ± 70% +147.9% 52.39 ± 37% sched_debug.cfs_rq:/.removed.runnable_avg.stddev
3.31 ± 78% +193.3% 9.71 ± 39% sched_debug.cfs_rq:/.removed.util_avg.avg
21.14 ± 70% +147.9% 52.39 ± 37% sched_debug.cfs_rq:/.removed.util_avg.stddev
47802 ± 32% -55.9% 21078 ± 36% sched_debug.cfs_rq:/.spread0.max
7311 ± 4% -21.3% 5756 ± 10% sched_debug.cpu.avg_idle.min
752476 ± 23% -66.9% 249133 ± 29% sched_debug.cpu.nr_switches.max
119990 ± 21% -54.4% 54695 ± 24% sched_debug.cpu.nr_switches.stddev
1883 ± 3% +9.6% 2064 ± 8% proc-vmstat.nr_active_anon
9833460 +23.8% 12175527 proc-vmstat.nr_dirty_background_threshold
19690964 +23.8% 24380824 proc-vmstat.nr_dirty_threshold
99012757 +23.7% 1.225e+08 proc-vmstat.nr_free_pages
5515 ± 3% -11.1% 4901 ± 13% proc-vmstat.nr_shmem
12009233 -99.6% 46518 proc-vmstat.nr_slab_unreclaimable
1883 ± 3% +9.6% 2064 ± 8% proc-vmstat.nr_zone_active_anon
20104413 -21.7% 15734336 proc-vmstat.numa_hit
20021345 -21.8% 15647296 proc-vmstat.numa_local
20106105 -21.7% 15735973 proc-vmstat.pgalloc_normal
2391744 ± 3% +105.8% 4922820 proc-vmstat.pgfree
24271 +5.9% 25697 ± 2% proc-vmstat.pgreuse
2.685e+09 -7.1% 2.494e+09 ± 2% perf-stat.i.branch-instructions
1.10 +0.0 1.12 perf-stat.i.branch-miss-rate%
30474911 -5.1% 28916630 ± 3% perf-stat.i.branch-misses
86117392 -6.2% 80770756 ± 2% perf-stat.i.cache-misses
1.039e+08 -7.1% 96556388 ± 2% perf-stat.i.cache-references
1.40 +8.7% 1.52 ± 3% perf-stat.i.cpi
220.93 +7.0% 236.43 ± 3% perf-stat.i.cycles-between-cache-misses
3.851e+09 -7.4% 3.568e+09 ± 2% perf-stat.i.dTLB-loads
606669 -7.8% 559360 perf-stat.i.dTLB-store-misses
2.094e+09 -7.4% 1.94e+09 ± 2% perf-stat.i.dTLB-stores
16798215 -5.8% 15820186 ± 2% perf-stat.i.iTLB-load-misses
1.353e+10 -7.2% 1.256e+10 ± 2% perf-stat.i.instructions
0.72 -7.6% 0.67 ± 3% perf-stat.i.ipc
657.86 ± 5% +52.4% 1002 ± 17% perf-stat.i.metric.K/sec
90.85 -7.7% 83.88 ± 2% perf-stat.i.metric.M/sec
167749 -7.1% 155916 perf-stat.i.minor-faults
74.54 +3.6 78.17 perf-stat.i.node-load-miss-rate%
26164317 +6.4% 27849868 ± 2% perf-stat.i.node-load-misses
9207088 -14.1% 7912609 perf-stat.i.node-loads
57.34 ± 11% +26.2 83.58 perf-stat.i.node-store-miss-rate%
6626833 ± 12% +41.2% 9354265 ± 3% perf-stat.i.node-store-misses
5283168 ± 14% -60.9% 2064586 ± 2% perf-stat.i.node-stores
167752 -7.1% 155918 perf-stat.i.page-faults
1.14 +0.0 1.16 perf-stat.overall.branch-miss-rate%
1.39 +8.5% 1.51 ± 3% perf-stat.overall.cpi
218.81 +7.3% 234.87 ± 3% perf-stat.overall.cycles-between-cache-misses
0.72 -7.8% 0.66 ± 3% perf-stat.overall.ipc
73.98 +3.9 77.87 perf-stat.overall.node-load-miss-rate%
55.58 ± 11% +26.3 81.92 perf-stat.overall.node-store-miss-rate%
2.667e+09 -7.1% 2.479e+09 ± 2% perf-stat.ps.branch-instructions
30284745 -5.1% 28735640 ± 3% perf-stat.ps.branch-misses
85544824 -6.1% 80295476 ± 2% perf-stat.ps.cache-misses
1.032e+08 -7.0% 95982345 ± 2% perf-stat.ps.cache-references
3.826e+09 -7.3% 3.547e+09 ± 2% perf-stat.ps.dTLB-loads
603282 -7.8% 555963 perf-stat.ps.dTLB-store-misses
2.081e+09 -7.3% 1.928e+09 ± 2% perf-stat.ps.dTLB-stores
16687441 -5.7% 15728404 ± 2% perf-stat.ps.iTLB-load-misses
1.345e+10 -7.2% 1.248e+10 ± 2% perf-stat.ps.instructions
166736 -7.0% 154987 perf-stat.ps.minor-faults
25993409 +6.5% 27688192 ± 2% perf-stat.ps.node-load-misses
9142252 -14.0% 7866166 perf-stat.ps.node-loads
6583167 ± 12% +41.3% 9299683 ± 3% perf-stat.ps.node-store-misses
5252541 ± 14% -61.0% 2051094 ± 2% perf-stat.ps.node-stores
166739 -7.0% 154989 perf-stat.ps.page-faults
2.078e+12 +1.7% 2.114e+12 perf-stat.total.instructions
1.38 ± 9% -1.0 0.36 ± 70% perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
1.45 ± 7% -0.3 1.12 ± 17% perf-profile.calltrace.cycles-pp.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
1.01 ± 9% -0.2 0.85 ± 11% perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.sock_read_iter
0.00 +0.7 0.65 ± 6% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.net_rx_action.__softirqentry_text_start.do_softirq.__local_bh_enable_ip
0.00 +0.7 0.67 ± 29% perf-profile.calltrace.cycles-pp.__kfree_skb.net_rx_action.__softirqentry_text_start.do_softirq.__local_bh_enable_ip
0.00 +0.9 0.86 ± 29% perf-profile.calltrace.cycles-pp.skb_release_data.__kfree_skb.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established
0.00 +0.9 0.91 ± 29% perf-profile.calltrace.cycles-pp.__kfree_skb.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv
1.50 ± 11% +2.0 3.45 ± 4% perf-profile.calltrace.cycles-pp.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
2.76 ± 9% +2.0 4.72 ± 5% perf-profile.calltrace.cycles-pp.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu
11.76 ± 7% +2.4 14.18 ± 6% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
12.09 ± 7% +2.5 14.57 ± 5% perf-profile.calltrace.cycles-pp.__libc_write
11.93 ± 7% +2.5 14.42 ± 5% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_write
11.87 ± 7% +2.5 14.36 ± 5% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
14.67 ± 7% +3.7 18.40 ± 7% perf-profile.calltrace.cycles-pp.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
12.72 ± 7% +4.1 16.84 ± 7% perf-profile.calltrace.cycles-pp.net_rx_action.__softirqentry_text_start.do_softirq.__local_bh_enable_ip.ip_finish_output2
13.00 ± 7% +4.1 17.13 ± 7% perf-profile.calltrace.cycles-pp.__softirqentry_text_start.do_softirq.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit
13.05 ± 7% +4.1 17.18 ± 7% perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb
13.06 ± 7% +4.1 17.21 ± 7% perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
0.68 ± 9% -0.4 0.25 ± 19% perf-profile.children.cycles-pp.tcp_cleanup_rbuf
1.39 ± 9% -0.4 0.97 ± 7% perf-profile.children.cycles-pp.__dev_queue_xmit
0.54 ± 9% -0.3 0.20 ± 13% perf-profile.children.cycles-pp.___slab_alloc
1.46 ± 8% -0.3 1.20 ± 4% perf-profile.children.cycles-pp.__alloc_skb
0.44 ± 9% -0.2 0.28 ± 7% perf-profile.children.cycles-pp.kmalloc_reserve
0.42 ± 9% -0.2 0.27 ± 7% perf-profile.children.cycles-pp.__kmalloc_node_track_caller
0.35 ± 10% -0.2 0.20 ± 8% perf-profile.children.cycles-pp.kmem_cache_alloc_node
0.14 ± 13% -0.1 0.08 ± 10% perf-profile.children.cycles-pp.__alloc_pages
0.13 ± 12% -0.1 0.07 ± 18% perf-profile.children.cycles-pp.get_page_from_freelist
0.13 ± 9% -0.0 0.10 ± 14% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.08 ± 7% +0.0 0.10 ± 10% perf-profile.children.cycles-pp.obj_cgroup_charge
0.09 ± 12% +0.1 0.16 ± 9% perf-profile.children.cycles-pp.validate_xmit_skb
0.00 +0.1 0.09 ± 8% perf-profile.children.cycles-pp.free_unref_page
0.46 ± 16% +0.1 0.57 ± 7% perf-profile.children.cycles-pp.skb_page_frag_refill
0.46 ± 15% +0.1 0.58 ± 8% perf-profile.children.cycles-pp.sk_page_frag_refill
0.71 ± 11% +0.1 0.85 ± 7% perf-profile.children.cycles-pp.__skb_clone
0.05 ± 9% +0.2 0.24 ± 22% perf-profile.children.cycles-pp.kfree_skbmem
0.01 ±200% +0.3 0.32 ± 4% perf-profile.children.cycles-pp.kfree
0.14 ± 12% +0.4 0.55 perf-profile.children.cycles-pp.__ksize
0.00 +0.6 0.55 ± 4% perf-profile.children.cycles-pp.dst_release
0.04 ± 90% +0.6 0.60 ± 5% perf-profile.children.cycles-pp.skb_release_head_state
0.14 ± 21% +0.6 0.72 ± 8% perf-profile.children.cycles-pp.__slab_free
0.14 ± 17% +0.6 0.72 ± 8% perf-profile.children.cycles-pp.skb_attempt_defer_free
0.78 ± 6% +0.6 1.36 ± 8% perf-profile.children.cycles-pp.kmem_cache_free
0.29 ± 28% +1.2 1.47 ± 5% perf-profile.children.cycles-pp.skb_release_data
1.14 ± 8% +1.5 2.66 ± 5% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.55 ± 23% +1.6 2.14 ± 5% perf-profile.children.cycles-pp.__kfree_skb
1.51 ± 11% +2.0 3.47 ± 4% perf-profile.children.cycles-pp.tcp_clean_rtx_queue
2.77 ± 9% +2.0 4.74 ± 5% perf-profile.children.cycles-pp.tcp_ack
14.68 ± 7% +3.7 18.40 ± 7% perf-profile.children.cycles-pp.ip_finish_output2
14.24 ± 7% +3.9 18.13 ± 7% perf-profile.children.cycles-pp.__softirqentry_text_start
13.05 ± 7% +4.1 17.18 ± 7% perf-profile.children.cycles-pp.do_softirq
12.72 ± 7% +4.1 16.86 ± 7% perf-profile.children.cycles-pp.net_rx_action
13.11 ± 7% +4.1 17.26 ± 7% perf-profile.children.cycles-pp.__local_bh_enable_ip
0.83 ± 10% -0.5 0.34 ± 8% perf-profile.self.cycles-pp.__dev_queue_xmit
0.66 ± 9% -0.4 0.24 ± 18% perf-profile.self.cycles-pp.tcp_cleanup_rbuf
0.53 ± 8% -0.4 0.16 ± 15% perf-profile.self.cycles-pp.__alloc_skb
0.48 ± 10% -0.2 0.25 ± 12% perf-profile.self.cycles-pp.__ip_queue_xmit
0.20 ± 19% -0.1 0.06 ± 48% perf-profile.self.cycles-pp.__kfree_skb
0.41 ± 18% -0.1 0.28 ± 16% perf-profile.self.cycles-pp.tcp_rtt_estimator
0.14 ± 6% -0.0 0.11 ± 25% perf-profile.self.cycles-pp.orc_find
0.12 ± 17% +0.0 0.15 ± 7% perf-profile.self.cycles-pp.__kmalloc_node_track_caller
0.10 ± 11% +0.1 0.15 ± 14% perf-profile.self.cycles-pp.___slab_alloc
0.02 ±122% +0.1 0.10 ± 8% perf-profile.self.cycles-pp.validate_xmit_skb
0.00 +0.1 0.09 ± 14% perf-profile.self.cycles-pp.kfree
0.40 ± 16% +0.1 0.50 ± 10% perf-profile.self.cycles-pp.skb_page_frag_refill
0.65 ± 10% +0.1 0.80 ± 7% perf-profile.self.cycles-pp.__skb_clone
0.05 ± 9% +0.2 0.22 ± 17% perf-profile.self.cycles-pp.kfree_skbmem
0.14 ± 16% +0.2 0.34 ± 10% perf-profile.self.cycles-pp.kmem_cache_free
0.14 ± 11% +0.4 0.54 ± 2% perf-profile.self.cycles-pp.__ksize
0.00 +0.5 0.54 ± 5% perf-profile.self.cycles-pp.dst_release
0.14 ± 21% +0.6 0.70 ± 10% perf-profile.self.cycles-pp.__slab_free
0.36 ± 12% +0.6 1.00 ± 5% perf-profile.self.cycles-pp.tcp_clean_rtx_queue
0.15 ± 6% +0.7 0.89 ± 8% perf-profile.self.cycles-pp.net_rx_action
0.24 ± 21% +0.8 1.05 ± 7% perf-profile.self.cycles-pp.skb_release_data
1.12 ± 7% +1.5 2.64 ± 5% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# If you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
Attachments:
  config-5.18.0-rc3-00740-gf3412b3879b4 (text/plain, 162375 bytes)
  job-script (text/plain, 8499 bytes)
  job.yaml (text/plain, 5844 bytes)
  reproduce (text/plain, 1313 bytes)