Message-ID: <20220228155733.GF1643@xsang-OptiPlex-9020>
Date: Mon, 28 Feb 2022 23:57:33 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Mel Gorman <mgorman@...hsingularity.net>
Cc: lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
feng.tang@...el.com, zhengjun.xing@...ux.intel.com,
fengwei.yin@...el.com, LKML <linux-kernel@...r.kernel.org>
Subject: [mm/page_alloc] 39907a939a: netperf.Throughput_Mbps -18.1% regression
Greetings,
FYI, we noticed a -18.1% regression of netperf.Throughput_Mbps due to commit:
commit: 39907a939a34033eeea112751f0e4330628d3a9a ("mm/page_alloc: Limit number of high-order pages on PCP during bulk free")
https://git.kernel.org/cgit/linux/kernel/git/mel/linux.git mm-pcpllist-v1r2
in testcase: netperf
on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz with 128G memory
with the following parameters:
ip: ipv4
runtime: 300s
nr_threads: 1
cluster: cs-localhost
test: UDP_STREAM
cpufreq_governor: performance
ucode: 0xd000331
test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
test-url: http://www.netperf.org/netperf/
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
cs-localhost/gcc-9/performance/ipv4/x86_64-rhel-8.3/1/debian-10.4-x86_64-20200603.cgz/300s/lkp-icl-2sp4/UDP_STREAM/netperf/0xd000331
commit:
2009ed59ab ("mm/page_alloc: Free pages in a single pass during bulk free")
39907a939a ("mm/page_alloc: Limit number of high-order pages on PCP during bulk free")
2009ed59ab8200e6 39907a939a34033eeea112751f0
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
122291 -18.1% 100120 netperf.Throughput_Mbps
122291 -18.1% 100120 netperf.Throughput_total_Mbps
90.83 -2.0% 89.00 netperf.time.percent_of_cpu_this_job_got
70006621 -18.1% 57314514 netperf.workload
75331549 ± 3% +14.8% 86505421 cpuidle..usage
29422 +31.0% 38553 ± 2% meminfo.Shmem
9976 ± 36% +111.0% 21053 ± 30% numa-meminfo.node1.Shmem
77114773 +14.8% 88560654 turbostat.IRQ
1.371e+08 -28.6% 97904821 ± 44% numa-numastat.node0.local_node
1.366e+08 -29.0% 96946927 ± 44% numa-numastat.node0.numa_hit
1754 ± 2% +6848.5% 121876 ± 2% vmstat.system.cs
253706 +14.8% 291171 vmstat.system.in
35.54 ± 3% -7.2% 32.98 ± 3% boot-time.boot
17.61 ± 6% -14.1% 15.13 ± 8% boot-time.dhcp
4043 ± 3% -8.1% 3714 ± 4% boot-time.idle
69337635 -28.5% 49607209 ± 43% numa-vmstat.node0.numa_hit
69744503 -28.2% 50050712 ± 43% numa-vmstat.node0.numa_local
2499 ± 36% +112.0% 5297 ± 29% numa-vmstat.node1.nr_shmem
248090 ± 6% +24.4% 308574 ± 5% perf-stat.i.cache-misses
1683 ± 2% +7192.7% 122793 ± 2% perf-stat.i.context-switches
8145 ± 17% +37.1% 11170 ± 12% perf-stat.i.node-loads
35521 ± 18% +35.9% 48285 ± 18% perf-stat.i.node-stores
0.05 ± 14% +0.0 0.06 ± 8% perf-stat.overall.cache-miss-rate%
49509 ± 15% -20.5% 39345 ± 8% perf-stat.overall.cycles-between-cache-misses
18429 +22.1% 22495 perf-stat.overall.path-length
247210 ± 6% +24.4% 307535 ± 5% perf-stat.ps.cache-misses
1677 ± 2% +7194.0% 122383 ± 2% perf-stat.ps.context-switches
8114 ± 17% +37.2% 11131 ± 12% perf-stat.ps.node-loads
35383 ± 18% +36.0% 48111 ± 18% perf-stat.ps.node-stores
71035 +2.8% 73029 proc-vmstat.nr_inactive_anon
9465 +4.4% 9881 ± 2% proc-vmstat.nr_mapped
7362 +30.3% 9592 ± 3% proc-vmstat.nr_shmem
71035 +2.8% 73029 proc-vmstat.nr_zone_inactive_anon
1.371e+08 -14.3% 1.174e+08 ± 2% proc-vmstat.numa_hit
1.375e+08 -14.1% 1.182e+08 proc-vmstat.numa_local
15448 ±110% +207.4% 47492 ± 28% proc-vmstat.numa_pte_updates
8244 ± 4% +1194.8% 106745 ± 6% proc-vmstat.pgactivate
1.352e+08 -14.5% 1.155e+08 proc-vmstat.pgalloc_normal
1059186 +1.2% 1072281 proc-vmstat.pgfault
1.352e+08 -14.5% 1.156e+08 proc-vmstat.pgfree
24.78 ± 8% -5.4 19.42 ± 15% perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg.__sys_recvfrom
24.81 ± 8% -5.4 19.45 ± 15% perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
23.21 ± 8% -5.0 18.16 ± 15% perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg
22.81 ± 8% -5.0 17.80 ± 15% perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg
22.64 ± 8% -5.0 17.68 ± 15% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter
1.63 ± 18% +0.5 2.16 ± 9% perf-profile.calltrace.cycles-pp.ip_rcv.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
1.50 ± 18% +0.5 2.05 ± 9% perf-profile.calltrace.cycles-pp.ip_local_deliver.ip_rcv.__netif_receive_skb_one_core.process_backlog.__napi_poll
1.48 ± 18% +0.6 2.03 ± 9% perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.ip_local_deliver.ip_rcv.__netif_receive_skb_one_core.process_backlog
1.46 ± 18% +0.6 2.02 ± 9% perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.ip_local_deliver.ip_rcv.__netif_receive_skb_one_core
1.37 ± 18% +0.6 1.94 ± 9% perf-profile.calltrace.cycles-pp.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.ip_local_deliver.ip_rcv
1.09 ± 19% +0.6 1.70 ± 10% perf-profile.calltrace.cycles-pp.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.ip_local_deliver
1.03 ± 18% +0.7 1.68 ± 10% perf-profile.calltrace.cycles-pp.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish
0.00 +0.7 0.66 ± 9% perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page.skb_release_data.__consume_stateless_skb.udp_recvmsg
0.00 +0.7 0.69 ± 12% perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable
0.00 +0.7 0.70 ± 12% perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable.__udp_enqueue_schedule_skb
0.00 +0.7 0.74 ± 16% perf-profile.calltrace.cycles-pp.__schedule.schedule_idle.do_idle.cpu_startup_entry.start_secondary
0.00 +0.8 0.77 ± 16% perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
0.00 +0.8 0.81 ± 13% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.sock_def_readable.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb
0.00 +0.8 0.85 ± 12% perf-profile.calltrace.cycles-pp.__wake_up_common_lock.sock_def_readable.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb
0.00 +0.9 0.88 ± 16% perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp
0.00 +0.9 0.90 ± 15% perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
0.00 +0.9 0.94 ± 12% perf-profile.calltrace.cycles-pp.sock_def_readable.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv
0.00 +1.0 0.97 ± 15% perf-profile.calltrace.cycles-pp.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg.inet_recvmsg
0.18 ±141% +1.1 1.28 ± 9% perf-profile.calltrace.cycles-pp.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu
0.00 +1.2 1.24 ± 15% perf-profile.calltrace.cycles-pp.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg.inet_recvmsg.__sys_recvfrom
0.51 ± 45% +1.6 2.09 ± 16% perf-profile.calltrace.cycles-pp.__skb_recv_udp.udp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
24.80 ± 8% -5.4 19.44 ± 15% perf-profile.children.cycles-pp.__skb_datagram_iter
24.81 ± 8% -5.4 19.46 ± 15% perf-profile.children.cycles-pp.skb_copy_datagram_iter
23.23 ± 8% -5.0 18.18 ± 15% perf-profile.children.cycles-pp._copy_to_iter
22.84 ± 8% -5.0 17.82 ± 15% perf-profile.children.cycles-pp.copyout
0.41 ± 16% -0.2 0.22 ± 27% perf-profile.children.cycles-pp.udp_rmem_release
0.52 ± 8% -0.1 0.39 ± 12% perf-profile.children.cycles-pp.free_pcp_prepare
0.18 ± 18% -0.1 0.06 ± 45% perf-profile.children.cycles-pp.free_unref_page_commit
0.08 ± 19% -0.0 0.04 ± 45% perf-profile.children.cycles-pp.kmem_cache_free
0.10 ± 16% +0.0 0.15 ± 8% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.03 ±100% +0.1 0.08 ± 13% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
0.00 +0.1 0.06 ± 13% perf-profile.children.cycles-pp.ttwu_do_wakeup
0.00 +0.1 0.06 ± 16% perf-profile.children.cycles-pp.__update_load_avg_se
0.00 +0.1 0.06 ± 16% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.00 +0.1 0.06 ± 21% perf-profile.children.cycles-pp.nohz_run_idle_balance
0.00 +0.1 0.07 ± 15% perf-profile.children.cycles-pp.__switch_to_asm
0.00 +0.1 0.07 ± 21% perf-profile.children.cycles-pp.llist_add_batch
0.00 +0.1 0.07 ± 21% perf-profile.children.cycles-pp.__smp_call_single_queue
0.02 ±141% +0.1 0.10 ± 32% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.09 ± 20% +0.1 0.17 ± 7% perf-profile.children.cycles-pp.__list_add_valid
0.00 +0.1 0.08 ± 24% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.00 +0.1 0.08 ± 14% perf-profile.children.cycles-pp.prepare_to_wait_exclusive
0.20 ± 19% +0.1 0.29 ± 19% perf-profile.children.cycles-pp.skb_set_owner_w
0.00 +0.1 0.09 ± 27% perf-profile.children.cycles-pp.flush_smp_call_function_queue
0.08 ± 17% +0.1 0.17 ± 26% perf-profile.children.cycles-pp.__sk_mem_reduce_allocated
0.07 ± 80% +0.1 0.17 ± 27% perf-profile.children.cycles-pp._raw_spin_lock_bh
0.00 +0.1 0.14 ± 12% perf-profile.children.cycles-pp.set_next_entity
0.04 ± 72% +0.1 0.18 ± 15% perf-profile.children.cycles-pp.__zone_watermark_ok
0.00 +0.2 0.16 ± 18% perf-profile.children.cycles-pp.enqueue_entity
0.00 +0.2 0.17 ± 24% perf-profile.children.cycles-pp.sched_ttwu_pending
0.00 +0.2 0.18 ± 10% perf-profile.children.cycles-pp.__switch_to
0.00 +0.2 0.18 ± 14% perf-profile.children.cycles-pp.update_load_avg
0.00 +0.2 0.19 ± 16% perf-profile.children.cycles-pp.ttwu_queue_wakelist
0.00 +0.2 0.20 ± 12% perf-profile.children.cycles-pp.enqueue_task_fair
0.30 ± 7% +0.2 0.51 ± 7% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.00 +0.2 0.21 ± 12% perf-profile.children.cycles-pp.ttwu_do_activate
0.00 +0.2 0.22 ± 14% perf-profile.children.cycles-pp.update_curr
0.00 +0.2 0.23 ± 14% perf-profile.children.cycles-pp.pick_next_task_fair
0.00 +0.3 0.25 ± 26% perf-profile.children.cycles-pp.__sysvec_call_function_single
0.00 +0.3 0.31 ± 25% perf-profile.children.cycles-pp.sysvec_call_function_single
0.36 ± 20% +0.3 0.67 ± 9% perf-profile.children.cycles-pp.free_pcppages_bulk
0.00 +0.4 0.36 ± 27% perf-profile.children.cycles-pp.finish_task_switch
0.00 +0.4 0.38 ± 16% perf-profile.children.cycles-pp.dequeue_entity
0.00 +0.4 0.41 ± 16% perf-profile.children.cycles-pp.dequeue_task_fair
0.00 +0.5 0.47 ± 24% perf-profile.children.cycles-pp.asm_sysvec_call_function_single
3.39 ± 4% +0.5 3.87 ± 6% perf-profile.children.cycles-pp.__softirqentry_text_start
1.63 ± 18% +0.5 2.16 ± 9% perf-profile.children.cycles-pp.ip_rcv
1.50 ± 18% +0.5 2.05 ± 9% perf-profile.children.cycles-pp.ip_local_deliver
1.48 ± 18% +0.6 2.04 ± 9% perf-profile.children.cycles-pp.ip_local_deliver_finish
1.47 ± 18% +0.6 2.02 ± 9% perf-profile.children.cycles-pp.ip_protocol_deliver_rcu
1.38 ± 18% +0.6 1.95 ± 9% perf-profile.children.cycles-pp.__udp4_lib_rcv
1.10 ± 19% +0.6 1.71 ± 10% perf-profile.children.cycles-pp.udp_unicast_rcv_skb
1.04 ± 18% +0.7 1.69 ± 10% perf-profile.children.cycles-pp.udp_queue_rcv_one_skb
0.00 +0.7 0.71 ± 12% perf-profile.children.cycles-pp.autoremove_wake_function
0.00 +0.7 0.71 ± 12% perf-profile.children.cycles-pp.try_to_wake_up
0.00 +0.8 0.78 ± 16% perf-profile.children.cycles-pp.schedule_idle
0.00 +0.8 0.82 ± 13% perf-profile.children.cycles-pp.__wake_up_common
0.46 ± 15% +0.8 1.29 ± 9% perf-profile.children.cycles-pp.__udp_enqueue_schedule_skb
0.00 +0.9 0.86 ± 56% perf-profile.children.cycles-pp.poll_idle
0.00 +0.9 0.86 ± 12% perf-profile.children.cycles-pp.__wake_up_common_lock
0.03 ±100% +0.9 0.95 ± 12% perf-profile.children.cycles-pp.sock_def_readable
0.02 ±142% +0.9 0.94 ± 14% perf-profile.children.cycles-pp.schedule
0.00 +1.0 0.98 ± 14% perf-profile.children.cycles-pp.schedule_timeout
0.00 +1.2 1.24 ± 15% perf-profile.children.cycles-pp.__skb_wait_for_more_packets
0.59 ± 13% +1.5 2.11 ± 16% perf-profile.children.cycles-pp.__skb_recv_udp
0.07 ± 16% +1.6 1.68 ± 14% perf-profile.children.cycles-pp.__schedule
0.23 ± 18% -0.2 0.03 ±103% perf-profile.self.cycles-pp.udp_rmem_release
0.52 ± 9% -0.1 0.38 ± 12% perf-profile.self.cycles-pp.free_pcp_prepare
0.15 ± 15% -0.1 0.04 ± 71% perf-profile.self.cycles-pp.free_unref_page_commit
0.29 ± 11% -0.1 0.20 ± 19% perf-profile.self.cycles-pp.__skb_datagram_iter
0.25 ± 13% -0.1 0.16 ± 17% perf-profile.self.cycles-pp.udp_recvmsg
0.14 ± 16% -0.0 0.10 ± 10% perf-profile.self.cycles-pp.__alloc_pages
0.08 ± 20% -0.0 0.03 ± 70% perf-profile.self.cycles-pp.kmem_cache_free
0.00 +0.1 0.06 ± 13% perf-profile.self.cycles-pp.__update_load_avg_se
0.00 +0.1 0.07 ± 15% perf-profile.self.cycles-pp.__switch_to_asm
0.00 +0.1 0.07 ± 23% perf-profile.self.cycles-pp.llist_add_batch
0.02 ±142% +0.1 0.09 ± 23% perf-profile.self.cycles-pp.sock_def_readable
0.00 +0.1 0.07 ± 11% perf-profile.self.cycles-pp.schedule_timeout
0.00 +0.1 0.08 ± 20% perf-profile.self.cycles-pp.enqueue_entity
0.00 +0.1 0.08 ± 26% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.08 ± 21% +0.1 0.16 ± 5% perf-profile.self.cycles-pp.__list_add_valid
0.02 ±141% +0.1 0.10 ± 32% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.20 ± 17% +0.1 0.29 ± 19% perf-profile.self.cycles-pp.skb_set_owner_w
0.00 +0.1 0.09 ± 17% perf-profile.self.cycles-pp.set_next_entity
0.00 +0.1 0.09 ± 16% perf-profile.self.cycles-pp.update_curr
0.17 ± 19% +0.1 0.27 ± 11% perf-profile.self.cycles-pp.skb_page_frag_refill
0.07 ± 12% +0.1 0.17 ± 25% perf-profile.self.cycles-pp.__sk_mem_reduce_allocated
0.06 ± 79% +0.1 0.16 ± 28% perf-profile.self.cycles-pp._raw_spin_lock_bh
0.00 +0.1 0.11 ± 19% perf-profile.self.cycles-pp.__wake_up_common
0.00 +0.1 0.13 ± 24% perf-profile.self.cycles-pp.try_to_wake_up
0.00 +0.1 0.13 ± 21% perf-profile.self.cycles-pp.__skb_wait_for_more_packets
0.05 ± 75% +0.1 0.18 ± 6% perf-profile.self.cycles-pp.update_rq_clock
0.00 +0.2 0.15 ± 27% perf-profile.self.cycles-pp.finish_task_switch
0.17 ± 18% +0.2 0.32 ± 12% perf-profile.self.cycles-pp.skb_release_data
0.02 ±141% +0.2 0.18 ± 18% perf-profile.self.cycles-pp.__zone_watermark_ok
0.00 +0.2 0.17 ± 12% perf-profile.self.cycles-pp.__switch_to
0.28 ± 6% +0.2 0.47 ± 6% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.00 +0.3 0.30 ± 18% perf-profile.self.cycles-pp.__schedule
0.04 ± 72% +0.4 0.40 ± 21% perf-profile.self.cycles-pp.__skb_recv_udp
0.00 +0.7 0.69 ± 70% perf-profile.self.cycles-pp.poll_idle
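For readers checking the table by hand, the following is a minimal sketch of how the %change column appears to relate to the two per-commit means. It assumes that ordinary metrics use a relative change against the parent commit, and that perf-profile rows, which are already percentages of cycles, show an absolute difference in percentage points. The helper names pct_change and point_delta are hypothetical and are not part of lkp-tests; the input values are copied from the rows above.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from the parent-commit mean to the tested-commit mean, in percent."""
    return (new - base) / base * 100.0


def point_delta(base_pct: float, new_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - base_pct


# (metric, mean on 2009ed59ab, mean on 39907a939a), copied from the table above.
relative_rows = [
    ("netperf.Throughput_Mbps", 122291, 100120),
    ("netperf.workload", 70006621, 57314514),
    ("vmstat.system.cs", 1754, 121876),
]

for name, base, new in relative_rows:
    print(f"{pct_change(base, new):+9.1f}%  {name}")

# perf-profile rows are shares of cycles, so the middle column is a point delta.
print(f"{point_delta(24.78, 19.42):+9.1f}   perf-profile.children.cycles-pp.__skb_datagram_iter")
```

Under these assumptions the sketch gives -18.1% for throughput, +6848.5% for context switches, and -5.4 points for __skb_datagram_iter, matching the report. Small mismatches on other rows are possible because the table shows rounded means.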
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.17.0-rc5-00006-g39907a939a34" of type "text/plain" (162108 bytes)
View attachment "job-script" of type "text/plain" (8312 bytes)
View attachment "job.yaml" of type "text/plain" (5668 bytes)
View attachment "reproduce" of type "text/plain" (329 bytes)