Message-ID: <20230615073153.GA110814@ziqianlu-dell>
Date: Thu, 15 Jun 2023 15:31:53 +0800
From: Aaron Lu <aaron.lu@...el.com>
To: David Vernet <void@...ifault.com>
CC: Peter Zijlstra <peterz@...radead.org>,
<linux-kernel@...r.kernel.org>, <mingo@...hat.com>,
<juri.lelli@...hat.com>, <vincent.guittot@...aro.org>,
<rostedt@...dmis.org>, <dietmar.eggemann@....com>,
<bsegall@...gle.com>, <mgorman@...e.de>, <bristot@...hat.com>,
<vschneid@...hat.com>, <joshdon@...gle.com>,
<roman.gushchin@...ux.dev>, <tj@...nel.org>, <kernel-team@...a.com>
Subject: Re: [RFC PATCH 3/3] sched: Implement shared wakequeue in CFS
On Thu, Jun 15, 2023 at 12:49:17PM +0800, Aaron Lu wrote:
> I'll see if I can find a smaller machine and give it a run there too.
Found a Skylake with 18 cores/36 threads per socket (each socket is one
LLC), and with netperf the contention is still serious.
"
$ netserver
$ sudo sh -c "echo SWQUEUE > /sys/kernel/debug/sched/features"
$ for i in `seq 72`; do netperf -l 60 -n 72 -6 -t UDP_RR & done
"
53.61% 53.61% [kernel.vmlinux] [k] native_queued_spin_lock_slowpath - -
|
|--27.93%--sendto
| entry_SYSCALL_64
| do_syscall_64
| |
| --27.93%--__x64_sys_sendto
| __sys_sendto
| sock_sendmsg
| inet6_sendmsg
| udpv6_sendmsg
| udp_v6_send_skb
| ip6_send_skb
| ip6_local_out
| ip6_output
| ip6_finish_output
| ip6_finish_output2
| __dev_queue_xmit
| __local_bh_enable_ip
| do_softirq.part.0
| __do_softirq
| net_rx_action
| __napi_poll
| process_backlog
| __netif_receive_skb
| __netif_receive_skb_one_core
| ipv6_rcv
| ip6_input
| ip6_input_finish
| ip6_protocol_deliver_rcu
| udpv6_rcv
| __udp6_lib_rcv
| udp6_unicast_rcv_skb
| udpv6_queue_rcv_skb
| udpv6_queue_rcv_one_skb
| __udp_enqueue_schedule_skb
| sock_def_readable
| __wake_up_sync_key
| __wake_up_common_lock
| |
| --27.85%--__wake_up_common
| receiver_wake_function
| autoremove_wake_function
| default_wake_function
| try_to_wake_up
| |
| --27.85%--ttwu_do_activate
| enqueue_task
| enqueue_task_fair
| |
| --27.85%--_raw_spin_lock_irqsave
| |
| --27.85%--native_queued_spin_lock_slowpath
|
--25.67%--recvfrom
entry_SYSCALL_64
do_syscall_64
__x64_sys_recvfrom
__sys_recvfrom
sock_recvmsg
inet6_recvmsg
udpv6_recvmsg
__skb_recv_udp
|
--25.67%--__skb_wait_for_more_packets
schedule_timeout
schedule
__schedule
|
--25.66%--pick_next_task_fair
|
--25.65%--swqueue_remove_task
|
--25.65%--_raw_spin_lock_irqsave
|
--25.65%--native_queued_spin_lock_slowpath
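
FWIW, the way I read the patch, both legs above end up serializing on
the same per-LLC swqueue spinlock: the sendto side takes it when the
wakeup enqueues the task onto the shared wakequeue via
enqueue_task_fair(), and the recvfrom side takes it again when
pick_next_task_fair() -> swqueue_remove_task() pulls a task off.
Roughly like below (a simplified sketch only; names and details are
approximate, not the actual patch code):
"
/*
 * Simplified sketch, not the actual patch code: one queue + one lock
 * shared by every CPU in the LLC.
 */
struct swqueue {
	struct list_head	list;
	spinlock_t		lock;
} ____cacheline_aligned;

/* Wakeup side: try_to_wake_up() -> enqueue_task_fair() */
static void swqueue_enqueue(struct rq *rq, struct task_struct *p)
{
	struct swqueue *swqueue = rq_swqueue(rq);	/* per-LLC instance */
	unsigned long flags;

	spin_lock_irqsave(&swqueue->lock, flags);
	list_add_tail(&p->swqueue_node, &swqueue->list);
	spin_unlock_irqrestore(&swqueue->lock, flags);
}

/* Pick side: __schedule() -> pick_next_task_fair() */
static void swqueue_remove_task(struct task_struct *p)
{
	struct swqueue *swqueue = rq_swqueue(task_rq(p));
	unsigned long flags;

	spin_lock_irqsave(&swqueue->lock, flags);	/* same lock as above */
	list_del_init(&p->swqueue_node);
	spin_unlock_irqrestore(&swqueue->lock, flags);
}
"
With all the netperf/netserver threads in an LLC doing UDP_RR, every
wakeup and every pick hits that one lock, which is where the qspinlock
slowpath cycles above come from.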
I didn't aggregate the throughput (Trans. Rate per sec) from all these
clients, but a glimpse at the results showed that each client's
throughput dropped from 4xxxx (NO_SWQUEUE) to 2xxxx (SWQUEUE).
Thanks,
Aaron