Message-ID: <8e17a844-e98b-59b1-5a0e-669562b3178c@itcare.pl>
Date: Mon, 2 Dec 2019 11:09:29 +0100
From: Paweł Staszewski <pstaszewski@...are.pl>
To: David Ahern <dsahern@...il.com>, netdev@...r.kernel.org,
Jesper Dangaard Brouer <brouer@...hat.com>
Subject: Re: Linux kernel - 5.4.0+ (net-next from 27.11.2019) routing/network
performance
On 01.12.2019 at 17:05, David Ahern wrote:
> On 11/29/19 4:00 PM, Paweł Staszewski wrote:
>> As always - each year i need to summarize network performance for
>> routing applications like linux router on native Linux kernel (without
>> xdp/dpdk/vpp etc) :)
>>
> Do you keep past profiles? How does this profile (and traffic rates)
> compare to older kernels - e.g., 5.0 or 4.19?
>
>
Yes. For 4.19:
Max bandwidth was about 40-42 Gbit/s RX / 40-42 Gbit/s TX of
forwarded (routed) traffic.
After the "order-0 pages" patches, the max was 50 Gbit/s RX + 50 Gbit/s TX
(forwarding, max bandwidth).
(The current kernel almost doubled this.)
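As a rough check of the numbers above, here is a minimal sketch of the arithmetic. It assumes the 72 Gbit/s figure from the quoted perf top run for 5.4.0+ stands in for the current kernel; that is the rate at which the profile was taken, not necessarily the maximum. The function and constant names are made up for illustration.

```python
def gbit_to_gbyte_per_s(gbit_per_s):
    """Convert a line rate in Gbit/s to GB/s (8 bits per byte)."""
    return gbit_per_s / 8.0


def speedup(new_gbit, old_gbit):
    """Ratio of the new forwarding rate to the old one."""
    return new_gbit / old_gbit


# Per-direction forwarding rates in Gbit/s, taken from this thread.
BASELINE_4_19 = (40, 42)   # 4.19 before the "order-0 pages" patches
AFTER_ORDER0 = 50          # 4.19 with the "order-0 pages" patches
CURRENT_5_4 = 72           # assumption: rate of the quoted 5.4.0+ perf top run

low = speedup(CURRENT_5_4, BASELINE_4_19[1])
high = speedup(CURRENT_5_4, BASELINE_4_19[0])
print(f"5.4.0+ vs 4.19 baseline: {low:.2f}x to {high:.2f}x")
print(f"5.4.0+ vs order-0 pages: {speedup(CURRENT_5_4, AFTER_ORDER0):.2f}x")
print(f"100 Gbit/s one direction = {gbit_to_gbyte_per_s(100):.1f} GB/s")
```

Under that assumption, "almost doubled" corresponds to roughly 1.7x to 1.8x over the unpatched 4.19 baseline.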
Here is the old perf top (from kernel 4.19), before the "order-0 pages" patch:
PerfTop: 108490 irqs/sec kernel:99.6% exact: 0.0% [4000Hz cycles], (all, 56 CPUs)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
26.78% [kernel] [k] queued_spin_lock_slowpath
9.09% [kernel] [k] mlx5e_skb_from_cqe_linear
4.94% [kernel] [k] mlx5e_sq_xmit
3.63% [kernel] [k] memcpy_erms
3.30% [kernel] [k] fib_table_lookup
3.26% [kernel] [k] build_skb
2.41% [kernel] [k] mlx5e_poll_tx_cq
2.11% [kernel] [k] get_page_from_freelist
1.51% [kernel] [k] vlan_do_receive
1.51% [kernel] [k] _raw_spin_lock
1.43% [kernel] [k] __dev_queue_xmit
1.41% [kernel] [k] dev_gro_receive
1.34% [kernel] [k] mlx5e_poll_rx_cq
1.26% [kernel] [k] tcp_gro_receive
1.21% [kernel] [k] free_one_page
1.13% [kernel] [k] swiotlb_map_page
1.13% [kernel] [k] mlx5e_post_rx_wqes
1.05% [kernel] [k] pfifo_fast_dequeue
1.05% [kernel] [k] mlx5e_handle_rx_cqe
1.03% [kernel] [k] ip_finish_output2
1.02% [kernel] [k] ipt_do_table
0.96% [kernel] [k] inet_gro_receive
0.91% [kernel] [k] mlx5_eq_int
0.88% [kernel] [k] __slab_free.isra.79
0.86% [kernel] [k] __build_skb
0.84% [kernel] [k] page_frag_free
0.76% [kernel] [k] skb_release_data
0.75% [kernel] [k] __netif_receive_skb_core
0.75% [kernel] [k] irq_entries_start
0.71% [kernel] [k] ip_route_input_rcu
0.65% [kernel] [k] vlan_dev_hard_start_xmit
0.56% [kernel] [k] ip_forward
0.56% [kernel] [k] __memcpy
0.52% [kernel] [k] kmem_cache_alloc
0.52% [kernel] [k] kmem_cache_free_bulk
0.49% [kernel] [k] mlx5e_page_release
0.47% [kernel] [k] netif_skb_features
0.47% [kernel] [k] mlx5e_build_rx_skb
0.47% [kernel] [k] dev_hard_start_xmit
0.43% [kernel] [k] __page_pool_put_page
0.43% [kernel] [k] __netif_schedule
0.43% [kernel] [k] mlx5e_xmit
0.41% [kernel] [k] __qdisc_run
0.41% [kernel] [k] validate_xmit_skb.isra.142
0.41% [kernel] [k] swiotlb_unmap_page
0.40% [kernel] [k] inet_lookup_ifaddr_rcu
0.34% [kernel] [k] ip_rcv_core.isra.20.constprop.25
0.34% [kernel] [k] tcp4_gro_receive
0.29% [kernel] [k] _raw_spin_lock_irqsave
0.29% [kernel] [k] napi_consume_skb
0.29% [kernel] [k] skb_gro_receive
0.29% [kernel] [k] ___slab_alloc.isra.80
0.27% [kernel] [k] eth_type_trans
0.26% [kernel] [k] __free_pages_ok
0.26% [kernel] [k] __get_xps_queue_idx
0.24% [kernel] [k] _raw_spin_trylock
0.23% [kernel] [k] __local_bh_enable_ip
0.22% [kernel] [k] pfifo_fast_enqueue
0.21% [kernel] [k] tasklet_action_common.isra.21
0.21% [kernel] [k] sch_direct_xmit
0.21% [kernel] [k] skb_network_protocol
0.21% [kernel] [k] kmem_cache_free
0.20% [kernel] [k] netdev_pick_tx
0.18% [kernel] [k] napi_gro_complete
0.18% [kernel] [k] __sched_text_start
0.18% [kernel] [k] mlx5e_xdp_handle
0.17% [kernel] [k] ip_finish_output
0.16% [kernel] [k] napi_gro_flush
0.16% [kernel] [k] vlan_passthru_hard_header
0.16% [kernel] [k] skb_segment
0.15% [kernel] [k] __alloc_pages_nodemask
0.15% [kernel] [k] mlx5e_features_check
0.15% [kernel] [k] mlx5e_napi_poll
0.15% [kernel] [k] napi_gro_receive
0.14% [kernel] [k] fib_validate_source
0.14% [kernel] [k] _raw_spin_lock_irq
0.14% [kernel] [k] inet_gro_complete
0.14% [kernel] [k] get_partial_node.isra.78
0.13% [kernel] [k] napi_complete_done
0.13% [kernel] [k] ip_rcv_finish_core.isra.17
0.13% [kernel] [k] cmd_exec
And after the "order-0 pages" patch:
PerfTop: 104692 irqs/sec kernel:99.5% exact: 0.0% [4000Hz cycles], (all, 56 CPUs)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
9.06% [kernel] [k] mlx5e_skb_from_cqe_mpwrq_linear
6.43% [kernel] [k] tasklet_action_common.isra.21
5.68% [kernel] [k] fib_table_lookup
4.89% [kernel] [k] irq_entries_start
4.53% [kernel] [k] mlx5_eq_int
4.10% [kernel] [k] build_skb
3.39% [kernel] [k] mlx5e_poll_tx_cq
3.38% [kernel] [k] mlx5e_sq_xmit
2.73% [kernel] [k] mlx5e_poll_rx_cq
2.18% [kernel] [k] __dev_queue_xmit
2.13% [kernel] [k] vlan_do_receive
2.12% [kernel] [k] mlx5e_handle_rx_cqe_mpwrq
2.00% [kernel] [k] ip_finish_output2
1.87% [kernel] [k] mlx5e_post_rx_mpwqes
1.86% [kernel] [k] memcpy_erms
1.85% [kernel] [k] ipt_do_table
1.70% [kernel] [k] dev_gro_receive
1.39% [kernel] [k] __netif_receive_skb_core
1.31% [kernel] [k] inet_gro_receive
1.21% [kernel] [k] ip_route_input_rcu
1.21% [kernel] [k] tcp_gro_receive
1.13% [kernel] [k] _raw_spin_lock
1.08% [kernel] [k] __build_skb
1.06% [kernel] [k] kmem_cache_free_bulk
1.05% [kernel] [k] __softirqentry_text_start
1.03% [kernel] [k] vlan_dev_hard_start_xmit
0.98% [kernel] [k] pfifo_fast_dequeue
0.95% [kernel] [k] mlx5e_xmit
0.95% [kernel] [k] page_frag_free
0.88% [kernel] [k] ip_forward
0.81% [kernel] [k] dev_hard_start_xmit
0.78% [kernel] [k] rcu_irq_exit
0.77% [kernel] [k] netif_skb_features
0.72% [kernel] [k] napi_complete_done
0.72% [kernel] [k] kmem_cache_alloc
0.68% [kernel] [k] validate_xmit_skb.isra.142
0.66% [kernel] [k] ip_rcv_core.isra.20.constprop.25
0.58% [kernel] [k] swiotlb_map_page
0.57% [kernel] [k] __qdisc_run
0.56% [kernel] [k] tasklet_action
0.54% [kernel] [k] __get_xps_queue_idx
0.54% [kernel] [k] inet_lookup_ifaddr_rcu
0.50% [kernel] [k] tcp4_gro_receive
0.49% [kernel] [k] skb_release_data
0.47% [kernel] [k] eth_type_trans
0.40% [kernel] [k] sch_direct_xmit
0.40% [kernel] [k] net_rx_action
0.39% [kernel] [k] __local_bh_enable_ip
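The two 4.19 profiles above can also be compared mechanically. Below is a minimal sketch that parses perf top lines of the form shown here and lists the symbols whose share changed the most. The helper names (parse_perf_top, diff_profiles) are made up for illustration and are not part of perf, and only a few lines from each dump are inlined. A symbol missing from one listing is treated as 0.00%, which is only an approximation because perf top truncates its output.

```python
import re

# Matches lines such as "26.78% [kernel] [k] queued_spin_lock_slowpath".
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+\[\S+\]\s+\[\S\]\s+(\S+)\s*$")


def parse_perf_top(text):
    """Return {symbol: percent} for every sample line in a perf top dump."""
    profile = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            profile[m.group(2)] = float(m.group(1))
    return profile


def diff_profiles(before, after):
    """Return (symbol, before%, after%, delta) rows, largest change first."""
    rows = []
    for sym in set(before) | set(after):
        b = before.get(sym, 0.0)  # absent symbol counted as 0 (approximation)
        a = after.get(sym, 0.0)
        rows.append((sym, b, a, round(a - b, 2)))
    return sorted(rows, key=lambda r: abs(r[3]), reverse=True)


# A few lines copied from the "before" and "after" dumps in this message.
before_text = """
26.78% [kernel] [k] queued_spin_lock_slowpath
 9.09% [kernel] [k] mlx5e_skb_from_cqe_linear
 3.30% [kernel] [k] fib_table_lookup
 3.26% [kernel] [k] build_skb
"""
after_text = """
 9.06% [kernel] [k] mlx5e_skb_from_cqe_mpwrq_linear
 6.43% [kernel] [k] tasklet_action_common.isra.21
 5.68% [kernel] [k] fib_table_lookup
 4.10% [kernel] [k] build_skb
"""

before = parse_perf_top(before_text)
after = parse_perf_top(after_text)
for sym, b, a, delta in diff_profiles(before, after):
    print(f"{delta:+7.2f}  {b:6.2f} -> {a:6.2f}  {sym}")
```

On these samples the largest change is queued_spin_lock_slowpath, which no longer appears in the listing after the patch.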
>> HW setup:
>>
>> Server (Supermicro SYS-1019P-WTR)
>>
>> 1x Intel 6146
>>
>> 2x Mellanox connect-x 5 (100G) (installed in two different x16 pcie
>> gen3.1 slots)
>>
>> 6x 8GB DDR4 2666 (it really matters cause 100G is about 12.5GB/s of
>> memory bandwidth one direction)
>>
>>
>> And here it is:
>>
>> perf top at 72Gbit/s RX and 72Gbit/s TX (at same time)
>>
>> PerfTop: 91202 irqs/sec kernel:99.7% exact: 100.0% [4000Hz
>> cycles:ppp], (all, 24 CPUs)
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>
>> 7.56% [kernel] [k] __dev_queue_xmit
>> 5.27% [kernel] [k] build_skb
>> 4.41% [kernel] [k] rr_transmit
>> 4.17% [kernel] [k] fib_table_lookup
>> 3.83% [kernel] [k] mlx5e_skb_from_cqe_mpwrq_linear
>> 3.30% [kernel] [k] mlx5e_sq_xmit
>> 3.14% [kernel] [k] __netif_receive_skb_core
>> 2.48% [kernel] [k] netif_skb_features
>> 2.36% [kernel] [k] _raw_spin_trylock
>> 2.27% [kernel] [k] dev_hard_start_xmit
--
Paweł Staszewski