Message-ID: <fa165c95-6a1f-c50e-cfa5-30fda02ca9d6@itcare.pl>
Date: Mon, 12 Nov 2018 20:19:01 +0100
From: Paweł Staszewski <pstaszewski@...are.pl>
To: Jesper Dangaard Brouer <brouer@...hat.com>
Cc: Saeed Mahameed <saeedm@...lanox.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Kernel 4.19 network performance - forwarding/routing normal users
traffic
On 11.11.2018 at 09:56, Jesper Dangaard Brouer wrote:
> On Sat, 10 Nov 2018 22:53:53 +0100 Paweł Staszewski <pstaszewski@...are.pl> wrote:
>
>> Now I'm messing with ring configuration for ConnectX-5 NICs.
>> And after reading that paper:
>> https://netdevconf.org/2.1/slides/apr6/network-performance/04-amir-RX_and_TX_bulking_v2.pdf
>>
> Do notice that some of the ideas in that slide deck were never
> implemented. But they are still on my todo list ;-).
>
> Notice how it shows that TX bulking is very important, but based on
> your ethtool_stats.pl output, I can see that not much TX bulking is
> happening in your case. This is indicated via the xmit_more counters.
>
> Ethtool(enp175s0) stat: 2630 ( 2,630) <= tx_xmit_more /sec
> Ethtool(enp175s0) stat: 4956995 ( 4,956,995) <= tx_packets /sec
>
> And the per-queue levels are also available:
>
> Ethtool(enp175s0) stat: 184845 ( 184,845) <= tx7_packets /sec
> Ethtool(enp175s0) stat: 78 ( 78) <= tx7_xmit_more /sec
>
> This means that you are issuing too many doorbells to the NIC hardware
> at TX time, which I worry could be what causes the NIC and PCIe hardware
> not to operate at optimal speeds.
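
(For context, the xmit_more mechanism referenced above is the driver-side
doorbell-batching idiom sketched below. This is only a minimal sketch of
how a 4.19-era ndo_start_xmit consults skb->xmit_more; the my_* ring
helpers are hypothetical placeholders, not the actual mlx5 code.)

#include <linux/netdevice.h>
#include <linux/skbuff.h>

static netdev_tx_t my_ndo_start_xmit(struct sk_buff *skb,
                                     struct net_device *dev)
{
        struct my_tx_ring *ring = my_select_ring(dev, skb);   /* hypothetical */
        struct netdev_queue *txq =
                netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));

        my_post_tx_descriptor(ring, skb);                     /* hypothetical */

        /* Ring the doorbell (an MMIO write across PCIe) only when the
         * stack says nothing is queued behind this skb, or the queue is
         * stopped.  Every skb that skips the doorbell here is what shows
         * up in the tx_xmit_more counter. */
        if (!skb->xmit_more || netif_xmit_stopped(txq))
                my_ring_doorbell(ring);                       /* hypothetical */

        return NETDEV_TX_OK;
}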
After tuning coalescing/ring parameters a little with ethtool (example
commands below), I reached today:
bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
input: /proc/net/dev type: rate
  iface                      Rx            Tx         Total
==============================================================================
  enp175s0:           50.68 Gb/s    21.53 Gb/s    72.20 Gb/s
  enp216s0:           21.62 Gb/s    50.81 Gb/s    72.42 Gb/s
------------------------------------------------------------------------------
  total:              72.30 Gb/s    72.33 Gb/s   144.63 Gb/s
And still no packet loss (ICMP side-to-side test every 100 ms).
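
For reference, the ring half of that tuning is done with ethtool -G. The
values below are placeholders for illustration, not the ones actually used:

  ethtool -G enp175s0 rx 4096 tx 4096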
Below is the perf top output:
   PerfTop: 104692 irqs/sec  kernel:99.5%  exact: 0.0%  [4000Hz cycles],  (all, 56 CPUs)
-------------------------------------------------------------------------------
9.06% [kernel] [k] mlx5e_skb_from_cqe_mpwrq_linear
6.43% [kernel] [k] tasklet_action_common.isra.21
5.68% [kernel] [k] fib_table_lookup
4.89% [kernel] [k] irq_entries_start
4.53% [kernel] [k] mlx5_eq_int
4.10% [kernel] [k] build_skb
3.39% [kernel] [k] mlx5e_poll_tx_cq
3.38% [kernel] [k] mlx5e_sq_xmit
2.73% [kernel] [k] mlx5e_poll_rx_cq
2.18% [kernel] [k] __dev_queue_xmit
2.13% [kernel] [k] vlan_do_receive
2.12% [kernel] [k] mlx5e_handle_rx_cqe_mpwrq
2.00% [kernel] [k] ip_finish_output2
1.87% [kernel] [k] mlx5e_post_rx_mpwqes
1.86% [kernel] [k] memcpy_erms
1.85% [kernel] [k] ipt_do_table
1.70% [kernel] [k] dev_gro_receive
1.39% [kernel] [k] __netif_receive_skb_core
1.31% [kernel] [k] inet_gro_receive
1.21% [kernel] [k] ip_route_input_rcu
1.21% [kernel] [k] tcp_gro_receive
1.13% [kernel] [k] _raw_spin_lock
1.08% [kernel] [k] __build_skb
1.06% [kernel] [k] kmem_cache_free_bulk
1.05% [kernel] [k] __softirqentry_text_start
1.03% [kernel] [k] vlan_dev_hard_start_xmit
0.98% [kernel] [k] pfifo_fast_dequeue
0.95% [kernel] [k] mlx5e_xmit
0.95% [kernel] [k] page_frag_free
0.88% [kernel] [k] ip_forward
0.81% [kernel] [k] dev_hard_start_xmit
0.78% [kernel] [k] rcu_irq_exit
0.77% [kernel] [k] netif_skb_features
0.72% [kernel] [k] napi_complete_done
0.72% [kernel] [k] kmem_cache_alloc
0.68% [kernel] [k] validate_xmit_skb.isra.142
0.66% [kernel] [k] ip_rcv_core.isra.20.constprop.25
0.58% [kernel] [k] swiotlb_map_page
0.57% [kernel] [k] __qdisc_run
0.56% [kernel] [k] tasklet_action
0.54% [kernel] [k] __get_xps_queue_idx
0.54% [kernel] [k] inet_lookup_ifaddr_rcu
0.50% [kernel] [k] tcp4_gro_receive
0.49% [kernel] [k] skb_release_data
0.47% [kernel] [k] eth_type_trans
0.40% [kernel] [k] sch_direct_xmit
0.40% [kernel] [k] net_rx_action
0.39% [kernel] [k] __local_bh_enable_ip
And perf record/report:
https://ufile.io/zguq0
So now I know what was causing CPU load for some processes like:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 2913 root      20   0       0      0      0 I  10.3   0.0   6:58.29 kworker/u112:1-
    7 root      20   0       0      0      0 I   8.6   0.0   6:17.18 kworker/u112:0-
10289 root      20   0       0      0      0 I   6.6   0.0   6:33.90 kworker/u112:4-
 2939 root      20   0       0      0      0 R   3.6   0.0   7:37.68 kworker/u112:2-
After disabling adaptive TX coalescing, all of these processes are gone.
Load average drops from 40 to 1.
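
For reference (the exact command isn't shown above), adaptive TX coalescing
is turned off with something like:

  ethtool -C enp175s0 adaptive-tx off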
Current coalescing settings:
ethtool -c enp175s0
Coalesce parameters for enp175s0:
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0
dmac: 32548
rx-usecs: 24
rx-frames: 256
rx-usecs-irq: 0
rx-frames-irq: 0
tx-usecs: 0
tx-frames: 64
tx-usecs-irq: 0
tx-frames-irq: 0
rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0
rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0
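
Reproducing the whole coalescing configuration above should be roughly a
single -C invocation (untested here; some driver versions may reject
tx-usecs 0):

  ethtool -C enp175s0 adaptive-rx off adaptive-tx off rx-usecs 24 \
          rx-frames 256 tx-usecs 0 tx-frames 64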
And currently, at those traffic levels, there is no packet loss (CPU is
~60% avg. across all 28 cores).