Message-ID: <e6ece370-bf62-5d36-0417-779d2345fc8d@itcare.pl>
Date: Fri, 9 Nov 2018 23:20:38 +0100
From: Paweł Staszewski <pstaszewski@...are.pl>
To: Saeed Mahameed <saeedm@...lanox.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Jesper Dangaard Brouer <brouer@...hat.com>
Subject: Re: Kernel 4.19 network performance - forwarding/routing normal users
traffic
On 08.11.2018 at 20:12, Paweł Staszewski wrote:
> CPU load is lower than for the ConnectX-4 - but it looks like the
> bandwidth limit is the same :)
> But also after reaching 60Gbit/60Gbit:
>
> bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
> input: /proc/net/dev type: rate
>   iface                     Rx                  Tx               Total
> ==============================================================================
>   enp175s0:          45.09 Gb/s          15.09 Gb/s          60.18 Gb/s
>   enp216s0:          15.14 Gb/s          45.19 Gb/s          60.33 Gb/s
> ------------------------------------------------------------------------------
>   total:             60.45 Gb/s          60.48 Gb/s         120.93 Gb/s
Today it reached 65/65 Gbit/s.
But starting from 60 Gbit/s RX / 60 Gbit/s TX the NICs start to drop packets
(at about 50% CPU load on all 28 cores) - so there is still CPU headroom to
use :).
So I checked other stats.
softnet_stat shows on average about 1k squeezed per second:
cpu total dropped squeezed collision rps flow_limit
0 18554 0 1 0 0 0
1 16728 0 1 0 0 0
2 18033 0 1 0 0 0
3 17757 0 1 0 0 0
4 18861 0 0 0 0 0
5 0 0 1 0 0 0
6 2 0 1 0 0 0
7 0 0 1 0 0 0
8 0 0 0 0 0 0
9 0 0 1 0 0 0
10 0 0 0 0 0 0
11 0 0 1 0 0 0
12 50 0 1 0 0 0
13 257 0 0 0 0 0
14 3629115363 0 3353259 0 0 0
15 255167835 0 3138271 0 0 0
16 4240101961 0 3036130 0 0 0
17 599810018 0 3072169 0 0 0
18 432796524 0 3034191 0 0 0
19 41803906 0 3037405 0 0 0
20 900382666 0 3112294 0 0 0
21 620926085 0 3086009 0 0 0
22 41861198 0 3023142 0 0 0
23 4090425574 0 2990412 0 0 0
24 4264870218 0 3010272 0 0 0
25 141401811 0 3027153 0 0 0
26 104155188 0 3051251 0 0 0
27 4261258691 0 3039765 0 0 0
28 4 0 1 0 0 0
29 4 0 0 0 0 0
30 0 0 1 0 0 0
31 0 0 0 0 0 0
32 3 0 1 0 0 0
33 1 0 1 0 0 0
34 0 0 1 0 0 0
35 0 0 0 0 0 0
36 0 0 1 0 0 0
37 0 0 1 0 0 0
38 0 0 1 0 0 0
39 0 0 1 0 0 0
40 0 0 0 0 0 0
41 0 0 1 0 0 0
42 299758202 0 3139693 0 0 0
43 4254727979 0 3103577 0 0 0
44 1959555543 0 2554885 0 0 0
45 1675702723 0 2513481 0 0 0
46 1908435503 0 2519698 0 0 0
47 1877799710 0 2537768 0 0 0
48 2384274076 0 2584673 0 0 0
49 2598104878 0 2593616 0 0 0
50 1897566829 0 2530857 0 0 0
51 1712741629 0 2489089 0 0 0
52 1704033648 0 2495892 0 0 0
53 1636781820 0 2499783 0 0 0
54 1861997734 0 2541060 0 0 0
55 2113521616 0 2555673 0 0 0
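(For reference, the squeezed column above is the cumulative time_squeeze
counter; the raw numbers live in /proc/net/softnet_stat, where the third
hex field of each per-CPU line is time_squeeze. A rough watch loop -
assuming GNU awk for strtonum - could look like:

  while sleep 1; do
    awk '{ printf "cpu%-3d squeezed=%d\n", NR-1, strtonum("0x" $3) }' \
        /proc/net/softnet_stat
    echo ---
  done

Diffing consecutive samples gives the per-second rate quoted above.)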
So I raised the netdev backlog and budget to really high values:
524288 for netdev_budget and the same for the backlog.
This raised softirqs from about 600k/sec to 800k/sec for NET_TX/NET_RX,
but after these changes I see fewer packet drops.
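(For completeness, assuming these were set through sysctl, the equivalent
commands for the usual net.core knobs would be:

  sysctl -w net.core.netdev_budget=524288
  sysctl -w net.core.netdev_max_backlog=524288

On 4.19 there is also net.core.netdev_budget_usecs, which additionally caps
how long a single NET_RX softirq round may run.)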
Below is the perf top output at the maximum traffic reached:
   PerfTop:   72230 irqs/sec  kernel:99.4%  exact:  0.0%  [4000Hz cycles],  (all, 56 CPUs)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
12.62% [kernel] [k] mlx5e_skb_from_cqe_mpwrq_linear
8.44% [kernel] [k] mlx5e_sq_xmit
6.69% [kernel] [k] build_skb
5.21% [kernel] [k] fib_table_lookup
3.54% [kernel] [k] memcpy_erms
3.20% [kernel] [k] mlx5e_poll_rx_cq
2.25% [kernel] [k] vlan_do_receive
2.20% [kernel] [k] mlx5e_post_rx_mpwqes
2.02% [kernel] [k] mlx5e_handle_rx_cqe_mpwrq
1.95% [kernel] [k] __dev_queue_xmit
1.83% [kernel] [k] dev_gro_receive
1.79% [kernel] [k] tcp_gro_receive
1.73% [kernel] [k] ip_finish_output2
1.63% [kernel] [k] mlx5e_poll_tx_cq
1.49% [kernel] [k] ipt_do_table
1.38% [kernel] [k] inet_gro_receive
1.31% [kernel] [k] __netif_receive_skb_core
1.30% [kernel] [k] _raw_spin_lock
1.28% [kernel] [k] mlx5_eq_int
1.24% [kernel] [k] irq_entries_start
1.19% [kernel] [k] __build_skb
1.15% [kernel] [k] swiotlb_map_page
1.02% [kernel] [k] vlan_dev_hard_start_xmit
0.94% [kernel] [k] pfifo_fast_dequeue
0.92% [kernel] [k] ip_route_input_rcu
0.86% [kernel] [k] kmem_cache_alloc
0.80% [kernel] [k] mlx5e_xmit
0.79% [kernel] [k] dev_hard_start_xmit
0.78% [kernel] [k] _raw_spin_lock_irqsave
0.74% [kernel] [k] ip_forward
0.72% [kernel] [k] tasklet_action_common.isra.21
0.68% [kernel] [k] pfifo_fast_enqueue
0.67% [kernel] [k] netif_skb_features
0.66% [kernel] [k] skb_segment
0.60% [kernel] [k] skb_gro_receive
0.56% [kernel] [k] validate_xmit_skb.isra.142
0.53% [kernel] [k] skb_release_data
0.51% [kernel] [k] mlx5e_page_release
0.51% [kernel] [k] ip_rcv_core.isra.20.constprop.25
0.51% [kernel] [k] __qdisc_run
0.50% [kernel] [k] tcp4_gro_receive
0.49% [kernel] [k] page_frag_free
0.46% [kernel] [k] kmem_cache_free_bulk
0.43% [kernel] [k] kmem_cache_free
0.42% [kernel] [k] try_to_wake_up
0.39% [kernel] [k] _raw_spin_lock_irq
0.39% [kernel] [k] find_busiest_group
0.37% [kernel] [k] __memcpy
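(The profile above is plain system-wide perf top - the exact invocation is
not important, but judging by the header it was something along the lines
of:

  perf top -e cycles -F 4000

i.e. sampling cycles at 4 kHz across all 56 CPUs.)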
Remember, these tests are now on two separate ConnectX-5 NICs, each
connected to its own PCIe x16 gen 3.0 slot.
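(If anyone wants to double-check the PCIe links, the negotiated width/speed
can be read from the LnkSta line of lspci -vv; the bus addresses below are
only my guess derived from the enp175s0/enp216s0 names, i.e. 175 = 0xaf and
216 = 0xd8:

  lspci -s af:00.0 -vv | grep -i lnksta
  lspci -s d8:00.0 -vv | grep -i lnksta

Both should report Speed 8GT/s, Width x16 for full gen 3.0 x16 slots.)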