Date:   Fri, 30 Mar 2018 17:02:37 -0700
From:   Saeed Mahameed <saeedm@...lanox.com>
To:     "David S. Miller" <davem@...emloft.net>
Cc:     netdev@...r.kernel.org, Tariq Toukan <tariqt@...lanox.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Saeed Mahameed <saeedm@...lanox.com>
Subject: [pull request][net-next 00/15] Mellanox, mlx5 updates 2018-03-30

Hi Dave,

This series contains updates to mlx5 core and mlx5e netdev drivers.
The main highlight of this series is the RX optimizations for striding RQ path,
introduced by Tariq.

For more information please see tag log below.

Please pull and let me know if there's any problem.

Thanks,
Saeed.

---

The following changes since commit c0b6edef0bf0e33c12eaf80c676ff09def011518:

  tc-testing: Add newline when writing test case files (2018-03-30 14:22:51 -0400)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2018-03-30

for you to fetch changes up to ab966d7e4ff988a48b3ad72e7abf903aa840afd1:

  net/mlx5e: RX, Recycle buffer of UMR WQEs (2018-03-30 16:55:07 -0700)

----------------------------------------------------------------
mlx5-updates-2018-03-30

This series contains updates to mlx5 core and mlx5e netdev drivers.
The main highlight of this series is the RX optimizations for striding RQ path,
introduced by Tariq.

The first four patches are trivial misc cleanups:
 - Spelling mistake fix
 - Dead code removal
 - Warning messages

RX optimizations for striding RQ:

1) RX refactoring, cleanups and micro optimizations
   - Simplify MTU calculations; this obsoletes some WQE-to-packet translation
     functions and helps delete ~60 LOC.
   - Do not busy-wait for a pending UMR completion.
   - Post the new values of the UMR WQE inline, instead of using a data pointer.
   - Use pre-initialized structures to save calculations in the datapath.

2) Use linear SKB ("build_skb") in Striding RQ. Using a linear SKB has many advantages:
    - Saves a memcpy of the headers.
    - No page-boundary checks in datapath.
    - No filler CQEs.
    - Significantly smaller CQ.
    - SKB data resides contiguously in the linear part, rather than being
      split into a small amount (linear part) and a large amount (fragment).
      This saves datapath cycles in the driver and improves utilization
      of SKB fragments in GRO.
    - The fragments of a resulting GRO SKB follow the IP forwarding
      assumption of equal-size fragments.

    Implementation details:
    HW writes the packets to the beginning of a stride,
    i.e. does not keep headroom. To overcome this we make sure we can
    extend backwards and use the last bytes of stride i-1.
    Extra care is needed for stride 0 as it has no preceding stride.
    We make sure headroom bytes are available by shifting the buffer
    pointer passed to HW by headroom bytes.
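
The headroom scheme described above can be sketched as a small offset model. This is an illustration only, not the driver's code: the function names and the example values (2K stride, 64-byte headroom) are hypothetical, each packet is assumed to fit in one stride, and the tailroom a real driver must also leave for struct skb_shared_info is not modeled.

```python
def linear_skb_layout(stride_size, headroom, num_strides):
    """Model where each packet's headroom and data start inside one buffer.

    All returned values are byte offsets from the start of the buffer.
    Hypothetical helper for illustration; not taken from the mlx5e driver.
    """
    if not 0 <= headroom < stride_size:
        raise ValueError("headroom must be smaller than the stride size")

    # The pointer handed to HW is shifted forward by `headroom` bytes, so
    # even stride 0 has `headroom` bytes available in front of it.
    hw_base = headroom

    layout = []
    for i in range(num_strides):
        data_start = hw_base + i * stride_size  # HW writes at the stride start
        headroom_start = data_start - headroom  # last bytes of stride i-1
        layout.append((headroom_start, data_start))
    return hw_base, layout


def max_packet_len(stride_size, headroom):
    """Largest packet that leaves the next stride's headroom untouched."""
    return stride_size - headroom
```

With a 2048-byte stride and 64 bytes of headroom, stride 1 starts at offset 2112 and borrows bytes 2048..2111 (the tail of stride 0) as its headroom, while stride 0 uses bytes 0..63 that exist only because of the pointer shift.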

    This configuration now becomes default, whenever capable.
    Of course, this implies turning LRO off.

    Performance testing:
    ConnectX-5, single core, single RX ring, default MTU.

    UDP packet rate, early drop in TC layer:

    --------------------------------------------
    | pkt size | before    | after     | ratio |
    --------------------------------------------
    | 1500byte | 4.65 Mpps | 5.96 Mpps | 1.28x |
    |  500byte | 5.23 Mpps | 5.97 Mpps | 1.14x |
    |   64byte | 5.94 Mpps | 5.96 Mpps | 1.00x |
    --------------------------------------------

    TCP streams: ~20% gain
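
As a sanity check, the ratio column of the UDP table can be recomputed from the before/after rates. The snippet below is a minimal sketch; the dictionary and function names are made up for this example, and the numbers are copied from the table.

```python
# Packet size (bytes) -> (before, after) packet rate in Mpps, from the table.
UDP_RATES_MPPS = {
    1500: (4.65, 5.96),
    500: (5.23, 5.97),
    64: (5.94, 5.96),
}


def speedup(before, after):
    """Ratio of the new packet rate to the old one."""
    return after / before


def format_ratios(rates):
    """Return the ratio column, formatted like the table (two decimals)."""
    return {size: f"{speedup(b, a):.2f}x" for size, (b, a) in rates.items()}
```

The recomputed values agree with the table: the gain grows with packet size, and at 64 bytes the rate is essentially unchanged.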

3) Support XDP over Striding RQ:
    Now that linear SKB is supported over Striding RQ,
    we can support XDP by setting stride size to PAGE_SIZE
    and headroom to XDP_PACKET_HEADROOM.

    Striding RQ is capable of a higher packet-rate than
    conventional RQ.

    Performance testing:
    ConnectX-5, 24 rings, default MTU.
    CQE compression ON (to reduce completions BW in PCI).

    XDP_DROP packet rate:
    --------------------------------------------------
    | pkt size | XDP rate   | 100GbE linerate | pct% |
    --------------------------------------------------
    |   64byte | 126.2 Mpps |      148.0 Mpps |  85% |
    |  128byte |  80.0 Mpps |       84.8 Mpps |  94% |
    |  256byte |  42.7 Mpps |       42.7 Mpps | 100% |
    |  512byte |  23.4 Mpps |       23.4 Mpps | 100% |
    --------------------------------------------------
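
The percentage column can be reproduced from the two rate columns, and the aggregate rate can be turned into a rough per-ring average. This is a sketch with hypothetical names; the per-ring figure assumes traffic is spread evenly over the 24 rings, which the results above do not state.

```python
# Packet size (bytes) -> (measured XDP_DROP rate, 100GbE line rate) in Mpps,
# copied from the table above.
XDP_DROP_MPPS = {
    64: (126.2, 148.0),
    128: (80.0, 84.8),
    256: (42.7, 42.7),
    512: (23.4, 23.4),
}
NUM_RINGS = 24  # from the test setup


def pct_of_line_rate(rate, line_rate):
    """Measured rate as a whole-number percentage of line rate."""
    return round(100 * rate / line_rate)


def per_ring_mpps(rate, rings=NUM_RINGS):
    """Average rate per ring, assuming an even spread (an assumption)."""
    return rate / rings
```

Under the even-spread assumption, the 64-byte result corresponds to roughly 5.26 Mpps per ring.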

4) Remove mlx5 page_ref bulking in Striding RQ and use page_ref_inc only when needed.
   Without this bulking, we have:
    - no atomic ops on WQE allocation or free
    - one atomic op per SKB
    - In the default MTU configuration (1500, stride size is 2K),
      the non-bulking method executes 2 atomic ops, as before.
    - For larger MTUs with a stride size of 4K, the non-bulking method
      executes only a single op.
    - For XDP (stride size of 4K, no SKBs), the non-bulking method has no
      atomic ops per packet at all.

    Performance testing:
    ConnectX-5, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.

    Single core packet rate (64 bytes).

    Early drop in TC: no degradation.

    XDP_DROP:
    before: 14,270,188 pps
    after:  20,503,603 pps, 43% improvement.
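
The XDP_DROP numbers can also be read as a per-packet cycle budget. The sketch below assumes the core runs at the nominal 2.50 GHz of the E5-2680 v3 (turbo and frequency scaling are ignored, which is an assumption), and the helper names are hypothetical.

```python
CPU_HZ = 2.5e9  # nominal clock of the test CPU; actual frequency may differ

PPS_BEFORE = 14_270_188
PPS_AFTER = 20_503_603


def cycles_per_packet(pps, cpu_hz=CPU_HZ):
    """Average CPU cycles available per packet at a given packet rate."""
    return cpu_hz / pps


def relative_gain(before, after):
    """Fractional improvement of `after` over `before`."""
    return (after - before) / before
```

Under that assumption the per-packet cost drops from roughly 175 cycles to roughly 122 cycles, and the relative gain comes out between 43% and 44%, consistent with the figure quoted above.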

Thanks,
saeed.

----------------------------------------------------------------
Alaa Hleihel (1):
      net/mlx5: Change teardown with force mode failure message to warning

Saeed Mahameed (2):
      net/mlx5e: Use eq ptr from cq
      net/mlx5: Eliminate query xsrq dead code

Talat Batheesh (1):
      net/mlx5e: IPoIB, Fix spelling mistake

Tariq Toukan (11):
      net/mlx5e: Save MTU in channels params
      net/mlx5e: Derive Striding RQ size from MTU
      net/mlx5e: Code movements in RX UMR WQE post
      net/mlx5e: Do not busy-wait for UMR completion in Striding RQ
      net/mlx5e: Use inline MTTs in UMR WQEs
      net/mlx5e: Use linear SKB in Striding RQ
      net/mlx5e: Refactor RQ XDP_TX indication
      net/mlx5e: Support XDP over Striding RQ
      net/mlx5e: Remove page_ref bulking in Striding RQ
      net/mlx5e: Keep single pre-initialized UMR WQE per RQ
      net/mlx5e: RX, Recycle buffer of UMR WQEs

 drivers/net/ethernet/mellanox/mlx5/core/en.h       | 115 ++++----
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |  79 +-----
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 310 +++++++++++----------
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |   5 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 284 ++++++++++---------
 drivers/net/ethernet/mellanox/mlx5/core/fw.c       |   2 +-
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  |  27 +-
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/transobj.c |  21 --
 include/linux/mlx5/device.h                        |   3 +
 include/linux/mlx5/mlx5_ifc.h                      |   7 +-
 include/linux/mlx5/transobj.h                      |   1 -
 12 files changed, 409 insertions(+), 447 deletions(-)
