netdev - Re: [net-next 02/10] net/mlx5e: Refactor xmit functions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CA+FuTSdNm1+sTwM-gikcHG6qyA9jXYqVtd3B0mrhir9F-mrrfA@mail.gmail.com>
Date:   Tue, 8 Sep 2020 11:08:28 +0200
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Maxim Mikityanskiy <maximmi@...dia.com>
Cc:     Saeed Mahameed <saeedm@...dia.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Network Development <netdev@...r.kernel.org>,
        Maxim Mikityanskiy <maximmi@...lanox.com>
Subject: Re: [net-next 02/10] net/mlx5e: Refactor xmit functions

On Tue, Sep 8, 2020 at 10:59 AM Maxim Mikityanskiy <maximmi@...dia.com> wrote:
>
> On 2020-09-04 18:27, Willem de Bruijn wrote:
> > On Thu, Sep 3, 2020 at 11:00 PM Saeed Mahameed <saeedm@...dia.com> wrote:
> >>
> >> From: Maxim Mikityanskiy <maximmi@...lanox.com>
> >>
> >> A huge function mlx5e_sq_xmit was split into several to achieve multiple
> >> goals:
> >>
> >> 1. Reuse the code in IPoIB.
> >>
> >> 2. Better intergrate with TLS, IPSEC, GENEVE and checksum offloads. Now
> >> it's possible to reserve space in the WQ before running eseg-based
> >> offloads, so:
> >>
> >> 2.1. It's not needed to copy cseg and eseg after mlx5e_fill_sq_frag_edge
> >> anymore.
> >>
> >> 2.2. mlx5e_txqsq_get_next_pi will be used instead of the legacy
> >> mlx5e_fill_sq_frag_edge for better code maintainability and reuse.
> >>
> >> 3. Prepare for the upcoming TX MPWQE for SKBs. It will intervene after
> >> mlx5e_sq_calc_wqe_attr to check if it's possible to use MPWQE, and the
> >> code flow will split into two paths: MPWQE and non-MPWQE.
> >>
> >> Two high-level functions are provided to send packets:
> >>
> >> * mlx5e_xmit is called by the networking stack, runs offloads and sends
> >> the packet. In one of the following patches, MPWQE support will be added
> >> to this flow.
> >>
> >> * mlx5e_sq_xmit_simple is called by the TLS offload, runs only the
> >> checksum offload and sends the packet.
> >>
> >> This change has no performance impact in TCP single stream test and
> >> XDP_TX single stream test.
> >>
> >> UDP pktgen (burst 32), single stream:
> >>    Packet rate: 17.55 Mpps -> 19.23 Mpps
> >>    Instructions per packet: 420 -> 360
> >>    Cycles per packet: 165 -> 142
> >>
> >> CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64)
> >> NIC: Mellanox ConnectX-6 Dx
> >>
> >> To get this performance gain, manual optimizations of function inlining
> >> were performed. It's important to have mlx5e_sq_xmit_wqe inline,
> >> otherwise the packet rate will be 1 Mpps less in UDP pktgen test.
> >> __always_inline is required, because gcc uninlines it when it's called
> >> from two places (mlx5e_xmit and mlx5e_sq_xmit_simple).
> >>
> >> Signed-off-by: Maxim Mikityanskiy <maximmi@...lanox.com>
> >> Signed-off-by: Saeed Mahameed <saeedm@...dia.com>
> >> ---
> >>   .../net/ethernet/mellanox/mlx5/core/en/txrx.h |  63 +--
> >>   .../mellanox/mlx5/core/en_accel/en_accel.h    |   5 +
> >>   .../mellanox/mlx5/core/en_accel/tls_rxtx.c    |   6 +-
> >>   .../net/ethernet/mellanox/mlx5/core/en_tx.c   | 391 ++++++++++--------
> >>   4 files changed, 243 insertions(+), 222 deletions(-)
> >
> > This combines a lot of changes. Including supposed noops, but with
> > subtle changes, like converting to struct initializers.
>
> Struct initializers are mostly used in the new code. I can split out the
> only converted occurrence.
>
> > Probably deserves to be broken up a bit more.
> >
> > For instance, a pure noop patch that moves
> > mlx5e_txwqe_build_eseg_csum,
>
> OK. Not sure I really need to move it though.

Even better.

In general, I don't really care how this patch is simplified, but as
is it is long and combines code moves, refactors that are supposedly a
noop and new functionality. I imagine that there must be some strategy
to break it up into sensible manageable chunks.