Message-ID: <2a0bfce0-6226-7b9a-95b1-15f4f1f321e8@gmail.com>
Date: Mon, 18 Jan 2021 21:36:30 -0700
From: David Ahern <dsahern@...il.com>
To: Boris Pismenny <borispismenny@...il.com>,
Boris Pismenny <borisp@...lanox.com>, kuba@...nel.org,
davem@...emloft.net, saeedm@...dia.com, hch@....de,
sagi@...mberg.me, axboe@...com, kbusch@...nel.org,
viro@...iv.linux.org.uk, edumazet@...gle.com, smalin@...vell.com
Cc: boris.pismenny@...il.com, linux-nvme@...ts.infradead.org,
netdev@...r.kernel.org, benishay@...dia.com, ogerlitz@...dia.com,
yorayz@...dia.com, Or Gerlitz <ogerlitz@...lanox.com>,
Yoray Zack <yorayz@...lanox.com>
Subject: Re: [PATCH v2 net-next 19/21] net/mlx5e: NVMEoTCP, data-path for DDP
offload

On 1/17/21 1:42 AM, Boris Pismenny wrote:
> This is needed for a few reasons that are explained in detail
> in the tcp-ddp offload documentation. See patch 21 overview
> and rx-data-path sections. Our reasons are as follows:
I read the documentation patch, and it does not explain this, nor
should it, since the behavior here is very mlx-specific judging by the
changes. Different h/w will have different limitations. Given that, it
would be best to enhance the patch description to explain why these
gymnastics are needed for the skb.
> 1) Each SKB may contain multiple PDUs. DDP offload doesn't operate on
> PDU headers, so these are written in the receive ring. Therefore, we
> need to rebuild the SKB to account for it. Additionally, due to HW
> limitations, we will only offload the first PDU in the SKB.
Are you referring to LRO skbs here? I can't imagine going through this
for 1500-byte packets that have multiple PDUs.
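
To make sure I follow the rebuild you are describing, something along
these lines? (Purely illustrative sketch with made-up names, not the
mlx5 code from this patch: the PDU header bytes come from the receive
ring and go into the linear part, while the payload, already DMA'd by
the HW into the destination pages, is attached as a frag so the stack
still sees the original wire layout.)

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/*
 * Illustrative only: ddp_rebuild_skb() and its parameters are made up
 * to show the idea, not taken from the patch.
 */
static struct sk_buff *ddp_rebuild_skb(struct napi_struct *napi,
				       const void *hdr, unsigned int hdr_len,
				       struct page *payload_page,
				       unsigned int payload_off,
				       unsigned int payload_len)
{
	struct sk_buff *skb;

	skb = napi_alloc_skb(napi, hdr_len);
	if (!skb)
		return NULL;

	/* PDU header from the receive ring, so the ULP can parse it */
	skb_put_data(skb, hdr, hdr_len);

	/*
	 * Payload already sits in the application/pagecache buffer the
	 * HW wrote to; attach it as a frag instead of copying it.
	 * Truesize is approximated with PAGE_SIZE for the sketch.
	 */
	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, payload_page,
			payload_off, payload_len, PAGE_SIZE);

	return skb;
}

And on top of that, only the first PDU in the skb gets this treatment
because of the HW limitation you mention?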
> 2) The newly constructed SKB represents the original data as it is on
> the wire, such that the network stack is oblivious to the offload.
> 3) We decided not to modify all of the mlx5e_skb_from_cqe* functions
> because it would make the offload harder to distinguish, and it would
> add overhead to the existing data-path functions. Therefore, we opted
> for this modular approach.
>
> If we only had generic header-data split, then we just couldn't
> provide this offload. It is not enough to place payload into some
> buffer without TCP headers because RPC protocols and advanced storage
> protocols, such as nvme-tcp, reorder their responses and require data
> to be placed into application/pagecache buffers, which are anything
> but anonymous. In other words, header-data split alone writes data
> to the wrong buffers (reordering), or to anonymous buffers that
> can't be page-flipped to replace application/pagecache buffers.
>
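
To restate the constraint above in code form (hypothetical sketch, not
code from this series): the destination buffer is keyed by the command
id carried in the PDU header, not by arrival order, so the payload has
to land in the buffer that was posted for that particular command, and
anonymous pages from a generic header-data split cannot simply be
flipped into place.

#include <linux/scatterlist.h>
#include <linux/types.h>

/*
 * Hypothetical per-queue table, one entry per outstanding command;
 * ddp_io_map and ddp_find_io() are made-up names for illustration.
 */
struct ddp_io_map {
	u16			command_id;	/* id echoed in the response PDU */
	struct scatterlist	*sgl;		/* application/pagecache buffer  */
	unsigned int		nents;
};

/*
 * The offload resolves where the payload must be written from the
 * command id, which is why the buffers cannot be anonymous and why
 * responses may be consumed out of order.
 */
static struct ddp_io_map *ddp_find_io(struct ddp_io_map *table,
				      unsigned int nr_entries,
				      u16 command_id)
{
	unsigned int i;

	for (i = 0; i < nr_entries; i++)
		if (table[i].command_id == command_id)
			return &table[i];

	return NULL;
}

With only a header split and no such mapping, the payload would land in
the wrong place whenever the target reorders its responses.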