Message-ID: <6f48fa5d-465c-5c38-ea45-704e86ba808b@gmail.com>
Date: Mon, 7 Dec 2020 17:42:14 -0700
From: David Ahern <dsahern@...il.com>
To: Boris Pismenny <borisp@...lanox.com>, kuba@...nel.org,
davem@...emloft.net, saeedm@...dia.com, hch@....de,
sagi@...mberg.me, axboe@...com, kbusch@...nel.org,
viro@...iv.linux.org.uk, edumazet@...gle.com
Cc: boris.pismenny@...il.com, linux-nvme@...ts.infradead.org,
netdev@...r.kernel.org, benishay@...dia.com, ogerlitz@...dia.com,
yorayz@...dia.com, Ben Ben-Ishay <benishay@...lanox.com>,
Or Gerlitz <ogerlitz@...lanox.com>,
Yoray Zack <yorayz@...lanox.com>
Subject: Re: [PATCH v1 net-next 02/15] net: Introduce direct data placement
tcp offload
On 12/7/20 2:06 PM, Boris Pismenny wrote:
> This commit introduces direct data placement (DDP) offload for TCP.
> This capability is accompanied by new net_device operations that
> configure hardware contexts: there is a context per socket, and a
> context per DDP operation. Additionally, a resynchronization routine
> is used to help the hardware handle TCP out-of-order delivery and
> continue the offload. Furthermore, we let the offloading driver
> advertise its maximum number of hw sectors/segments.
>
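[For context, a minimal C sketch of the kind of driver-facing hooks this
paragraph describes. All names below (ddp_limits, ddp_dev_ops, the
sk_add/sk_del, setup/teardown, resync and limits callbacks) are illustrative
placeholders, not necessarily the API this patch adds:]

#include <linux/types.h>
#include <linux/netdevice.h>
#include <net/sock.h>

/* Limits a driver could advertise (max hw sectors/segments); illustrative. */
struct ddp_limits {
	int max_ddp_sgl_len;	/* max scatter-gather segments per operation */
	int max_hw_sectors;	/* max sectors per operation */
};

/*
 * Illustrative per-netdev DDP hooks: one context per socket (sk_add/sk_del),
 * one context per DDP operation (setup/teardown), a resync callback that
 * helps the hardware re-acquire the stream after out-of-order delivery,
 * and a query for the limits above.
 */
struct ddp_dev_ops {
	int	(*sk_add)(struct net_device *dev, struct sock *sk);
	void	(*sk_del)(struct net_device *dev, struct sock *sk);
	int	(*setup)(struct net_device *dev, struct sock *sk,
			 void *ddp_ctx);
	void	(*teardown)(struct net_device *dev, struct sock *sk,
			    void *ddp_ctx);
	void	(*resync)(struct net_device *dev, struct sock *sk, u32 seq);
	int	(*limits)(struct net_device *dev, struct ddp_limits *limits);
};
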
> Using this interface, the NIC hardware will scatter TCP payload directly
> to the BIO pages according to the command_id.
> To maintain the correctness of the network stack, the driver is expected
> to construct SKBs that point to the BIO pages.
>
> Thus, the SKB represents the data on the wire, while it points
> to data that is already placed in the destination buffer.
> As a result, data from page frags should not be copied out to
> the linear part.
>
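[As an illustration of the SKB construction described above, a hedged sketch
follows. It assumes the driver already knows the destination page, offset and
length selected by the command_id; the helper name ddp_attach_placed_page is
hypothetical, while skb_add_rx_frag() is an existing kernel helper:]

#include <linux/skbuff.h>
#include <linux/mm.h>

/*
 * Sketch: attach a page into which the NIC already placed the TCP payload
 * as an skb fragment, so the skb still describes the bytes on the wire
 * while pointing at the destination buffer (no copy into the linear part).
 */
static void ddp_attach_placed_page(struct sk_buff *skb, int frag_idx,
				   struct page *dst_page, int offset, int len)
{
	get_page(dst_page);	/* the skb now holds a reference to the page */
	skb_add_rx_frag(skb, frag_idx, dst_page, offset, len, len);
}
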
> As SKBs that use DDP are already very memory efficient, we modify
> skb_condense to avoid copying data from fragments to the linear
> part of SKBs that belong to a socket that uses DDP offload.
>
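[For illustration, the skb_condense() change described here amounts to one
more early-return condition. In this sketch the predicate skb_is_ddp() is a
placeholder for however the patch actually marks DDP skbs/sockets; the rest
mirrors the existing function:]

#include <linux/skbuff.h>

/* Placeholder: however the patch marks skbs whose frags were DDP-placed. */
static inline bool skb_is_ddp(const struct sk_buff *skb)
{
	return false;	/* illustrative stub */
}

void skb_condense(struct sk_buff *skb)
{
	if (skb->data_len) {
		/* Bail out for cloned skbs, for frags that do not fit in the
		 * tail room, and (new) for DDP skbs whose frags already sit
		 * in the destination buffer and must not be copied out.
		 */
		if (skb->data_len > skb->end - skb->tail ||
		    skb_cloned(skb) || skb_is_ddp(skb))
			return;

		/* Otherwise pull frag data into the linear part and free
		 * the page frag(s) right now.
		 */
		__pskb_pull_tail(skb, skb->data_len);
	}
	/* truesize may have been over-estimated; fix it up. */
	skb->truesize = SKB_TRUESIZE(skb_end_offset(skb));
}
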
> A follow-up patch will use this interface for DDP in NVMe-TCP.
>
You call this Direct Data Placement, which sounds like a marketing name.
Fundamentally, this starts with offloading TCP socket buffers for a
specific flow, so generically it is TCP Rx zerocopy for kernel-stack-managed
sockets (as opposed to AF_XDP's zerocopy). Why not build that level of
infrastructure first and add ULPs like NVMe on top?