Message-ID: <9a03d3bf-c48f-4758-9d7f-a5e7920ec68f@grimberg.me>
Date: Mon, 10 Jun 2024 17:30:34 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Christoph Hellwig <hch@....de>
Cc: Jakub Kicinski <kuba@...nel.org>, Aurelien Aptel <aaptel@...dia.com>,
 linux-nvme@...ts.infradead.org, netdev@...r.kernel.org, kbusch@...nel.org,
 axboe@...com, chaitanyak@...dia.com, davem@...emloft.net
Subject: Re: [PATCH v25 00/20] nvme-tcp receive offloads



On 10/06/2024 15:29, Christoph Hellwig wrote:
> On Mon, Jun 03, 2024 at 10:09:26AM +0300, Sagi Grimberg wrote:
>>> IETF has standardized a generic data placement protocol, which is
>>> part of iWarp.  Even if folks don't like RDMA it exists to solve
>>> exactly these kinds of problems of data placement.
>> iWARP changes the wire protocol.
> Compared to plain NVMe over TCP that's a bit of an understatement :)

Yes :) the comment was that people want to use NVMe/TCP, and adding
DDP awareness inspired by iWARP would change the existing NVMe/TCP wire 
protocol.

This offload does not.

>
>> Is your comment to just go make people
>> use iWARP instead of TCP? or extending NVMe/TCP to natively support DDP?
> I don't know to be honest.  In many ways just using RDMA instead of
> NVMe/TCP would solve all the problems this is trying to solve, but
> there are enough big customers that have religious concerns about
> the use of RDMA.
>
> So if people want to use something that looks non-RDMA but have the
> same benefits we have to reinvent it quite similarly under a different
> name.  Looking at DDP and what we can learn from it without bringing
> the Verbs API along might be one way to do that.
>
> Another would be to figure out what amount of similarity and what
> amount of state we need in an on the wire protocol to have an
> efficient header splitting in the NIC, either hard coded or even
> better downloadable using something like eBPF.

From what I understand, this is what this offload is trying to do. It uses
the nvme command_id much like the read_stag is used in iWARP, tracks the
NVMe/TCP PDUs so it can separate PDU headers from data transfers, and maps
the command_id to an internal MR for DMA purposes.
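
Very roughly, the bookkeeping is something like the sketch below (all the
names and the layout are made up here purely to illustrate the idea, this
is not the actual offload API):

/*
 * Hypothetical sketch: the NVMe/TCP command_id plays the role of iWARP's
 * read_stag, keying a per-queue table that maps each outstanding command
 * to a pre-registered destination buffer so the NIC can place C2H data
 * directly.  Illustration only; not the patchset's interface.
 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DDP_MAX_CMDS 128

struct ddp_buf {
	void	*addr;		/* destination buffer programmed for DDP */
	size_t	 len;		/* its length */
	int	 valid;		/* set while this command_id owns the buffer */
};

/* one table per offloaded TCP stream (i.e. per NVMe/TCP queue) */
struct ddp_queue {
	struct ddp_buf bufs[DDP_MAX_CMDS];
};

/* "program a buffer for DDP": bind a command_id to its destination */
static int ddp_setup(struct ddp_queue *q, uint16_t command_id,
		     void *addr, size_t len)
{
	if (command_id >= DDP_MAX_CMDS || q->bufs[command_id].valid)
		return -1;
	q->bufs[command_id] = (struct ddp_buf){
		.addr = addr, .len = len, .valid = 1,
	};
	return 0;
}

/* "invalidate this buffer": drop the mapping once the command completes */
static void ddp_teardown(struct ddp_queue *q, uint16_t command_id)
{
	if (command_id < DDP_MAX_CMDS)
		memset(&q->bufs[command_id], 0, sizeof(q->bufs[command_id]));
}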

What I think you don't like about this is the interface that the offload
exposes to the TCP ULP driver (nvme-tcp in our case)?

>
>> That would be great, but what does a "vendor independent without hooks"
>> look like from
>> your perspective? I'd love having this translate to standard (and some new)
>> socket operations,
>> but I could not find a way that this can be done given the current
>> architecture.
> Any amount of calls into NIC/offload drivers from NVMe is a nogo.
>

Not following you here...
*something* needs to program a buffer for DDP, *something* needs to
invalidate this buffer, and *something* needs to declare a TCP stream as
DDP capable.

Unless what you're saying is that the interface needs to be generalized
to extend the standard socket operations (i.e.
[s|g]etsockopt/recvmsg/cmsghdr etc.)?
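
i.e. something in the spirit of the sketch below? (the option name, value
and struct are completely invented, just to show what I mean by hanging
this off the standard socket calls):

/*
 * Hypothetical only: no such uapi exists today.  Declares a TCP stream as
 * DDP capable via setsockopt(); the per-command buffer programming could
 * then ride on a similarly invented option or cmsg.
 */
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define TCP_ULP_DDP_ENABLE	200	/* invented optname, for illustration */

struct tcp_ddp_map {			/* invented payload: program one buffer */
	uint32_t	tag;		/* e.g. the NVMe/TCP command_id */
	uint64_t	addr;		/* destination address */
	uint32_t	len;
};

/* "declare a TCP stream as DDP capable" */
static int ddp_enable_stream(int sock)
{
	int one = 1;

	return setsockopt(sock, IPPROTO_TCP, TCP_ULP_DDP_ENABLE,
			  &one, sizeof(one));
}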
