Message-ID: <72746760-f045-d7bc-1557-255720d7638d@grimberg.me>
Date:   Thu, 23 Feb 2023 17:33:40 +0200
From:   Sagi Grimberg <sagi@...mberg.me>
To:     Aurelien Aptel <aaptel@...dia.com>, linux-nvme@...ts.infradead.org,
        netdev@...r.kernel.org, hch@....de, kbusch@...nel.org,
        axboe@...com, chaitanyak@...dia.com, davem@...emloft.net,
        kuba@...nel.org
Cc:     aurelien.aptel@...il.com, smalin@...dia.com, malin1024@...il.com,
        ogerlitz@...dia.com, yorayz@...dia.com, borisp@...dia.com
Subject: Re: [PATCH v11 00/25] nvme-tcp receive offloads


> Hi,
> 
> Here is the next iteration of our nvme-tcp receive offload series.
> 
> The main changes are in patch 3 (netlink).
> 
> Rebased on top of today net-next
> 8065c0e13f98 ("Merge branch 'yt8531-support'")
> 
> The changes are also available through git:
> 
> Repo: https://github.com/aaptel/linux.git branch nvme-rx-offload-v11
> Web: https://github.com/aaptel/linux/tree/nvme-rx-offload-v11
> 
> The NVMeTCP offload was presented in netdev 0x16 (video now available):
> - https://netdevconf.info/0x16/session.html?NVMeTCP-Offload-%E2%80%93-Implementation-and-Performance-Gains
> - https://youtu.be/W74TR-SNgi4
> 
> From: Aurelien Aptel <aaptel@...dia.com>
> From: Shai Malin <smalin@...dia.com>
> From: Ben Ben-Ishay <benishay@...dia.com>
> From: Boris Pismenny <borisp@...dia.com>
> From: Or Gerlitz <ogerlitz@...dia.com>
> From: Yoray Zack <yorayz@...dia.com>

Hey Aurelien and Co,

I've spent some time today looking at the last iteration of this,
and what I cannot understand is how this will ever be used outside
of the kernel nvme-tcp host driver.

It seems that the interface is designed to fit only a kernel
consumer, and a very specific one at that.

Have you considered exposing this through more standard interfaces,
so that spdk or an io_uring based initiator could use it as well?

To me it appears that:
- ddp limits can be obtained via getsockopt
- sk_add/sk_del can be done via setsockopt
- offloaded DDGST crc can be obtained via something like
   msghdr.msg_control
- Perhaps for setting up the offload per IO, recvmsg would be the
   vehicle, with a new msg flag MSG_RCV_DDP or similar that hides
   all the details of what the HW needs (the command_id would be set
   somewhere in the msghdr).
- And all of the resync flow would be handled by a separate ulp
   socket provider, similar to how TLS presents itself to a tcp
   application, so the application does not need to be aware of it.
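
To make that a bit more concrete, here is a rough, purely hypothetical
sketch of what such a userspace-facing flow could look like. None of
the SOL_ULP_DDP / ULP_DDP_* / MSG_RCV_DDP names below exist today;
they are invented placeholders for the getsockopt/setsockopt/recvmsg
shape described above:

/* Hypothetical sketch only; none of these names are real uapi. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define SOL_ULP_DDP          300        /* hypothetical socket level       */
#define ULP_DDP_LIMITS         1        /* hypothetical: query HW limits   */
#define ULP_DDP_SK_ADD         2        /* hypothetical: offload this sock */
#define ULP_DDP_DDGST_CRC      3        /* hypothetical cmsg type          */
#define MSG_RCV_DDP     0x800000        /* hypothetical recvmsg flag       */

struct ulp_ddp_limits {                 /* hypothetical layout */
	uint32_t max_ddp_sgl_len;
	uint32_t io_threshold;
};

/* ddp limits via getsockopt, sk_add via setsockopt */
static int ddp_setup(int fd)
{
	struct ulp_ddp_limits limits;
	socklen_t len = sizeof(limits);
	int one = 1;

	if (getsockopt(fd, SOL_ULP_DDP, ULP_DDP_LIMITS, &limits, &len))
		return -1;

	return setsockopt(fd, SOL_ULP_DDP, ULP_DDP_SK_ADD, &one, sizeof(one));
}

/* Per-IO setup hidden behind recvmsg(MSG_RCV_DDP); the command_id the
 * HW needs would travel somewhere in the msghdr as well (elided here),
 * and the offloaded DDGST result comes back through msg_control. */
static ssize_t ddp_recv(int fd, void *buf, size_t buflen)
{
	struct iovec iov = { .iov_base = buf, .iov_len = buflen };
	char cbuf[CMSG_SPACE(sizeof(uint32_t))];
	struct msghdr msg = {
		.msg_iov        = &iov,
		.msg_iovlen     = 1,
		.msg_control    = cbuf,
		.msg_controllen = sizeof(cbuf),
	};
	struct cmsghdr *cmsg;
	ssize_t n;

	n = recvmsg(fd, &msg, MSG_RCV_DDP);
	if (n < 0)
		return n;

	for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
		if (cmsg->cmsg_level == SOL_ULP_DDP &&
		    cmsg->cmsg_type == ULP_DDP_DDGST_CRC) {
			uint32_t crc_ok;

			memcpy(&crc_ok, CMSG_DATA(cmsg), sizeof(crc_ok));
			printf("HW-verified ddgst: %u\n", crc_ok);
		}
	}
	return n;
}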

I'm not sure that such an interface could cover everything that is
needed, but what I'm trying to convey is that the current interface
limits the usability for almost anything else. Please correct me if
I'm wrong. Is this designed to also cater to anything outside of the
kernel nvme-tcp host driver?
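
For reference on the ulp-socket-provider analogy above, this is
roughly how kernel TLS attaches to an established TCP socket today;
a DDP ulp provider could in principle present itself the same way
(the "nvme_tcp_ddp" name below is invented, not an existing module):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* kTLS attaches as a ULP on a connected TCP socket; "tls" is the
 * real kernel TLS ULP (IPPROTO_TCP == SOL_TCP here). */
static int attach_ktls(int fd)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
}

/* A hypothetical DDP provider could hook in the same way; the
 * "nvme_tcp_ddp" name is only for illustration. */
static int attach_ddp_ulp(int fd)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_ULP, "nvme_tcp_ddp",
			  sizeof("nvme_tcp_ddp"));
}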

> Compatibility
> =============
> * The offload works with bare-metal or SRIOV.
> * The HW can support up to 64K connections per device (assuming no
>    other HW accelerations are used). In this series, we will introduce
>    the support for up to 4k connections, and we have plans to increase it.
> * SW TLS could not work together with the NVMeTCP offload as the HW
>    will need to track the NVMeTCP headers in the TCP stream.

Can't say I like that.

> * The ConnectX HW supports HW TLS, but in ConnectX-7 those features
>    could not co-exist (and it is not part of this series).
> * The NVMeTCP offload ConnectX-7 HW can support tunneling, but we
>    don’t see the need for this feature yet.
> * NVMe poll queues are not in the scope of this series.

bonding/teaming?

> 
> Future Work
> ===========
> * NVMeTCP transmit offload.
> * NVMeTCP host offloads incremental features.
> * NVMeTCP target offload.

Which target? Which host?
