Date: Sat, 15 Jun 2024 21:34:52 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Christoph Hellwig' <hch@....de>, Sagi Grimberg <sagi@...mberg.me>
CC: Jakub Kicinski <kuba@...nel.org>, Aurelien Aptel <aaptel@...dia.com>,
	"linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "kbusch@...nel.org"
	<kbusch@...nel.org>, "axboe@...com" <axboe@...com>, "chaitanyak@...dia.com"
	<chaitanyak@...dia.com>, "davem@...emloft.net" <davem@...emloft.net>
Subject: RE: [PATCH v25 00/20] nvme-tcp receive offloads

From: Christoph Hellwig
> Sent: 11 June 2024 07:42
> 
> On Mon, Jun 10, 2024 at 05:30:34PM +0300, Sagi Grimberg wrote:
> >> efficient header splitting in the NIC, either hard coded or even
> >> better downloadable using something like eBPF.
> >
> > From what I understand, this is what this offload is trying to do. It uses
> > the nvme command_id similar to how the read_stag is used in iwarp,
> > it tracks the NVMe/TCP pdus to split pdus from data transfers, and maps
> > the command_id to an internal MR for dma purposes.
> >
> > What I think you don't like about this is the interface that the offload
> > exposes
> > to the TCP ulp driver (nvme-tcp in our case)?
> 
> I don't see why a memory registration is needed at all.
> 
> The by far biggest painpoint when doing storage protocols (including
> file systems) over IP based storage is the data copy on the receive
> path because the payload is not aligned to a page boundary.

How much does the copy cost anyway?
If the hardware has merged the segments then it should be a single copy.
On x86 (does anyone care about anything else :-) 'rep movsb' with a
cache-line aligned destination runs at 64 bytes/clock.
(The source alignment doesn't matter at all.)
I guess it loads the source data into the D-cache; the target is probably
required anyway - or you wouldn't be doing a read.

	David

> 
> So we need to figure out a way that is as stateless as possible that
> allows aligning the actual data payload on a page boundary in an
> otherwise normal IP receive path.


