Message-ID: <2537c2gzk6x.fsf@nvidia.com>
Date: Fri, 16 May 2025 17:47:34 +0300
From: Aurelien Aptel <aaptel@...dia.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: linux-nvme@...ts.infradead.org, netdev@...r.kernel.org,
 sagi@...mberg.me, hch@....de, kbusch@...nel.org, axboe@...com,
 chaitanyak@...dia.com, davem@...emloft.net, kuba@...nel.org, Boris
 Pismenny <borisp@...dia.com>, aurelien.aptel@...il.com, smalin@...dia.com,
 malin1024@...il.com, ogerlitz@...dia.com, yorayz@...dia.com,
 galshalom@...dia.com, mgurtovoy@...dia.com, tariqt@...dia.com,
 gus@...labora.com, pabeni@...hat.com, dsahern@...nel.org, ast@...nel.org,
 jacob.e.keller@...el.com
Subject: Re: [PATCH v28 01/20] net: Introduce direct data placement tcp offload

Hi Eric,

We have looked into your suggestions, but both have drawbacks.

The first idea was to make the tailroom small/empty to prevent
condensing. The issue is that the header is already placed at the skb
head, and there could be another PDU after the first payload. Placing
the header at the tail of the skb would require copying (which we want
to avoid) and could potentially overwrite anything after it.

The second idea was to use the unreadable bit. We tried setting the bit
in the driver and updating tcp_collapse() to copy the bit along with
other bits. However, making the skb unreadable causes issues at the
other end when the NVMe driver reads from it, as the unreadable bit
makes it, well, unreadable. If you look at __skb_datagram_iter(), you'll
see it errors out when skb_frags_readable(skb) is false.
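
To illustrate the problem, here is a rough userspace model of that check.
It is not kernel code: struct toy_skb, toy_frags_readable() and
toy_datagram_iter() are made-up stand-ins, and the -EFAULT return value
is an assumption made for the sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an skb: a linear head plus a single paged fragment. */
struct toy_skb {
	const char *head;
	size_t head_len;
	const char *frag;
	size_t frag_len;
	bool unreadable;	/* models the skb "unreadable" bit */
};

/* Models skb_frags_readable(): frags are readable unless the bit is set. */
static bool toy_frags_readable(const struct toy_skb *skb)
{
	return !skb->unreadable;
}

/*
 * Copy up to len bytes (head first, then the fragment) into dst.
 * Returns the number of bytes copied, or -EFAULT (assumed error code)
 * when fragment data is needed but the skb is marked unreadable.
 */
static long toy_datagram_iter(const struct toy_skb *skb, char *dst, size_t len)
{
	size_t copied = 0, n;

	/* The linear head can always be copied. */
	n = skb->head_len < len ? skb->head_len : len;
	memcpy(dst, skb->head, n);
	copied += n;
	if (copied == len)
		return (long)copied;

	/* The remaining bytes live in the fragment: refuse if unreadable. */
	if (!toy_frags_readable(skb))
		return -EFAULT;

	n = skb->frag_len < len - copied ? skb->frag_len : len - copied;
	memcpy(dst + copied, skb->frag, n);
	copied += n;
	return (long)copied;
}
```

In this model a read that stays within the head still succeeds, but any
read that reaches the fragment fails once the bit is set, which is the
behaviour that breaks the receive path for us.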

The offload works by calling the iter copy functions while skipping the
memcpy (see patch 3). We think the unreadable bit comes close to what
we want, were it not for the __skb_datagram_iter() check. Maybe the
bit could be cleared at a later stage, but it's not clear where.
Alternatively, might the no_condense bit be a good compromise? The skb
would stay readable but not condensable.
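
As a rough userspace sketch of what "readable but not condensable"
could mean (again not kernel code: struct toy_cskb, toy_condense() and
toy_collapse_copy_flags() are hypothetical names used only for
illustration):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy skb: a fixed-size head buffer plus a single paged fragment. */
struct toy_cskb {
	char head[64];
	size_t head_len;
	const char *frag;
	size_t frag_len;
	bool no_condense;	/* models the proposed bit */
};

/*
 * Models condensing: pull the fragment into the head's tailroom when it
 * fits, unless the skb is flagged. Returns true if data was moved.
 */
static bool toy_condense(struct toy_cskb *skb)
{
	if (skb->no_condense || skb->frag_len == 0)
		return false;
	if (skb->frag_len > sizeof(skb->head) - skb->head_len)
		return false;

	memcpy(skb->head + skb->head_len, skb->frag, skb->frag_len);
	skb->head_len += skb->frag_len;
	skb->frag = NULL;
	skb->frag_len = 0;
	return true;
}

/* Models a collapse path carrying the bit over to the new skb. */
static void toy_collapse_copy_flags(struct toy_cskb *dst,
				    const struct toy_cskb *src)
{
	dst->no_condense = src->no_condense;
}
```

The fragment is left where it is when the bit is set, the bit survives
a collapse, and nothing on the read side needs to treat the skb
specially.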

Thanks
