lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d7ddaf65-00a0-4b90-b596-5db1a5169950@kernel.dk>
Date: Thu, 31 Jul 2025 07:29:25 -0600
From: Jens Axboe <axboe@...nel.dk>
To: Aurelien Aptel <aaptel@...dia.com>, linux-nvme@...ts.infradead.org,
 netdev@...r.kernel.org, sagi@...mberg.me, hch@....de, kbusch@...nel.org,
 axboe@...com, chaitanyak@...dia.com, davem@...emloft.net, kuba@...nel.org
Cc: aurelien.aptel@...il.com, smalin@...dia.com, malin1024@...il.com,
 ogerlitz@...dia.com, yorayz@...dia.com, borisp@...dia.com,
 galshalom@...dia.com, mgurtovoy@...dia.com, tariqt@...dia.com,
 gus@...labora.com, viro@...iv.linux.org.uk, akpm@...ux-foundation.org
Subject: Re: [PATCH v30 03/20] iov_iter: skip copy if src == dst for direct
 data placement

On 7/25/25 10:22 AM, Aurelien Aptel wrote:
> Jens Axboe <axboe@...nel.dk> writes:
>> This seems like entirely the wrong place to apply this logic...
> 
> I understand it might look strange on first sight to have the check
> implemented in this low level iterator copy function.
> 
> But it is actually where it makes the most sense.

Please stop rationalizing what is a blatant layering violation.

> Let's assume we move it to the caller, nvme-tcp. We now need read our
> packets from the socket but somehow when we reach the offloaded payload,
> skip ahead. There is no function for that in the socket API. We can walk
> the SKB fragments but a single SKB can contain a combination of
> offloaded and non-offloaded chunks. You now need to re-implement
> skb_copy_datagram() to know what to copy to and where at the socket
> layer. Even if you reuse the callback API you have to take care of the
> destination bvec iterator. You end up duplicating that iterator
> map/step/advance logic.
> 
> Our design is transparent to applications. The application does not need
> to handle the skipping logic, and the skipping is done in a generic way
> in the underlying copy function. It also will work for other protocol
> with no extra code. All the work will happen in the driver, which needs
> to construct SKBs using the application buffers.  Making drivers
> communicate information to the upper layer is already a common
> practice. The SKBs are already assembled by drivers today according to
> how the data is received from the network.
> 
> We have covered some of the design decisions in previous discussion,
> more recently on v25. I suggest you can take a look [1].
> 
> Regarding performances, we were not able to see any measurable
> differences between fio runs with the CONFIG disabled and enabled.

I'm not at all worried about performance, I'm worried about littering
completely generic helper code with random driver checks. And you
clearly WERE worried about performance, since otherwise why would you
even add a IS_ENABLED(CONFIG_ULP_DDP) in the first place if not?

-- 
Jens Axboe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ