Message-ID: <22kf5wtxym5x3zllar7ek3onkav6nfzclf7w2lzifhebjme4jb@h4qycdqmwern>
Date: Fri, 4 Jul 2025 13:11:01 +0000
From: Dragos Tatulea <dtatulea@...dia.com>
To: Parav Pandit <parav@...dia.com>, Jakub Kicinski <kuba@...nel.org>
Cc: "almasrymina@...gle.com" <almasrymina@...gle.com>, 
	"asml.silence@...il.com" <asml.silence@...il.com>, Andrew Lunn <andrew+netdev@...n.ch>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, 
	Saeed Mahameed <saeedm@...dia.com>, Tariq Toukan <tariqt@...dia.com>, 
	Cosmin Ratiu <cratiu@...dia.com>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC net-next 1/4] net: Allow non parent devices to be used for
 ZC DMA

On Thu, Jul 03, 2025 at 01:58:50PM +0200, Parav Pandit wrote:
> 
> > From: Jakub Kicinski <kuba@...nel.org>
> > Sent: 03 July 2025 02:23 AM
> > 
[...]
> > Maybe someone with closer understanding can chime in. If the kind of
> > subfunctions you describe are expected, and there's a generic way of
> > recognizing them -- automatically going to parent of parent would indeed be
> > cleaner and less error prone, as you suggest.
> 
> I am not sure when the parent-of-parent assumption would fail, but it can
> be a good start.
> 
> If an 8-byte netdev extension to store dma_dev is a concern,
> perhaps a netdev IFF_DMA_DEV_PARENT flag would be an elegant way to refer
> to parent->parent? That way there is no guesswork in the devmem layer.
> 
> That said, my understanding of devmem is limited, so I could be mistaken here.
> 
> In the long term, the devmem infrastructure likely needs to be
> modernized to support queue-level DMA mapping.
> This is useful because drivers like mlx5 already support
> socket-direct netdevs that span two PCI devices.
> 
> Currently, devmem is limited to a single PCI device per netdev.
> While the buffer pool could be per device, the actual DMA
> mapping might need to be deferred until buffer posting
> time to support such multi-device scenarios.
> 
> In an offline discussion, Dragos mentioned that io_uring already
> operates at the queue level; maybe some ideas can be picked up
> from io_uring?
The problem for devmem is that the device-based API is already set in
stone, so I am not sure how we can change it. Maybe Mina can chime in.

To sum the conversation up, there are two imperfect and overlapping
solutions:

1) For the common case of having a single PCI device per netdev, going one
   parent up if the parent device is not DMA capable would be a good
   starting point.

2) For multi-PF netdev [0], a per-queue get_dma_dev() op would be ideal,
   as it provides the right PF device for the given queue. io_uring
   could use this, but devmem can't. Devmem could use 1), but the
   driver would have to detect and block the multi-PF case.

I think we need both. Either that or a netdev op with an optional queue
parameter. Any thoughts?

[0] https://docs.kernel.org/networking/multi-pf-netdev.html

Thanks,
Dragos
