Date: Thu, 10 Aug 2023 11:44:53 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Christian König <christian.koenig@....com>
Cc: netdev@...r.kernel.org, linux-media@...r.kernel.org, 
	dri-devel@...ts.freedesktop.org, "David S. Miller" <davem@...emloft.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	Jesper Dangaard Brouer <hawk@...nel.org>, Ilias Apalodimas <ilias.apalodimas@...aro.org>, 
	Arnd Bergmann <arnd@...db.de>, David Ahern <dsahern@...nel.org>, 
	Willem de Bruijn <willemdebruijn.kernel@...il.com>, Sumit Semwal <sumit.semwal@...aro.org>, 
	Jason Gunthorpe <jgg@...pe.ca>, Hari Ramakrishnan <rharix@...gle.com>, 
	Dan Williams <dan.j.williams@...el.com>, Andy Lutomirski <luto@...nel.org>, stephen@...workplumber.org, 
	sdf@...gle.com
Subject: Re: [RFC PATCH v2 00/11] Device Memory TCP

On Thu, Aug 10, 2023 at 3:29 AM Christian König
<christian.koenig@....com> wrote:
>
> On 10.08.23 at 03:57, Mina Almasry wrote:
> > Changes in RFC v2:
> > ------------------
> >
> > The sticking point in RFC v1[1] was the dma-buf pages approach we used to
> > deliver the device memory to the TCP stack. RFC v2 is a proof-of-concept
> > that attempts to resolve this by implementing scatterlist support in the
> > networking stack, such that we can import the dma-buf scatterlist
> > directly.
>
> Impressive work, I didn't think this would be possible that "easily".
>
> Please note that we have considered replacing scatterlists with simple
> arrays of DMA-addresses in the DMA-buf framework to avoid people trying
> to access the struct page inside the scatterlist.
>

FWIW, I'm not doing anything with the struct pages inside the
scatterlist. All I need from the scatterlist are sg_dma_address(sg)
and sg_dma_len(sg), and I'm guessing the array you're describing
would provide exactly those, but let me know if I've misunderstood.
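
To illustrate, the import on my end boils down to roughly the below
(simplified sketch, not the exact code in the patch; the binding struct and
the devmem_add_dma_range() helper are just placeholder names):

static int devmem_import_dmabuf_sgt(struct netdev_dmabuf_binding *binding,
                                    struct dma_buf_attachment *attach)
{
        struct sg_table *sgt;
        struct scatterlist *sg;
        int i;

        sgt = dma_buf_map_attachment(attach, DMA_FROM_DEVICE);
        if (IS_ERR(sgt))
                return PTR_ERR(sgt);

        /* Only the DMA address and length of each entry are consumed;
         * the struct page inside the scatterlist is never touched.
         */
        for_each_sgtable_dma_sg(sgt, sg, i)
                devmem_add_dma_range(binding, sg_dma_address(sg),
                                     sg_dma_len(sg));

        return 0;
}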

> It might be a good idea to push for that first before this here is
> finally implemented.
>
> GPU drivers already convert the scatterlists they receive into arrays of
> DMA-addresses as soon as they get them. This leaves RDMA and V4L as the
> other two main users which would need to be converted.
>
> >   This is the approach proposed at a high level here[2].
> >
> > Detailed changes:
> > 1. Replaced dma-buf pages approach with importing scatterlist into the
> >     page pool.
> > 2. Replace the dma-buf pages centric API with a netlink API.
> > 3. Removed the TX path implementation - there is no issue with
> >     implementing the TX path with scatterlist approach, but leaving
> >     out the TX path makes it easier to review.
> > 4. Functionality is tested with this proposal, but I have not conducted
> >     perf testing yet. I'm not sure there are regressions, but I removed
> >     perf claims from the cover letter until they can be re-confirmed.
> > 5. Added Signed-off-by tags for contributors to the implementation.
> > 6. Fixed some bugs with the RX path since RFC v1.
> >
> > Any feedback welcome, but specifically the biggest pending questions
> > needing feedback IMO are:
> >
> > 1. Feedback on the scatterlist-based approach in general.
>
> As far as I can see this sounds like the right thing to do in general.
>
> The question is rather whether we should stick with scatterlists, use arrays
> of DMA-addresses, or maybe even come up with a completely new structure.
>

As far as I can tell, it should be trivial to switch this device
memory TCP implementation to anything that provides:

1. DMA-addresses (sg_dma_address() equivalent)
2. lengths (sg_dma_len() equivalent)

if you go that route. Specifically, I think it will be pretty much a
localized change to netdev_bind_dmabuf_to_queue() implemented in this
patch:
https://lore.kernel.org/netdev/ZNULIDzuVVyfyMq2@ziepe.ca/T/#m2d344b08f54562cc9155c3f5b018cbfaed96036f
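
In other words, if dma-buf ended up handing out something like the below
instead of an sg_table (a purely hypothetical structure, only to make the
requirement concrete), that function would just need its import loop
swapped for a plain array walk:

/* Hypothetical array-based alternative to the sg_table import. */
struct dma_range {
        dma_addr_t addr;
        unsigned int len;
};

struct dma_range_array {
        unsigned int nr;
        struct dma_range ranges[];
};

static void devmem_import_dma_ranges(struct netdev_dmabuf_binding *binding,
                                     const struct dma_range_array *arr)
{
        unsigned int i;

        /* Same consumption as the scatterlist case: address + length only. */
        for (i = 0; i < arr->nr; i++)
                devmem_add_dma_range(binding, arr->ranges[i].addr,
                                     arr->ranges[i].len);
}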

> > 2. Netlink API (Patch 1 & 2).
>
> How does netlink manage the lifetime of objects?
>

Netlink itself doesn't manage the lifetime of the binding. However,
the API I implemented unbinds the dma-buf when the netlink socket is
destroyed, so even if the user process crashes or forgets to unbind,
the dma-buf is still unbound once the netlink socket is closed on
process exit. Details in this patch:
https://lore.kernel.org/netdev/ZNULIDzuVVyfyMq2@ziepe.ca/T/#m2d344b08f54562cc9155c3f5b018cbfaed96036f
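
Something like the below is the shape of it (rough sketch only, locking
elided; the binding list, its members, and the unbind helper are placeholder
names, but netlink_register_notifier()/NETLINK_URELEASE is the standard
mechanism for noticing that a netlink socket has gone away):

/* Tear down any bindings owned by a netlink socket when that socket is
 * released, so a crashed or exited process cannot leak the dma-buf.
 */
static int netdev_dmabuf_netlink_notify(struct notifier_block *nb,
                                        unsigned long state, void *_notify)
{
        struct netlink_notify *notify = _notify;
        struct netdev_dmabuf_binding *binding, *tmp;

        if (state != NETLINK_URELEASE)
                return NOTIFY_DONE;

        /* netdev_dmabuf_bindings is a placeholder for the global list of
         * active bindings, keyed by the owning socket's portid.
         */
        list_for_each_entry_safe(binding, tmp, &netdev_dmabuf_bindings, list)
                if (binding->owner_portid == notify->portid)
                        netdev_unbind_dmabuf_from_queue(binding);

        return NOTIFY_OK;
}

static struct notifier_block netdev_dmabuf_nl_notifier = {
        .notifier_call = netdev_dmabuf_netlink_notify,
};

/* registered once at init: netlink_register_notifier(&netdev_dmabuf_nl_notifier); */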

On Thu, Aug 10, 2023 at 9:07 AM Jason Gunthorpe <jgg@...pe.ca> wrote:
>
> On Thu, Aug 10, 2023 at 12:29:08PM +0200, Christian König wrote:
> > On 10.08.23 at 03:57, Mina Almasry wrote:
> > > Changes in RFC v2:
> > > ------------------
> > >
> > > The sticking point in RFC v1[1] was the dma-buf pages approach we used to
> > > deliver the device memory to the TCP stack. RFC v2 is a proof-of-concept
> > > that attempts to resolve this by implementing scatterlist support in the
> > > networking stack, such that we can import the dma-buf scatterlist
> > > directly.
> >
> > Impressive work, I didn't think this would be possible that "easily".
> >
> > Please note that we have considered replacing scatterlists with simple
> > arrays of DMA-addresses in the DMA-buf framework to avoid people trying to
> > access the struct page inside the scatterlist.
> >
> > It might be a good idea to push for that first before this here is finally
> > implemented.
> >
> > GPU drivers already convert the scatterlists they receive into arrays of
> > DMA-addresses as soon as they get them. This leaves RDMA and V4L as the
> > other two main users which would need to be converted.
>
> Oh that would be a nightmare for RDMA.
>
> We need a standards-based way to have scalable lists of DMA addresses :(
>
> > > 2. Netlink API (Patch 1 & 2).
> >
> > How does netlink manage the lifetime of objects?
>
> And access control..
>

Someone will correct me if I'm wrong, but I'm not sure netlink itself
does (sufficient) access control. However, I meant for the netlink
API that binds dma-bufs to be a CAP_NET_ADMIN API; I forgot to add
that check in this proof-of-concept, sorry. I'll add a CAP_NET_ADMIN
check in netdev_bind_dmabuf_to_queue() in the next iteration.
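
Concretely, something like the below at the top of the handler (sketch only;
the exact signature depends on how the netlink op ends up being defined, and
if it goes through genetlink then GENL_ADMIN_PERM on the op would achieve
the same effect):

static int netdev_bind_dmabuf_to_queue(struct sk_buff *skb,
                                       struct genl_info *info)
{
        /* Restrict dma-buf binding to privileged users. */
        if (!netlink_capable(skb, CAP_NET_ADMIN))
                return -EPERM;

        /* ... existing bind logic from the patch ... */
        return 0;
}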
