Message-ID: <CAHS8izPKX3rhUytviDa-=Do=jt_gLzE+7H5_0jp+N-6hjHC8dQ@mail.gmail.com>
Date: Thu, 26 Sep 2024 13:50:57 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Stanislav Fomichev <sdf@...ichev.me>
Cc: netdev@...r.kernel.org, davem@...emloft.net, edumazet@...gle.com, 
	kuba@...nel.org, pabeni@...hat.com
Subject: Re: [RFC PATCH net-next v1 1/4] net: devmem: Implement TX path

On Fri, Sep 13, 2024 at 8:09 AM Stanislav Fomichev <sdf@...ichev.me> wrote:
>
> Preliminary implementation of the TX path. The API is as follows:
>
> 1. bind-tx netlink call to attach dmabuf for TX; queue is not
>    required, only netdev for dmabuf attachment
> 2. a set of iovs where iov_base is the offset in the dmabuf and iov_len
>    is the size of the chunk to send; multiple iovs are supported
> 3. SCM_DEVMEM_DMABUF cmsg with the dmabuf id from bind-tx
> 4. MSG_SOCK_DEVMEM sendmsg flag to mirror receive path
>
> In sendmsg, lookup binding by id and refcnt it for every frag in the
> skb. None of the drivers are implemented, but skb_frag_dma_unmap
> should return proper DMA address. Extra care (TODO) must be taken in the
> drivers to not dma_unmap those mappings on completions.
>
> The changes in the kernel/dma/mapping.c are only required to make
> devmem work with virtual networking devices (and they expose 1:1
> identity mapping) and to enable 'loopback' mode. Loopback mode
> lets us test TCP and UAPI paths without having real HW. Not sure
> whether it should be a part of a real upstream submission, but it
> was useful during the development.
>
> TODO:
> - skb_page_unref and __skb_frag_ref seem out of place; unref paths
>   in general need more care
> - potentially something better than tx_iter/tx_vec with its
>   O(len/PAGE_SIZE) lookups
> - move xa_alloc_cyclic to the end
> - potentially better separate bind-rx and bind-tx;
>   direction == DMA_TO_DEVICE feels hacky
> - rename skb_add_rx_frag_netmem to skb_add_frag_netmem
>
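
For reference, steps 2-4 of the quoted API could look roughly like this
from userspace. This is only a sketch under assumptions: the helper names
(devmem_tx_prepare, devmem_tx_send) and struct devmem_tx_req are
hypothetical, the fallback constant values are the ones used by the RX
uapi, and the u32 cmsg payload and SOL_SOCKET level are guesses at what
the RFC expects. The bind-tx netlink call (step 1) is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Assumed fallbacks, copied from the RX uapi; the RFC may differ. */
#ifndef SCM_DEVMEM_DMABUF
#define SCM_DEVMEM_DMABUF 79
#endif
#ifndef MSG_SOCK_DEVMEM
#define MSG_SOCK_DEVMEM 0x2000000
#endif

/* Hypothetical container so the msghdr and its buffers live together. */
struct devmem_tx_req {
	struct msghdr msg;
	struct iovec iov[2];
	union {
		char buf[CMSG_SPACE(sizeof(uint32_t))];
		struct cmsghdr align;	/* forces cmsghdr alignment */
	} ctrl;
};

/*
 * Describe two chunks of the dmabuf. iov_base carries an offset into
 * the dmabuf, not a pointer; iov_len is the chunk size.
 */
void devmem_tx_prepare(struct devmem_tx_req *req, uint32_t dmabuf_id,
		       size_t off0, size_t len0, size_t off1, size_t len1)
{
	struct cmsghdr *cmsg;

	assert(req != NULL);
	memset(req, 0, sizeof(*req));

	req->iov[0].iov_base = (void *)(uintptr_t)off0;
	req->iov[0].iov_len = len0;
	req->iov[1].iov_base = (void *)(uintptr_t)off1;
	req->iov[1].iov_len = len1;

	req->msg.msg_iov = req->iov;
	req->msg.msg_iovlen = 2;
	req->msg.msg_control = req->ctrl.buf;
	req->msg.msg_controllen = sizeof(req->ctrl.buf);

	/* Pass the dmabuf id returned by bind-tx (u32 payload assumed). */
	cmsg = CMSG_FIRSTHDR(&req->msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_DEVMEM_DMABUF;
	cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t));
	memcpy(CMSG_DATA(cmsg), &dmabuf_id, sizeof(dmabuf_id));
}

/* Send with the devmem flag; fd is a connected TCP socket. */
ssize_t devmem_tx_send(int fd, struct devmem_tx_req *req)
{
	return sendmsg(fd, &req->msg, MSG_SOCK_DEVMEM);
}
```

The request must be used in place and not copied, since msg_iov and
msg_control point back into the same struct.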

Thank you very much for this, and sorry for the late reply. I think I
got busy with some post RX merge follow ups and then other stuff.
Coming back to look at this now.

This looks like a great start. Agreed with many of the todos above,
and in addition some things I wanna look deeper into (but not
necessarily set on changing yet):

Loopback: I do plan to drop that. My understanding is that it's a bit
complicated to make work. In addition to the mapping.c changes, the TX
zerocopy code falls back to copying for loopback for reasons I don't
have my head wrapped around. devmem can't be copied. You get around
that with a change in skb_copy_ubufs but I'm not sure we can assume
success there. In any case I don't have a use case for loopback and
it can be made to work properly later.

control path locking: You added net_devmem_dmabuf_lock, but AFAICT
dma-buf allocation should be already mutexed by rtnl_lock. Maybe I
missed something. I'll take a deeper look.

fast path locking: you use rcu, which is a good way to do it. I had
something else in mind, where we associate the binding with a socket
and keep it alive for the duration of the socket and (I think) no need
to lock anymore. Not sure which is better. Associating the binding
with a socket does require uapi. But it may be good to keep the
binding alive while the socket is using it anyway, rather than the
sendmsg returning -EINVAL if the binding has been freed underneath it.
I'll take a deeper look.
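
To make the two fast path options concrete, here is a toy, single-threaded
userspace model. It is not kernel code: every toy_* name is made up for
illustration, a plain int stands in for the real refcount, and the
per-frag refs and rcu of the actual patch are left out. It only shows the
behavioural difference after an unbind.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the binding and the socket. */
struct toy_binding {
	int refs;	/* binding is freed when this reaches 0 */
	bool in_table;	/* still reachable by dmabuf id lookup */
};

struct toy_sock {
	struct toy_binding *binding;	/* option B only */
};

/* Control path: bind takes the initial ref and publishes the binding. */
void toy_bind(struct toy_binding *b)
{
	b->refs = 1;
	b->in_table = true;
}

/* Control path: unbind hides the binding and drops the initial ref. */
void toy_unbind(struct toy_binding *b)
{
	b->in_table = false;
	b->refs--;
}

/* Option A: look the binding up on every send; fails once unbound. */
int toy_send_lookup(const struct toy_binding *b)
{
	return b->in_table ? 0 : -EINVAL;
}

/* Option B: the socket takes its own ref once, at attach time. */
int toy_sock_attach(struct toy_sock *sk, struct toy_binding *b)
{
	if (!b->in_table)
		return -EINVAL;
	b->refs++;
	sk->binding = b;
	return 0;
}

/* Option B: no lookup on send; the socket's ref keeps the binding alive. */
int toy_send_attached(const struct toy_sock *sk)
{
	assert(sk->binding == NULL || sk->binding->refs > 0);
	return sk->binding ? 0 : -EINVAL;
}

/* Option B: closing the socket drops its ref. */
void toy_sock_close(struct toy_sock *sk)
{
	if (sk->binding) {
		sk->binding->refs--;
		sk->binding = NULL;
	}
}
```

In this model, after toy_unbind() the lookup-based send returns -EINVAL
while the attached socket can still send until toy_sock_close() drops the
last ref.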

get_page/put_page: I was thinking we need to implement
get_netmem/put_netmem equivalents as the tx path uses
get_page/put_page and page_pool refcounting is not used there. You
seem to instead ref/unref the binding. That may be fine, but we may
need get_page/put_page equivalents for netmem eventually and may be
worth getting them done now. I need to rack my brain a bit more.
