Message-ID: <51de8bdf-e8e9-418b-8d6e-c559b8e831df@blackwall.org>
Date: Mon, 22 Sep 2025 15:05:25 +0300
From: Nikolay Aleksandrov <razor@...ckwall.org>
To: Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org
Cc: bpf@...r.kernel.org, kuba@...nel.org, davem@...emloft.net,
	pabeni@...hat.com, willemb@...gle.com, sdf@...ichev.me,
	john.fastabend@...il.com, martin.lau@...nel.org, jordan@...fe.io,
	maciej.fijalkowski@...el.com, magnus.karlsson@...el.com
Subject: Re: [PATCH net-next 00/20] netkit: Support for io_uring zero-copy and AF_XDP

On 9/20/25 00:31, Daniel Borkmann wrote:
> Containers use virtual netdevs to route traffic from a physical netdev
> in the host namespace. They do not have access to the physical netdev
> in the host and thus can't use memory providers or AF_XDP that require
> reconfiguring/restarting queues in the physical netdev.
>
> This patchset adds the concept of queue peering to virtual netdevs that
> allow containers to use memory providers and AF_XDP at _native speed_!
> These mapped queues are bound to a real queue in a physical netdev and
> act as a proxy.
>
> Memory providers and AF_XDP operations take an ifindex and queue id,
> so containers would pass in an ifindex for a virtual netdev and a queue
> id of a mapped queue, which then gets proxied to the underlying real
> queue. Peered queues are created and bound to a real queue atomically
> through a generic ynl netdev operation.
>
> We have implemented support for this concept in netkit and tested the
> latter against Nvidia ConnectX-6 (mlx5) as well as Broadcom BCM957504
> (bnxt_en) 100G NICs. For more details see the individual patches.
>
> Daniel Borkmann (10):
>   net: Add ndo_{peer,unpeer}_queues callback
>   net, ethtool: Disallow mapped real rxqs to be resized
>   xsk: Move NETDEV_XDP_ACT_ZC into generic header
>   xsk: Move pool registration into single function
>   xsk: Add small helper xp_pool_bindable
>   xsk: Change xsk_rcv_check to check netdev/queue_id from pool
>   xsk: Proxy pool management for mapped queues
>   netkit: Add single device mode for netkit
>   netkit: Document fast vs slowpath members via macros
>   netkit: Add xsk support for af_xdp applications
>
> David Wei (10):
>   net, ynl: Add bind-queue operation
>   net: Add peer to netdev_rx_queue
>   net: Add ndo_queue_create callback
>   net, ynl: Implement netdev_nl_bind_queue_doit
>   net, ynl: Add peer info to queue-get response
>   net: Proxy net_mp_{open,close}_rxq for mapped queues
>   netkit: Implement rtnl_link_ops->alloc
>   netkit: Implement ndo_queue_create
>   netkit: Add io_uring zero-copy support for TCP
>   tools, ynl: Add queue binding ynl sample application
>
> Documentation/netlink/specs/netdev.yaml | 54 ++++
> drivers/net/netkit.c | 362 ++++++++++++++++++++----
> include/linux/netdevice.h | 15 +-
> include/net/netdev_queues.h | 1 +
> include/net/netdev_rx_queue.h | 55 ++++
> include/net/xdp_sock_drv.h | 8 +-
> include/uapi/linux/if_link.h | 6 +
> include/uapi/linux/netdev.h | 20 ++
> net/core/netdev-genl-gen.c | 14 +
> net/core/netdev-genl-gen.h | 1 +
> net/core/netdev-genl.c | 144 +++++++++-
> net/core/netdev_rx_queue.c | 15 +-
> net/ethtool/channels.c | 10 +-
> net/xdp/xsk.c | 27 +-
> net/xdp/xsk.h | 5 +-
> net/xdp/xsk_buff_pool.c | 29 +-
> tools/include/uapi/linux/netdev.h | 20 ++
> tools/net/ynl/samples/bind.c | 56 ++++
> 18 files changed, 750 insertions(+), 92 deletions(-)
> create mode 100644 tools/net/ynl/samples/bind.c
>
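
As a rough illustration of the queue proxying described in the cover letter
above, here is a toy Python model. It is not kernel code and does not mirror
the patches: QueueRef, PeerTable, bind and resolve are hypothetical names made
up for this sketch, and the ifindex and queue id values are arbitrary. It only
shows the idea that an (ifindex, queue id) pair on a virtual netdev resolves to
the real queue it was bound to, while an unmapped pair is used as-is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueRef:
    """An (ifindex, queue id) pair, as passed by memory providers or AF_XDP."""
    ifindex: int
    queue_id: int


class PeerTable:
    """Toy stand-in for queue peering; all names here are hypothetical."""

    def __init__(self):
        self._peers = {}

    def bind(self, virt: QueueRef, real: QueueRef) -> None:
        # Assumption for this sketch: a mapped queue is created and bound
        # in one step, and rebinding an existing mapped queue is rejected.
        if virt in self._peers:
            raise ValueError(f"{virt} is already bound")
        self._peers[virt] = real

    def resolve(self, ref: QueueRef) -> QueueRef:
        # A mapped queue is proxied to its real queue; anything else is
        # returned unchanged.
        return self._peers.get(ref, ref)


table = PeerTable()
# Container side: virtual netdev with ifindex 10, mapped queue 1.
# Host side: physical netdev with ifindex 2, real queue 15.
table.bind(QueueRef(10, 1), QueueRef(2, 15))
print(table.resolve(QueueRef(10, 1)))
```

In this model the container only ever names QueueRef(10, 1); the lookup in
resolve() is what stands in for the proxying to the underlying real queue.
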
I have reviewed the set and it looks good to me. To be fair, I have also reviewed
it privately before. I really like the changes; we have discussed some of the
ideas implemented here before. Personally, I especially like the io_uring support
and think that some interesting new use cases will come out of it.

Nice work, for the set:
Reviewed-by: Nikolay Aleksandrov <razor@...ckwall.org>

Cheers,
Nik