Message-ID: <aQuq1mhm7cM8kkLY@mini-arch>
Date: Wed, 5 Nov 2025 11:51:50 -0800
From: Stanislav Fomichev <stfomichev@...il.com>
To: David Wei <dw@...idwei.uk>
Cc: Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org,
	bpf@...r.kernel.org, kuba@...nel.org, davem@...emloft.net,
	razor@...ckwall.org, pabeni@...hat.com, willemb@...gle.com,
	sdf@...ichev.me, john.fastabend@...il.com, martin.lau@...nel.org,
	jordan@...fe.io, maciej.fijalkowski@...el.com,
	magnus.karlsson@...el.com, toke@...hat.com,
	yangzhenze@...edance.com, wangdongdong.6@...edance.com
Subject: Re: [PATCH net-next v4 00/14] netkit: Support for io_uring zero-copy
 and AF_XDP

On 11/04, David Wei wrote:
> On 2025-11-04 15:22, Stanislav Fomichev wrote:
> > On 10/31, Daniel Borkmann wrote:
> > > Containers use virtual netdevs to route traffic from a physical netdev
> > > in the host namespace. They do not have access to the physical netdev
> > > in the host and thus can't use memory providers or AF_XDP that require
> > > reconfiguring/restarting queues in the physical netdev.
> > > 
> > > This patchset adds the concept of queue peering for virtual netdevs,
> > > which allows containers to use memory providers and AF_XDP at native speed.
> > > These mapped queues are bound to a real queue in a physical netdev and
> > > act as a proxy.
> > > 
> > > Memory providers and AF_XDP operations take an ifindex and queue id,
> > > so containers would pass in an ifindex for a virtual netdev and a queue
> > > id of a mapped queue, which then gets proxied to the underlying real
> > > queue. Peered queues are created and bound to a real queue atomically
> > > through a generic ynl netdev operation.
> > > 
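
To make the shape of that ynl operation concrete, a caller could look
roughly like the below, using the YnlFamily helper from tools/net/ynl.
The op name follows the bind-queue spec mentioned in the v4 changelog;
the attribute names are my guesses, not the actual spec:

  # Sketch only: assumes the bind-queue op from this series is present
  # in the netdev spec; attribute names are illustrative guesses.
  import sys
  sys.path.append("tools/net/ynl")  # run from a kernel tree
  from lib import YnlFamily

  ynl = YnlFamily("Documentation/netlink/specs/netdev.yaml")
  # Atomically create a peered rx queue on the netkit device and bind
  # it to rx queue 4 of the physical netdev in the host namespace.
  ynl.bind_queue({
      "ifindex": 5,        # netkit ifindex in the container, illustrative
      "dst-ifindex": 2,    # physical netdev ifindex, illustrative
      "dst-queue-id": 4,
      "queue-type": "rx",
  })
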
> > > We have implemented support for this concept in netkit and tested the
> > > latter against Nvidia ConnectX-6 (mlx5) as well as Broadcom BCM957504
> > > (bnxt_en) 100G NICs. For more details see the individual patches.
> > > 
> > > v3->v4:
> > >   - ndo_queue_create store dst queue via arg (Nikolay)
> > >   - Small nits like a spelling issue + reverse xmas tree ordering (Nikolay)
> > >   - admin-perm flag in bind-queue spec (Jakub)
> > >   - Fix potential ABBA deadlock situation in bind (Jakub, Paolo, Stan)
> > >   - Add a peer dev_tracker to not reuse the sysfs one (Jakub)
> > >   - New patch (12/14) to handle the underlying device going away (Jakub)
> > >   - Improve commit message on queue-get (Jakub)
> > >   - Do not expose phys dev info from container on queue-get (Jakub)
> > >   - Add netif_put_rx_queue_peer_locked to simplify code (Stan)
> > >   - Rework xsk handling to simplify the code and drop a few patches
> > >   - Rebase and retest everything with mlx5 + bnxt_en
> > 
> > I mostly looked at patches 1-8 and they look good to me. Will it be
> > possible to put your sample runs from 13 and 14 into a selftest form? Even
> > if you require real hw, that should be doable, similar to
> > tools/testing/selftests/drivers/net/hw/devmem.py, right?
> 
> Thanks for taking a look. For io_uring at least, it requires both a
> routable VIP that can be assigned to the netkit in a netns and a BPF
> program for skb forwarding. I could add a selftest, but it'll be hard to
> generalise across all envs. I'm hoping to get self-contained QEMU VM
> selftest support first. WDYT?

You could start by turning what you have in patch 3 into a selftest.
NIPA runs with the fbnic qemu model, so you should be able to at least
test the netns setup, make sure peer-info works as expected, etc. You
can also verify that things like changing the number of channels are
blocked while a queue is bound to netkit.
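Roughly the shape I have in mind, following the ksft python helpers
that devmem.py uses; the peering step is a hypothetical stand-in, the
real invocation would come from your series:

  # Sketch only: bind_netkit_queue below is hypothetical; the netkit
  # setup mirrors patch 3 rather than any existing helper.
  from lib.py import ksft_run, ksft_exit, ksft_raises, KsftSkipEx
  from lib.py import NetDrvEnv
  from lib.py import cmd, defer, CmdExitFailure


  def bind_netkit_queue(netns, dev, phys_dev, qid):
      # Hypothetical stand-in for the bind-queue ynl op from this
      # series; wire it up to the real operation once the spec lands.
      raise KsftSkipEx("bind-queue op not wired up in this sketch")


  def test_channels_blocked(cfg) -> None:
      # netns with a netkit pair, peer moved into the container side
      cmd("ip netns add nk-test")
      defer(cmd, "ip netns del nk-test")
      cmd("ip link add nk0 type netkit peer name nk1")
      defer(cmd, "ip link del nk0")
      cmd("ip link set nk1 netns nk-test")

      # Peer nk1's rxq 0 to the physical device's rxq 0 (hypothetical).
      bind_netkit_queue("nk-test", "nk1", cfg.ifname, 0)

      # While a queue is bound, channel reconfiguration should fail.
      with ksft_raises(CmdExitFailure):
          cmd(f"ethtool -L {cfg.ifname} combined 1")


  def main() -> None:
      with NetDrvEnv(__file__) as cfg:
          ksft_run([test_channels_blocked], args=(cfg,))
      ksft_exit()


  if __name__ == "__main__":
      main()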

But also, regarding the datapath test, I'm not sure you need another
qemu. I'm not even sure why you need a VIP? You could carve out a
single port and share the host IP in the netns. Alternatively, I think
you can carve out a 192.168.x.y /32 and assign it to the machine. We
have the datapath devmem tests working without any special qemu VMs
(besides, well, the special fbnic qemu, but you should be able to test
on it as well).
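
For instance, continuing the sketch above, the addressing could be as
simple as the following; the address is made up, and whatever extra
routing/forwarding setup netkit needs on top is up to your series:

  # Give the container side a /32 carved out of host space instead of
  # a routable VIP; 192.168.0.2 is just an example address.
  cmd("ip -n nk-test addr add 192.168.0.2/32 dev nk1")
  cmd("ip -n nk-test link set nk1 up")
  cmd("ip -n nk-test route add default dev nk1")
  # Host side: route the /32 towards the netkit primary device.
  cmd("ip link set nk0 up")
  cmd("ip route add 192.168.0.2/32 dev nk0")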
