Message-ID: <87zfalpf8w.fsf@toke.dk>
Date: Tue, 23 Sep 2025 13:42:07 +0200
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org
Cc: bpf@...r.kernel.org, kuba@...nel.org, davem@...emloft.net,
 razor@...ckwall.org, pabeni@...hat.com, willemb@...gle.com,
 sdf@...ichev.me, john.fastabend@...il.com, martin.lau@...nel.org,
 jordan@...fe.io, maciej.fijalkowski@...el.com, magnus.karlsson@...el.com,
 David Wei <dw@...idwei.uk>
Subject: Re: [PATCH net-next 19/20] netkit: Add xsk support for af_xdp
 applications

Daniel Borkmann <daniel@...earbox.net> writes:

> Enable support for AF_XDP applications to operate on a netkit device.
> The goal is that AF_XDP applications can natively consume AF_XDP
> from network namespaces. The use-case from Cilium side is to support
> Kubernetes KubeVirt VMs through QEMU's AF_XDP backend. KubeVirt is a
> virtual machine management add-on for Kubernetes which aims to provide
> a common ground for virtualization. KubeVirt spawns the VMs inside
> Kubernetes Pods which reside in their own network namespace just like
> regular Pods.
>
> Raw QEMU AF_XDP backend example with eth0 being a physical device with
> 16 queues where netkit is bound to the last queue (for multi-queue RSS
> context can be used if supported by the driver):
>
>   # ethtool -X eth0 start 0 equal 15
>   # ethtool -X eth0 start 15 equal 1 context new
>   # ethtool --config-ntuple eth0 flow-type ether \
>             src 00:00:00:00:00:00 \
>             src-mask ff:ff:ff:ff:ff:ff \
>             dst $mac dst-mask 00:00:00:00:00:00 \
>             proto 0 proto-mask 0xffff action 15
>   # ip netns add foo
>   # ip link add numrxqueues 2 nk type netkit single
>   # ynl-bind eth0 15 nk
>   # ip link set nk netns foo
>   # ip netns exec foo ip link set lo up
>   # ip netns exec foo ip link set nk up
>   # ip netns exec foo qemu-system-x86_64 \
>           -kernel $kernel \
>           -drive file=${image_name},index=0,media=disk,format=raw \
>           -append "root=/dev/sda rw console=ttyS0" \
>           -cpu host \
>           -m $memory \
>           -enable-kvm \
>           -device virtio-net-pci,netdev=net0,mac=$mac \
>           -netdev af-xdp,ifname=nk,id=net0,mode=native,queues=1,start-queue=1,inhibit=on,map-path=$dir/xsks_map \
>           -nographic

So AFAICT, this example relies on the control plane installing an XDP
program on the physical NIC which will redirect into the right socket.
And since qemu will install the XSK at index 1 in the xsk map in this
example, that XDP program will also need to be aware of the queue index
mapping. I can see from your qemu commit[0] that there's support on the
qemu side for specifying an offset into the map to avoid having to do
this translation in the XDP program, but at the very least that makes
this example incomplete, no?

However, even with a complete example, this breaks isolation in the
sense that the entire XSK map is visible inside the pod, so a
misbehaving qemu could interfere with traffic on other queues (by
clearing the map, say). Which seems less than ideal?

Taking a step back, for AF_XDP we already support decoupling the
application-side access to the redirected packets from the interface,
through the use of sockets. This means your use case here could just
as well be served by the control plane setting up AF_XDP socket(s) on
the physical NIC and passing those into qemu, in which case we don't
need this whole queue proxying dance at all.
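
To make the "passing those into qemu" part concrete, here is a minimal
sketch of the descriptor handoff itself, using SCM_RIGHTS over a Unix
socket. Everything in it is illustrative: the helper names are made up,
and a pipe stands in for the AF_XDP socket (creating a real one needs
privileges and a NIC queue). On the receiving side, I believe qemu's
af-xdp netdev can take inherited descriptors via a sock-fds option, but
treat that as an assumption and check the qemu docs.

```python
import os
import socket


def send_fd(chan: socket.socket, fd: int) -> None:
    """Hand one open descriptor to the peer as SCM_RIGHTS ancillary data."""
    # At least one byte of ordinary payload must accompany the descriptor.
    socket.send_fds(chan, [b"x"], [fd])


def recv_fd(chan: socket.socket) -> int:
    """Receive one descriptor; the kernel installs it under a new number."""
    _msg, fds, _flags, _addr = socket.recv_fds(chan, 1, 1)
    return fds[0]


def demo() -> bytes:
    # Stand-in for the AF_XDP socket the control plane would have bound
    # to a queue on the physical NIC (assumption: a pipe is used here
    # only so the sketch runs unprivileged).
    rd, wr = os.pipe()
    control_plane, workload = socket.socketpair(
        socket.AF_UNIX, socket.SOCK_STREAM
    )
    try:
        send_fd(control_plane, wr)
        os.close(wr)  # sender can drop its copy once it has been passed
        received = recv_fd(workload)
        # The receiver never opened the object itself, yet can use it.
        os.write(received, b"frame")
        os.close(received)
        return os.read(rd, 16)
    finally:
        os.close(rd)
        control_plane.close()
        workload.close()


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that the receiver gets access to one
already-configured object and nothing else: it never sees the interface
or the xsk map that the sender used to set things up.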

So, erm, what am I missing that makes this worth it (for AF_XDP; I can
see how it is useful for other things)? :)

-Toke

[0] https://gitlab.com/qemu-project/qemu/-/commit/e53d9ec7ccc2dbb9378353fe2a89ebdca5cd7015

