Date: Thu, 15 Dec 2022 13:50:15 -0600
From: Alex Forster <aforster@...udflare.com>
To: Magnus Karlsson <magnus.karlsson@...il.com>
Cc: Shawn Bohrer <sbohrer@...udflare.com>, netdev@...r.kernel.org,
	bpf@...r.kernel.org, bjorn@...nel.org, magnus.karlsson@...el.com,
	kernel-team@...udflare.com
Subject: Re: Possible race with xsk_flush

Hi Magnus,

> Could you please share how you set up the two AF_XDP sockets?

Our architecture is pretty unique:

   outside of │ inside of
    namespace │ namespace
              │
    ┌───────┐ │ ┌───────┐
    │ outer │ │ │ inner │
    │ veth  │ │ │ veth  │
    └──┬─▲──┘ │ └──┬─▲──┘
       │ │    │    │ │
    ┌──▼─┴────┴────▼─┴──┐
    │    shared umem    │
    └───────────────────┘

The goal is to position ourselves in the middle of a veth pair so that
we can perform bidirectional traffic inspection and manipulation. To do
this, we attach AF_XDP to both veth interfaces and share a umem between
them. This allows us to forward packets between the veth interfaces
without copying in userspace.

These interfaces are both multi-queue, with AF_XDP sockets attached to
each queue. The queues are each managed on their own (unpinned) threads
and have their own rx/tx/fill/completion rings.

We also enable threaded NAPI on both of these interfaces, which may or
may not be an important detail to note, since the problem appears much
harder (though not impossible) to reproduce with threaded NAPI enabled.

Here’s a script that configures a namespace and veth pair that closely
resembles production, except for enabling threaded NAPI:

```
#!/bin/bash
set -e -u -x -o pipefail

QUEUES=${QUEUES:=$(($(grep -c ^processor /proc/cpuinfo)))}

OUTER_CUSTOMER_VETH=${OUTER_CUSTOMER_VETH:=outer-veth}
INNER_CUSTOMER_VETH=${INNER_CUSTOMER_VETH:=inner-veth}
CUSTOMER_NAMESPACE=${CUSTOMER_NAMESPACE:=customer-namespace}

ip netns add $CUSTOMER_NAMESPACE

ip netns exec $CUSTOMER_NAMESPACE bash <<EOF
set -e -u -x -o pipefail
ip addr add 127.0.0.1/8 dev lo
ip link set dev lo up
EOF

ip link add \
    name $OUTER_CUSTOMER_VETH \
    numrxqueues $QUEUES numtxqueues $QUEUES type veth \
    peer name $INNER_CUSTOMER_VETH netns $CUSTOMER_NAMESPACE \
    numrxqueues $QUEUES numtxqueues $QUEUES

ethtool -K $OUTER_CUSTOMER_VETH \
    gro off gso off tso off tx off rxvlan off txvlan off
ip link set dev $OUTER_CUSTOMER_VETH up
ip addr add 169.254.10.1/30 dev $OUTER_CUSTOMER_VETH

ip netns exec $CUSTOMER_NAMESPACE bash <<EOF
set -e -u -x -o pipefail
ethtool -K $INNER_CUSTOMER_VETH \
    gro off gso off tso off tx off rxvlan off txvlan off
ip link set dev $INNER_CUSTOMER_VETH up
ip addr add 169.254.10.2/30 dev $INNER_CUSTOMER_VETH
EOF
```

> Are you using XDP_DRV mode in your tests?

Yes.
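For reference, the userspace side of that setup looks roughly like the
sketch below: one umem registered once, then one AF_XDP socket per veth
interface created against it, each with its own rx/tx/fill/completion
rings. This is a minimal single-queue sketch using the stock
xsk_umem__create()/xsk_socket__create_shared() helpers from libxdp, not
our production code; the interface names, frame count, and the omitted
error handling and data path are placeholders.

```
/* Sketch only: two AF_XDP sockets (one per veth) sharing a single umem,
 * bound in native/driver mode. Names and sizes are illustrative. */
#include <stdlib.h>
#include <unistd.h>
#include <linux/if_link.h>   /* XDP_FLAGS_DRV_MODE */
#include <linux/if_xdp.h>    /* XDP_USE_NEED_WAKEUP */
#include <xdp/xsk.h>

#define NUM_FRAMES  4096
#define FRAME_SIZE  XSK_UMEM__DEFAULT_FRAME_SIZE

struct xsk_sock {
	struct xsk_socket *xsk;
	struct xsk_ring_cons rx;
	struct xsk_ring_prod tx;
	struct xsk_ring_prod fq;   /* per-socket fill ring */
	struct xsk_ring_cons cq;   /* per-socket completion ring */
};

int main(void)
{
	void *bufs;
	struct xsk_umem *umem;
	struct xsk_sock outer = {0}, inner = {0};
	const struct xsk_socket_config cfg = {
		.rx_size    = XSK_RING_CONS__DEFAULT_NUM_DESCS,
		.tx_size    = XSK_RING_PROD__DEFAULT_NUM_DESCS,
		.xdp_flags  = XDP_FLAGS_DRV_MODE,     /* native XDP on veth */
		.bind_flags = XDP_USE_NEED_WAKEUP,
	};

	/* One packet buffer area backs both sockets. */
	if (posix_memalign(&bufs, getpagesize(), NUM_FRAMES * FRAME_SIZE))
		return 1;

	/* Register the umem once; the fill/completion rings passed here
	 * end up belonging to the first socket created on it. */
	if (xsk_umem__create(&umem, bufs, NUM_FRAMES * FRAME_SIZE,
			     &outer.fq, &outer.cq, NULL))
		return 1;

	/* First socket: outer veth, queue 0, reuses the rings above. */
	if (xsk_socket__create_shared(&outer.xsk, "outer-veth", 0, umem,
				      &outer.rx, &outer.tx,
				      &outer.fq, &outer.cq, &cfg))
		return 1;

	/* Second socket: inner veth, queue 0, same umem but its own
	 * fill/completion rings (XDP_SHARED_UMEM under the hood). */
	if (xsk_socket__create_shared(&inner.xsk, "inner-veth", 0, umem,
				      &inner.rx, &inner.tx,
				      &inner.fq, &inner.cq, &cfg))
		return 1;

	/* ... populate the fill rings, poll, and forward frames between
	 * the two sockets by moving descriptors (no userspace copies) ... */

	xsk_socket__delete(inner.xsk);
	xsk_socket__delete(outer.xsk);
	xsk_umem__delete(umem);
	free(bufs);
	return 0;
}
```

Because every socket brought up this way carries its own fill and
completion rings, each queue's socket pair can be serviced by its own
thread without sharing ring state, which is what the per-queue threading
described above relies on.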