Message-ID: <Y5t5MZH1UwfLqhNC@C02F109XMD6R.local>
Date: Thu, 15 Dec 2022 13:50:15 -0600
From: Alex Forster <aforster@...udflare.com>
To: Magnus Karlsson <magnus.karlsson@...il.com>
Cc: Shawn Bohrer <sbohrer@...udflare.com>, netdev@...r.kernel.org,
bpf@...r.kernel.org, bjorn@...nel.org, magnus.karlsson@...el.com,
kernel-team@...udflare.com
Subject: Re: Possible race with xsk_flush
Hi Magnus,
> Could you please share how you set up the two AF_XDP sockets?
Our architecture is pretty unique:
outside of │ inside of
 namespace │ namespace
           │
 ┌───────┐ │ ┌───────┐
 │ outer │ │ │ inner │
 │ veth  │ │ │ veth  │
 └──┬─▲──┘ │ └──┬─▲──┘
    │ │    │    │ │
 ┌──▼─┴────┴────▼─┴──┐
 │    shared umem    │
 └───────────────────┘
The goal is to position ourselves in the middle of a veth pair so that
we can perform bidirectional traffic inspection and manipulation. To do
this, we attach AF_XDP to both veth interfaces and share a umem between
them. This allows us to forward packets between the veth interfaces
without copying in userspace.
These interfaces are both multi-queue, with AF_XDP sockets attached to
each queue. The queues are each managed on their own (unpinned) threads
and have their own rx/tx/fill/completion rings. We also enable
threaded NAPI on both of these interfaces, which may or may not be an
important detail to note, since the problem appears much harder (though
not impossible) to reproduce with threaded NAPI enabled.
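
For reference, threaded NAPI is toggled per device through sysfs. The following is only a minimal sketch, assuming a kernel that exposes the `threaded` attribute under /sys/class/net/<dev>/ (Linux 5.12 or later, to my knowledge). The function name `set_threaded_napi` and its sysfs-root parameter are hypothetical; the parameter exists only so the helper can be tried against a scratch directory.

```shell
# Hypothetical helper: write 0 or 1 to a device's "threaded" sysfs attribute.
#   $1 = sysfs root (normally /sys), $2 = interface name, $3 = 0 or 1
set_threaded_napi() {
    local sysfs_root=$1 dev=$2 value=$3
    local knob="$sysfs_root/class/net/$dev/threaded"

    # Only 0 (softirq NAPI) and 1 (threaded NAPI) are accepted here.
    case $value in
        0|1) ;;
        *) echo "value must be 0 or 1, got '$value'" >&2; return 2 ;;
    esac

    # The attribute is assumed to be absent on kernels without threaded NAPI.
    if [ ! -e "$knob" ]; then
        echo "$knob not found" >&2
        return 1
    fi

    printf '%s\n' "$value" > "$knob"
}
```

As root, `set_threaded_napi /sys outer-veth 1` would enable it on the outer interface. For the interface inside the namespace, the same write would have to happen under `ip netns exec`, since /sys/class/net only lists the devices of the current namespace.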
Here’s a script that configures a namespace and veth pair that closely
resembles production, except for enabling threaded NAPI:
```
#!/bin/bash
set -e -u -x -o pipefail

# One queue per CPU unless QUEUES is already set in the environment.
QUEUES=${QUEUES:=$(($(grep -c ^processor /proc/cpuinfo)))}

OUTER_CUSTOMER_VETH=${OUTER_CUSTOMER_VETH:=outer-veth}
INNER_CUSTOMER_VETH=${INNER_CUSTOMER_VETH:=inner-veth}
CUSTOMER_NAMESPACE=${CUSTOMER_NAMESPACE:=customer-namespace}

# Create the namespace and bring up its loopback interface.
ip netns add $CUSTOMER_NAMESPACE
ip netns exec $CUSTOMER_NAMESPACE bash <<EOF
  set -e -u -x -o pipefail
  ip addr add 127.0.0.1/8 dev lo
  ip link set dev lo up
EOF

# Create the multi-queue veth pair, with the peer inside the namespace.
ip link add \
  name $OUTER_CUSTOMER_VETH \
  numrxqueues $QUEUES numtxqueues $QUEUES type veth \
  peer name $INNER_CUSTOMER_VETH netns $CUSTOMER_NAMESPACE \
  numrxqueues $QUEUES numtxqueues $QUEUES

# Disable offloads on the outer veth, then bring it up and address it.
ethtool -K $OUTER_CUSTOMER_VETH \
  gro off gso off tso off tx off rxvlan off txvlan off
ip link set dev $OUTER_CUSTOMER_VETH up
ip addr add 169.254.10.1/30 dev $OUTER_CUSTOMER_VETH

# Same for the inner veth, from inside the namespace.
ip netns exec $CUSTOMER_NAMESPACE bash <<EOF
  set -e -u -x -o pipefail
  ethtool -K $INNER_CUSTOMER_VETH \
    gro off gso off tso off tx off rxvlan off txvlan off
  ip link set dev $INNER_CUSTOMER_VETH up
  ip addr add 169.254.10.2/30 dev $INNER_CUSTOMER_VETH
EOF
```
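
Since each queue gets its own AF_XDP socket, it can be worth confirming that the veth pair really came up with the requested number of queues. A rough sketch, assuming the usual sysfs layout in which every queue appears as an `rx-N` or `tx-N` entry under /sys/class/net/<dev>/queues/. The helper name `count_queues` and its sysfs-root parameter are hypothetical and not part of our tooling.

```shell
# Hypothetical helper: count the rx or tx queues a device exposes in sysfs.
#   $1 = sysfs root (normally /sys), $2 = interface name, $3 = rx or tx
count_queues() {
    local sysfs_root=$1 dev=$2 kind=$3
    local dir="$sysfs_root/class/net/$dev/queues"
    local count=0 entry

    if [ ! -d "$dir" ]; then
        echo "$dir not found" >&2
        return 1
    fi

    for entry in "$dir/$kind"-*; do
        # An unmatched glob stays literal, so only count entries that exist.
        if [ -e "$entry" ]; then
            count=$((count + 1))
        fi
    done

    echo "$count"
}
```

After the script above has run, `count_queues /sys outer-veth rx` and `count_queues /sys outer-veth tx` would be expected to print the value of $QUEUES, though this has not been checked across kernel versions.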
> Are you using XDP_DRV mode in your tests?
Yes.