Message-ID: <f5897f68-43fe-8ba9-2ef9-05556eb43bff@lab.ntt.co.jp>
Date: Tue, 24 Jul 2018 11:11:23 +0900
From: Toshiaki Makita <makita.toshiaki@....ntt.co.jp>
To: Jakub Kicinski <jakub.kicinski@...ronome.com>,
Toshiaki Makita <toshiaki.makita1@...il.com>
Cc: netdev@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <brouer@...hat.com>,
tariqt@...lanox.com
Subject: Re: [PATCH v3 bpf-next 5/8] veth: Add ndo_xdp_xmit
On 2018/07/24 10:02, Jakub Kicinski wrote:
> On Mon, 23 Jul 2018 00:13:05 +0900, Toshiaki Makita wrote:
>> From: Toshiaki Makita <makita.toshiaki@....ntt.co.jp>
>>
>> This allows NIC's XDP to redirect packets to veth. The destination veth
>> device enqueues redirected packets to the napi ring of its peer, then
>> they are processed by XDP on its peer veth device.
>> This can be thought of as one XDP program calling another XDP program via
>> REDIRECT, when the peer enables driver XDP.
>>
>> Note that when the peer veth device does not set driver XDP, redirected
>> packets will be dropped because the peer is not ready for NAPI.
>
> Often we can't redirect to devices which don't have an XDP program
> installed. In your case we can't redirect unless the peer of the
> target has a program installed? :(

Right. I tried to avoid this case by converting xdp_frames to skbs, but
realized that should not be done.
https://patchwork.ozlabs.org/patch/903536/

> Perhaps it is time to reconsider what Saeed once asked for, a flag or
> attribute to enable being the destination of an XDP_REDIRECT.

Yes, something will be necessary. Jesper said Tariq had some ideas to
implement it.

>
>> v2:
>> - Drop the part converting xdp_frame into skb when XDP is not enabled.
>> - Implement bulk interface of ndo_xdp_xmit.
>> - Implement XDP_XMIT_FLUSH bit and drop ndo_xdp_flush.
>>
>> Signed-off-by: Toshiaki Makita <makita.toshiaki@....ntt.co.jp>
>> ---
>> drivers/net/veth.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 45 insertions(+)
>>
>> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
>> index 4be75c58bc6a..57187e955fea 100644
>> --- a/drivers/net/veth.c
>> +++ b/drivers/net/veth.c
>> @@ -17,6 +17,7 @@
>> #include <net/rtnetlink.h>
>> #include <net/dst.h>
>> #include <net/xfrm.h>
>> +#include <net/xdp.h>
>> #include <linux/veth.h>
>> #include <linux/module.h>
>> #include <linux/bpf.h>
>> @@ -125,6 +126,11 @@ static void *veth_ptr_to_xdp(void *ptr)
>> return (void *)((unsigned long)ptr & ~VETH_XDP_FLAG);
>> }
>>
>> +static void *veth_xdp_to_ptr(void *ptr)
>> +{
>> + return (void *)((unsigned long)ptr | VETH_XDP_FLAG);
>> +}
>> +
>> static void veth_ptr_free(void *ptr)
>> {
>> if (veth_is_xdp_frame(ptr))
>> @@ -267,6 +273,44 @@ static struct sk_buff *veth_build_skb(void *head, int headroom, int len,
>> return skb;
>> }
>>
>> +static int veth_xdp_xmit(struct net_device *dev, int n,
>> + struct xdp_frame **frames, u32 flags)
>> +{
>> + struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
>> + struct net_device *rcv;
>> + int i, drops = 0;
>> +
>> + if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
>> + return -EINVAL;
>> +
>> + rcv = rcu_dereference(priv->peer);
>> + if (unlikely(!rcv))
>> + return -ENXIO;
>> +
>> + rcv_priv = netdev_priv(rcv);
>> + /* xdp_ring is initialized on receive side? */
>> + if (!rcu_access_pointer(rcv_priv->xdp_prog))
>> + return -ENXIO;
>> +
>> + spin_lock(&rcv_priv->xdp_ring.producer_lock);
>> + for (i = 0; i < n; i++) {
>> + struct xdp_frame *frame = frames[i];
>> + void *ptr = veth_xdp_to_ptr(frame);
>> +
>> + if (unlikely(xdp_ok_fwd_dev(rcv, frame->len) ||
>> + __ptr_ring_produce(&rcv_priv->xdp_ring, ptr))) {
>
> Would you mind sparing a few more words on how this is safe vs the
> .ndo_close() on the peer? Personally I'm a bit uncomfortable with the
> IFF_UP check in xdp_ok_fwd_dev(). I'm not sure what's supposed to
> guarantee the device doesn't go down right after that check, or is
> already down, but netdev->flags are not atomic...
>
>> + xdp_return_frame_rx_napi(frame);
>> + drops++;
>> + }
>> + }
>> + spin_unlock(&rcv_priv->xdp_ring.producer_lock);
>> +
>> + if (flags & XDP_XMIT_FLUSH)
>> + __veth_xdp_flush(rcv_priv);
>> +
>> + return n - drops;
>> +}
>> +
>> static struct sk_buff *veth_xdp_rcv_one(struct veth_priv *priv,
>> struct xdp_frame *frame)
>> {
>> @@ -760,6 +804,7 @@ static const struct net_device_ops veth_netdev_ops = {
>> .ndo_features_check = passthru_features_check,
>> .ndo_set_rx_headroom = veth_set_rx_headroom,
>> .ndo_bpf = veth_xdp,
>> + .ndo_xdp_xmit = veth_xdp_xmit,
>> };
>>
>> #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | \
>
>
>
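
For readers following the patch, the two mechanisms in veth_xdp_xmit()
(tagging an xdp_frame pointer with VETH_XDP_FLAG before it goes into the
ring, and returning n - drops for a bulk of frames) can be sketched in
userspace roughly as below. This is only an illustration under stated
assumptions: every sketch_* name, the ring layout, the ring size and the
plain length check are hypothetical stand-ins, not the kernel's ptr_ring
or xdp_ok_fwd_dev(). It also omits locking, RCU and the
xdp_return_frame_rx_napi() call made for each dropped frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
#define SKETCH_XDP_FLAG   ((uintptr_t)1) /* low bit marks an xdp_frame entry */
#define SKETCH_RING_SIZE  4              /* assumed tiny ring capacity */
#define SKETCH_MAX_LEN    1500U          /* assumed forwarding size limit */

struct sketch_frame {
	unsigned int len;
};

struct sketch_ring {
	void *slot[SKETCH_RING_SIZE];
	size_t count;
};

/* Tag a frame pointer; relies on the pointer being at least 2-byte aligned. */
static void *sketch_xdp_to_ptr(struct sketch_frame *frame)
{
	assert(((uintptr_t)frame & SKETCH_XDP_FLAG) == 0);
	return (void *)((uintptr_t)frame | SKETCH_XDP_FLAG);
}

/* True if the ring entry carries the frame tag. */
static int sketch_is_xdp_frame(void *ptr)
{
	return ((uintptr_t)ptr & SKETCH_XDP_FLAG) != 0;
}

/* Strip the tag to recover the original frame pointer. */
static struct sketch_frame *sketch_ptr_to_xdp(void *ptr)
{
	return (struct sketch_frame *)((uintptr_t)ptr & ~SKETCH_XDP_FLAG);
}

/* Returns 0 on success and nonzero when the ring is full. */
static int sketch_ring_produce(struct sketch_ring *ring, void *ptr)
{
	if (ring->count >= SKETCH_RING_SIZE)
		return -1;
	ring->slot[ring->count++] = ptr;
	return 0;
}

/*
 * Bulk transmit: try to enqueue n frames and count the failures.
 * A frame is dropped if it is too long or if the ring is full.
 * Returns the number of frames actually enqueued (n - drops).
 */
static int sketch_xmit(struct sketch_ring *ring, int n,
		       struct sketch_frame **frames)
{
	int i, drops = 0;

	for (i = 0; i < n; i++) {
		struct sketch_frame *frame = frames[i];

		if (frame->len > SKETCH_MAX_LEN ||
		    sketch_ring_produce(ring, sketch_xdp_to_ptr(frame)))
			drops++;
	}

	return n - drops;
}
```

The consumer side would check each ring entry with
sketch_is_xdp_frame() and strip the tag with sketch_ptr_to_xdp() before
touching the frame, which is how a single ring can carry both tagged
frames and untagged pointers.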
--
Toshiaki Makita