[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200506093557.GB102436@dhcp-12-153.nay.redhat.com>
Date: Wed, 6 May 2020 17:35:57 +0800
From: Hangbin Liu <liuhangbin@...il.com>
To: Lorenzo Bianconi <lorenzo.bianconi@...hat.com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
Toke Høiland-Jørgensen <toke@...hat.com>,
Jiri Benc <jbenc@...hat.com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Eelco Chaudron <echaudro@...hat.com>, ast@...nel.org,
Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [RFC PATCHv2 bpf-next 1/2] xdp: add a new helper for dev map
multicast support
Hi Lorenzo,
Thanks for the comments, please see replies below.
On Fri, Apr 24, 2020 at 04:19:08PM +0200, Lorenzo Bianconi wrote:
> > +bool dev_in_exclude_map(struct bpf_dtab_netdev *obj, struct bpf_map *map,
> > + int exclude_ifindex)
> > +{
> > + struct bpf_dtab_netdev *in_obj = NULL;
> > + u32 key, next_key;
> > + int err;
> > +
> > + if (!map)
> > + return false;
>
> doing so it seems mandatory to define an exclude_map even if we want just to do
> not forward the packet to the "ingress" interface.
> Moreover I was thinking that we can assume to never forward to in the incoming
> interface. Doing so the code would be simpler I guess. Is there a use case for
> it? (forward even to the ingress interface)
Eelco has help answered one use case: VEPA. Another reason I added this flag
is that the other syscalls like bpf_redirect() or bpf_redirect_map() are
also able to forward to ingress interface. So we need to behave the same
by default.
>
> > +int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx,
> > + struct bpf_map *map, struct bpf_map *ex_map,
> > + bool exclude_ingress)
> > +{
[...]
> > + }
>
> Do we need to free 'incoming' xdp buffer here? I think most of the drivers assume
> the packet is owned by the stack if xdp_do_redirect returns 0
Yes, we need. I will fix it.
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 7d6ceaa54d21..94d1530e5ac6 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -3473,12 +3473,17 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = {
> > };
> >
> > static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd,
> > - struct bpf_map *map, struct xdp_buff *xdp)
> > + struct bpf_map *map, struct xdp_buff *xdp,
> > + struct bpf_map *ex_map, bool exclude_ingress)
> > {
> > switch (map->map_type) {
> > case BPF_MAP_TYPE_DEVMAP:
> > case BPF_MAP_TYPE_DEVMAP_HASH:
> > - return dev_map_enqueue(fwd, xdp, dev_rx);
> > + if (fwd)
> > + return dev_map_enqueue(fwd, xdp, dev_rx);
> > + else
> > + return dev_map_enqueue_multi(xdp, dev_rx, map, ex_map,
> > + exclude_ingress);
>
> I guess it would be better to do not make it the default case. Maybe you can
> add a bit in flags to mark it for "multicast"
But how do we distinguish the flag bit with other syscalls? e.g. If we define
0x02 as the "do_multicast" flag. What if other syscalls also used this flag.
Currently __bpf_tx_xdp_map() is only called by xdp_do_redirect(). If there
is a map and no fwd, it must be multicast forward. So we are still safe now.
Maybe we need an update in future.
Thanks
Hangbin
Powered by blists - more mailing lists