Message-ID: <87d0678b8w.fsf@toke.dk>
Date: Wed, 10 Jun 2020 12:29:35 +0200
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Jesper Dangaard Brouer <brouer@...hat.com>,
Hangbin Liu <liuhangbin@...il.com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
Jiri Benc <jbenc@...hat.com>,
Eelco Chaudron <echaudro@...hat.com>, ast@...nel.org,
Daniel Borkmann <daniel@...earbox.net>,
Lorenzo Bianconi <lorenzo.bianconi@...hat.com>,
brouer@...hat.com
Subject: Re: [PATCHv4 bpf-next 1/2] xdp: add a new helper for dev map multicast support
Jesper Dangaard Brouer <brouer@...hat.com> writes:
> On Tue, 26 May 2020 22:05:38 +0800
> Hangbin Liu <liuhangbin@...il.com> wrote:
>
>> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
>> index a51d9fb7a359..ecc5c44a5bab 100644
>> --- a/kernel/bpf/devmap.c
>> +++ b/kernel/bpf/devmap.c
> [...]
>
>> +int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx,
>> + struct bpf_map *map, struct bpf_map *ex_map,
>> + bool exclude_ingress)
>> +{
>> + struct bpf_dtab_netdev *obj = NULL;
>> + struct xdp_frame *xdpf, *nxdpf;
>> + struct net_device *dev;
>> + bool first = true;
>> + u32 key, next_key;
>> + int err;
>> +
>> + devmap_get_next_key(map, NULL, &key);
>> +
>> + xdpf = convert_to_xdp_frame(xdp);
>> + if (unlikely(!xdpf))
>> + return -EOVERFLOW;
>> +
>> + for (;;) {
>> + switch (map->map_type) {
>> + case BPF_MAP_TYPE_DEVMAP:
>> + obj = __dev_map_lookup_elem(map, key);
>> + break;
>> + case BPF_MAP_TYPE_DEVMAP_HASH:
>> + obj = __dev_map_hash_lookup_elem(map, key);
>> + break;
>> + default:
>> + break;
>> + }
>> +
>> + if (!obj || dev_in_exclude_map(obj, ex_map,
>> + exclude_ingress ? dev_rx->ifindex : 0))
>> + goto find_next;
>> +
>> + dev = obj->dev;
>> +
>> + if (!dev->netdev_ops->ndo_xdp_xmit)
>> + goto find_next;
>> +
>> + err = xdp_ok_fwd_dev(dev, xdp->data_end - xdp->data);
>> + if (unlikely(err))
>> + goto find_next;
>> +
>> + if (!first) {
>> + nxdpf = xdpf_clone(xdpf);
>> + if (unlikely(!nxdpf))
>> + return -ENOMEM;
>> +
>> + bq_enqueue(dev, nxdpf, dev_rx);
>> + } else {
>> + bq_enqueue(dev, xdpf, dev_rx);
>
> This looks racy. You enqueue the original frame, and then later call
> xdpf_clone() on it. The original frame may already have been freed at
> that point.
This was actually my suggestion, on the assumption that bq_enqueue()
just puts the frame on a list that won't be flushed until we exit the
NAPI loop.

But now that you mention it, bq_enqueue() may indeed flush the queue
before that, so you're right that this won't work. Sorry about that,
Hangbin :/
Jesper, the reason I suggested this was to avoid an "extra" copy (i.e.,
if we have two destinations, ideally we should only clone once instead
of twice). Got any clever ideas for a safe way to achieve this? :)
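One (completely untested) idea: delay each enqueue by one iteration, so
clones always go out first and the original frame is only ever enqueued
to the *last* destination, after which we never touch it again.
Something like the sketch below, where last_dev is a local I'm making
up for illustration, not something from the patch:

	/* Untested sketch; last_dev is hypothetical. */
	struct net_device *last_dev = NULL;

	for (;;) {
		/* ... lookup of obj/dev and the exclude/fwd checks
		 * stay as they are in the patch ...
		 */

		/* We found another valid destination, so the previous
		 * one can safely get a clone; we hold on to the
		 * original for now.
		 */
		if (last_dev) {
			nxdpf = xdpf_clone(xdpf);
			if (unlikely(!nxdpf))
				return -ENOMEM;
			bq_enqueue(last_dev, nxdpf, dev_rx);
		}
		last_dev = dev;

		/* ... advance to next_key as before ... */
	}

	if (last_dev)
		/* The last destination gets the original frame; no
		 * clone is made after this enqueue, so no race.
		 */
		bq_enqueue(last_dev, xdpf, dev_rx);
	else
		/* No valid destinations at all; drop the frame. */
		xdp_return_frame_rx_napi(xdpf);

That would still only do n-1 clones for n destinations, but never
clones a frame after it has been handed to bq_enqueue().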
-Toke