Message-ID: <1acb76c8-469b-9e45-90c7-e8c29c8ea6ff@gmail.com>
Date: Thu, 8 Oct 2020 14:58:54 -0700
From: David Ahern <dsahern@...il.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>,
daniel@...earbox.net, ast@...com
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next] bpf_fib_lookup: return target ifindex even if
neighbour lookup fails
On 10/8/20 1:57 PM, Toke Høiland-Jørgensen wrote:
> David Ahern <dsahern@...il.com> writes:
>
>> On 10/8/20 7:53 AM, Toke Høiland-Jørgensen wrote:
>>> The bpf_fib_lookup() helper performs a neighbour lookup for the destination
>>> IP and returns BPF_FIB_LKUP_RET_NO_NEIGH if this fails, with the expectation
>>> that the BPF program will pass the packet up the stack in this case.
>>> However, with the addition of bpf_redirect_neigh(), that helper can be used
>>> instead to perform the neighbour lookup.
>>>
>>> For that, we still need the target ifindex, and since
>>> bpf_fib_lookup() already has that at the time it performs the neighbour
>>> lookup, there is really no reason why it can't just return it in any case.
>>> With this fix, a BPF program can do the following to perform a redirect
>>> based on the routing table that will succeed even if there is no neighbour
>>> entry:
>>>
>>>     ret = bpf_fib_lookup(skb, &fib_params, sizeof(fib_params), 0);
>>>     if (ret == BPF_FIB_LKUP_RET_SUCCESS) {
>>>         __builtin_memcpy(eth->h_dest, fib_params.dmac, ETH_ALEN);
>>>         __builtin_memcpy(eth->h_source, fib_params.smac, ETH_ALEN);
>>>
>>>         return bpf_redirect(fib_params.ifindex, 0);
>>>     } else if (ret == BPF_FIB_LKUP_RET_NO_NEIGH) {
>>>         return bpf_redirect_neigh(fib_params.ifindex, 0);
>>>     }
>>>
>>
>> This program flow makes a lot of assumptions and does redundant work.
>> fib_lookup is generic and allows the caller to control the input
>> parameters. redirect_neigh does its own fib lookup based on network
>> header data from the skb.
>>
>> I am fine with the patch, but users need to be aware of the subtle details.
>
> Yeah, I'm aware they are not equivalent; the code above was just meant
> as a minimal example motivating the patch. If you think it's likely to
> confuse people to have this example in the commit message, I can remove
> it?
>
I would remove it. Any samples or tests in the kernel repo that call those
two helpers back-to-back should carry a caveat.
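
As a rough illustration of the kind of caveat meant here, the decision can
be modelled in plain user-space C. Everything in this sketch is a local
stand-in and an assumption: the enums, choose_action() and the
params_from_headers flag are hypothetical names, not kernel definitions,
and falling back to the stack when the lookup inputs were overridden is
only one possible policy.

```c
#include <stdbool.h>

/* Local stand-ins for illustration; these are NOT the kernel's definitions. */
enum lookup_result { LOOKUP_SUCCESS, LOOKUP_NO_NEIGH, LOOKUP_OTHER };
enum action { ACT_REDIRECT, ACT_REDIRECT_NEIGH, ACT_PASS_TO_STACK };

/*
 * params_from_headers is true only if the caller filled the lookup
 * parameters straight from the packet's network header. If the caller
 * overrode any input, a second lookup driven by the header data may pick
 * a different route, so this sketch does not use the neigh redirect then.
 */
static enum action choose_action(enum lookup_result ret,
				 bool params_from_headers)
{
	if (ret == LOOKUP_SUCCESS)
		return ACT_REDIRECT;		/* MACs resolved, plain redirect */
	if (ret == LOOKUP_NO_NEIGH && params_from_headers)
		return ACT_REDIRECT_NEIGH;	/* both lookups see the same inputs */
	return ACT_PASS_TO_STACK;		/* anything else goes up the stack */
}
```

The point of the extra flag is that the two lookups only agree when they
are fed the same inputs; a sample that skips that check should say so.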