[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6b3516da-0ba5-0bbf-8de1-e1232457a5aa@gmail.com>
Date: Mon, 2 Aug 2021 21:51:29 -0600
From: David Ahern <dsahern@...il.com>
To: Lahav Schlesinger <lschlesinger@...venets.com>
Cc: netdev@...r.kernel.org, davem@...emloft.net, kuba@...nel.org,
dsahern@...nel.org
Subject: Re: [PATCH] neigh: Support filtering neighbours for L3 slave
On 8/2/21 2:23 AM, Lahav Schlesinger wrote:
> On Sun, Aug 01, 2021 at 11:50:16AM -0600, David Ahern wrote:
>> On 8/1/21 3:01 AM, Lahav Schlesinger wrote:
>>> Currently there's support for filtering neighbours for interfaces which
>>> are in a specific VRF (passing the VRF interface in 'NDA_MASTER'), but
>>> there's not support for filtering interfaces which are not in an L3
>>> domain (the "default VRF").
>>>
>>> This means userspace is unable to show/flush neighbours in the default VRF
>>> (in contrast to a "real" VRF - Using "ip neigh show vrf <vrf_dev>").
>>>
>>> Therefore for userspace to be able to do so, it must manually iterate
>>> over all the interfaces, check each one if it's in the default VRF, and
>>> if so send the matching flush/show message.
>>>
>>> This patch adds the ability to do so easily, by passing a dummy value as
>>> the 'NDA_MASTER' ('NDV_NOT_L3_SLAVE').
>>> Note that 'NDV_NOT_L3_SLAVE' is a negative number, meaning it is not a valid
>>> ifindex, so it doesn't break existing programs.
>>>
>>> I have a patch for iproute2 ready for adding this support in userspace.
>>>
>>> Signed-off-by: Lahav Schlesinger <lschlesinger@...venets.com>
>>> Cc: David S. Miller <davem@...emloft.net>
>>> Cc: Jakub Kicinski <kuba@...nel.org>
>>> Cc: David Ahern <dsahern@...nel.org>
>>> ---
>>> include/uapi/linux/neighbour.h | 2 ++
>>> net/core/neighbour.c | 3 +++
>>> 2 files changed, 5 insertions(+)
>>>
>>> diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
>>> index dc8b72201f6c..d4f4c2189c63 100644
>>> --- a/include/uapi/linux/neighbour.h
>>> +++ b/include/uapi/linux/neighbour.h
>>> @@ -196,4 +196,6 @@ enum {
>>> };
>>> #define NFEA_MAX (__NFEA_MAX - 1)
>>>
>>> +#define NDV_NOT_L3_SLAVE (-10)
>>> +
>>> #endif
>>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>>> index 53e85c70c6e5..b280103b6806 100644
>>> --- a/net/core/neighbour.c
>>> +++ b/net/core/neighbour.c
>>> @@ -2529,6 +2529,9 @@ static bool neigh_master_filtered(struct net_device *dev, int master_idx)
>>> {
>>> struct net_device *master;
>>>
>>> + if (master_idx == NDV_NOT_L3_SLAVE)
>>> + return netif_is_l3_slave(dev);
>>> +
>>> if (!master_idx)
>>> return false;
>>>
>>>
>>
>> you can not special case VRFs like this, and such a feature should apply
>> to links and addresses as well.
>
> Understandable, I'll change it.
> In this case though, how would you advice to efficiently filter
> neighbours for interfaces in the default VRF in userspace (without
> quering the master of every interface that is being dumped)?
> I reckoned that because there's support in iproute2 for filtering based
> on a specific VRF, filtering for the default VRF is a natural extension
iproute2 has support for a link database (ll_cache). You would basically
have to expand the cache to track any master device a link is associated
with and then fill the cache with a link dump first. It's expensive at
scale; the "no stats" filter helps a bit.
This is the reason for kernel side filtering on primary attributes
(coarse grain filtering at reasonable cost).
>
>>
>> One idea is to pass "*_MASTER" as -1 (use "none" keyword for iproute2)
>> and then update kernel side to only return entries if the corresponding
>> device is not enslaved to another device. Unfortunately since I did not
>> check that _MASTER was non-zero in the current code, we can not use 0 as
>> a valid flag for "not enslaved". Be sure to document why -1 is used.
>
> Do you mean the command will look like "ip link show master none"?
> If so, wouldn't this cause an ambiguity if an interface names "none" is present?
>
You could always detect "none" as a valid device name and error out or
use "nomaster" as a way to say "show all devices not enslaved to a
bridge or VRF.
Powered by blists - more mailing lists