Message-ID: <199349c3-fd14-1401-86d1-814776372917@gmail.com>
Date: Thu, 11 Oct 2018 12:44:49 -0600
From: David Ahern <dsahern@...il.com>
To: Jamal Hadi Salim <jhs@...atatu.com>,
Sowmini Varadhan <sowmini.varadhan@...cle.com>,
Stephen Hemminger <stephen@...workplumber.org>
Cc: David Ahern <dsahern@...nel.org>, netdev@...r.kernel.org,
davem@...emloft.net
Subject: Re: [PATCH net-next 0/9] net: Kernel side filtering for route dumps
On 10/11/18 12:05 PM, Jamal Hadi Salim wrote:
> On 2018-10-11 1:04 p.m., David Ahern wrote:
>
>> You can already filter link dumps by kind. How? By passing in the KIND
>> attribute on a dump request. This type of filtering exists for link
>> dumps, neighbor dumps, fdb dumps. Why is there a push to make route
>> dumps different? Why can't they be consistent and use existing semantics?
>
> I think you meant filtering by ifindex in neighbor.
I meant the general API of users passing filter arguments as attributes
to the dump (or values in the header) -- KIND, MASTER, device index,
etc. This is an existing API and existing capability.
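As a concrete sketch (illustrative only -- the helper name and the choice of
attribute are mine, not code from any patch set), this is roughly what
passing a filter argument as an attribute on a dump request looks like from
userspace, here asking for only the links enslaved to a given master device:

#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* RTM_GETLINK dump limited by a filter attribute (IFLA_MASTER here). */
static int send_link_dump(int fd, int master_ifindex)
{
	struct {
		struct nlmsghdr nlh;
		struct ifinfomsg ifm;
		char attrbuf[64];
	} req;
	struct rtattr *rta;

	memset(&req, 0, sizeof(req));
	req.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	req.nlh.nlmsg_type = RTM_GETLINK;
	req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req.ifm.ifi_family = AF_UNSPEC;

	/* the filter argument rides along as an attribute on the request */
	rta = (struct rtattr *)((char *)&req + NLMSG_ALIGN(req.nlh.nlmsg_len));
	rta->rta_type = IFLA_MASTER;
	rta->rta_len = RTA_LENGTH(sizeof(int));
	memcpy(RTA_DATA(rta), &master_ifindex, sizeof(int));
	req.nlh.nlmsg_len = NLMSG_ALIGN(req.nlh.nlmsg_len) + rta->rta_len;

	return send(fd, &req, req.nlh.nlmsg_len, 0);
}

The point is that the filter reaches the kernel before the walk starts, so
the dump side can avoid building messages the caller never asked for.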
> note: I would argue that there are already "adhoc" ways of filtering
> in place, mostly use-case driven. Otherwise Sowmini wouldn't have had to
> craft that bpf filter. There are netlink users with no filtering, or only
> some odd filtering, involved. There is no argument that your approach
> works for rtm. But the rest of the users missing filters will require
> similar kernel changes. Could this be made generic enough to benefit
> other netlink users?
> The problem is there's always one new attribute that would make sense
> for some use case which requires a kernel change ("send me an event only
> if you get link down" or "dump all ports with link down").
>
I disagree with your overall premise of bpf as the be-all, end-all hammer. It
is a tool, but not the only tool. For starters, you are proposing building the
message, running the filter on it, and potentially backing the message up to
drop the recently added piece because the filter does not want it included.
That still wastes a lot of cpu cycles to build and drop. I am thinking about
scaling to 1 million routes -- I do not need the dump loop building a message
for 1 million entries only to drop 99% of them. That is crazy.
The way the kernel manages route tables says I should pass in the table id,
as it is a major differentiator of what is returned. From there, look up the
specific table (I need to fix this part per my response to Andrew) and then
walk only that table. The existing semantics and capabilities that other dump
commands already have are the most efficient approach for some of these
high-level, big-hammer filters.
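To sketch what I mean (again illustrative only, not the patch set itself):
the dump request carries the table id, and the kernel can then look up that
one table and walk only it:

#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* RTM_GETROUTE dump restricted to one table; RTA_TABLE covers ids > 255. */
static int send_route_dump(int fd, unsigned int table_id)
{
	struct {
		struct nlmsghdr nlh;
		struct rtmsg rtm;
		char attrbuf[64];
	} req;
	struct rtattr *rta;

	memset(&req, 0, sizeof(req));
	req.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
	req.nlh.nlmsg_type = RTM_GETROUTE;
	req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req.rtm.rtm_family = AF_INET;
	req.rtm.rtm_table = table_id < 256 ? table_id : RT_TABLE_UNSPEC;

	/* full 32-bit table id as an attribute */
	rta = (struct rtattr *)((char *)&req + NLMSG_ALIGN(req.nlh.nlmsg_len));
	rta->rta_type = RTA_TABLE;
	rta->rta_len = RTA_LENGTH(sizeof(table_id));
	memcpy(RTA_DATA(rta), &table_id, sizeof(table_id));
	req.nlh.nlmsg_len = NLMSG_ALIGN(req.nlh.nlmsg_len) + rta->rta_len;

	return send(fd, &req, req.nlh.nlmsg_len, 0);
}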
What you want gets into the tiniest of details, and yes, the imagination can
go wild with combinations of filter options. So maybe this scanning of
post-built messages is reasonable *after* the high-level sorting is done.