Message-ID: <CAM0EoM=D5YmMTrmNORcVCbAMVFcq=id_v+FSv-UqR1rT2B0xjg@mail.gmail.com>
Date: Sun, 2 Oct 2022 09:59:37 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Nikolay Aleksandrov <razor@...ckwall.org>
Cc: Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
davem@...emloft.net, edumazet@...gle.com, pabeni@...hat.com,
Johannes Berg <johannes@...solutions.net>,
Pablo Neira Ayuso <pablo@...filter.org>,
Florian Westphal <fw@...len.de>,
Jacob Keller <jacob.e.keller@...el.com>,
Florent Fourcot <florent.fourcot@...irst.fr>,
Guillaume Nault <gnault@...hat.com>,
Nicolas Dichtel <nicolas.dichtel@...nd.com>,
Hangbin Liu <liuhangbin@...il.com>
Subject: Re: [PATCH net-next] docs: netlink: clarify the historical baggage of
Netlink flags
On Fri, Sep 30, 2022 at 2:19 PM Nikolay Aleksandrov <razor@...ckwall.org> wrote:
>
> On 30/09/2022 19:36, Jamal Hadi Salim wrote:
> > On Fri, Sep 30, 2022 at 10:34 AM Nikolay Aleksandrov
> > <razor@...ckwall.org> wrote:
[..]
> > You only have one object type per netlink request though, i.e. you
> > don't have fdb and mdb objects in the same message?
> >
>
> Yep, it is object-type and family- specific, as is the call itself.
>
Ok, so that makes it easier.
[..]
> > Isn't it sufficient to indicate what objects need to be deleted based on presence
> > of TLVs or the service header for that object?
> >
>
> That was my initial proposal for the fdbs. :) When flush attribute was present it would
> act on it (and filter based on embedded filters). The only non-intuitive part was that it
> happened through SETLINK (changelink), which is a bit strange for a delete op.
>
> >>> Really NLM_F_ROOT and _MATCH are sufficient. The filtering expression is
> >>> the challenge.
> >>
> >> NLM_F_ROOT isn't usable for a DEL expression because its bit is already used by NLM_F_NONREC
> >> and it wouldn't be nice to change meaning of the bit based on the subsystem. NLM_F_MATCH's bit
> >> actually matches NLM_F_BULK :)
> >>
> >
> > Ouch. Ok, it got messy over time I guess. We probably should have
> > spent more time discussing NLM_F_NONREC, since it has a single user
> > with a very specific need and it got imposed on all.
> > I get your point - I am still not sure if a global flag is the right answer.
> >
>
> Personally, I prefer the complete netlink approach (tlvs describing the operation and filters).
> In the end the flag was close enough, I kept all of the family specific code the same just the entry
> point was different and other families could use it as a modifier to their del commands.
>
BTW, it seems that nftables is an outlier. You should still be able to
use the NLM_F_ROOT name for DELETE.
act_api uses NLM_F_ROOT on delete to flush the whole table of actions. My
git-archaeology-foo says it has done so since 2005. NLM_F_NONREC was added
in 2017. So you really should just be able to use NLM_F_ROOT to check for
a delete of the whole table, and a service-specific TLV to filter further.
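
To make the bit overlap concrete, here is a minimal sketch in Python. The flag values are what I read in include/uapi/linux/netlink.h, so please check them against your tree. The helper names (decode_flags, act_api_style_is_flush) are made up for illustration and are not kernel functions.

```python
# Flag values as read from include/uapi/linux/netlink.h (verify against your tree).
NLM_F_ROOT = 0x100    # GET modifier: return the whole table
NLM_F_MATCH = 0x200   # GET modifier: return all matching entries
NLM_F_NONREC = 0x100  # DELETE modifier: do not delete recursively
NLM_F_BULK = 0x200    # DELETE modifier: delete multiple objects

# The names share bits, so a name only has meaning per request kind.
FLAG_NAMES = {
    "get": {NLM_F_ROOT: "NLM_F_ROOT", NLM_F_MATCH: "NLM_F_MATCH"},
    "delete": {NLM_F_NONREC: "NLM_F_NONREC", NLM_F_BULK: "NLM_F_BULK"},
}


def decode_flags(kind, flags):
    """Return the header-defined names of the modifier bits set in flags."""
    return [name for bit, name in sorted(FLAG_NAMES[kind].items()) if flags & bit]


def act_api_style_is_flush(flags):
    """Hypothetical helper: a family that reads bit 0x100 on DELETE as 'flush'."""
    return bool(flags & NLM_F_ROOT)


print(decode_flags("get", 0x300))     # ['NLM_F_ROOT', 'NLM_F_MATCH']
print(decode_flags("delete", 0x300))  # ['NLM_F_NONREC', 'NLM_F_BULK']
```

The same 0x100 bit that one family treats as "flush everything" on DELETE is what another family reads as "do not recurse", which is why the meaning ends up depending on the subsystem.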
> >>
> >> Sometime back I played with a different idea - expressing the filters with the existing TLV objects
> >> so whatever can be specified by user-space can also be used as a filter (also for filtering
> >> dump requests) with some introspection. The lua idea sounds nice though.
> >
> > So what is the content of the TLV in that case?
>
> My first approach, which wasn't using bpf, used the tlv type to define specific filters on the various
> types, incl. binary (which at the time was only an exact match, could be improved though). BPF w/ btf
> would be the obvious choice these days.
>
The filter TLVs are good because the rest of the world can use them.
The challenge is expressibility. Like you say above, exact match is easy;
inet diag has its own DSL to describe things, which could be easily
extended. A solution like a Lua script is second best, and of course we
should not rule out ebpf - but that requires more skills.
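
As a rough sketch of the "exact match is easy" case: the filter is sent as ordinary TLVs and an object matches when every filter attribute is present with the same payload. The layout follows struct nlattr (u16 length, u16 type, payload padded to 4 bytes). The attribute types ATTR_IFINDEX and ATTR_VLAN are hypothetical numbers picked for the example, and this models the matching in userspace only; it is not any existing kernel interface.

```python
import struct

NLA_HDRLEN = 4   # struct nlattr: u16 nla_len, u16 nla_type
NLA_ALIGNTO = 4

# Hypothetical attribute types, for illustration only.
ATTR_IFINDEX = 1
ATTR_VLAN = 2


def nla_align(n):
    """Round n up to the next multiple of NLA_ALIGNTO."""
    return (n + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)


def pack_attr(attr_type, payload):
    """Encode one TLV: header, payload, then zero padding to alignment."""
    length = NLA_HDRLEN + len(payload)
    pad = b"\0" * (nla_align(length) - length)
    return struct.pack("=HH", length, attr_type) + payload + pad


def parse_attrs(buf):
    """Decode a flat run of TLVs into {type: payload}."""
    attrs = {}
    offset = 0
    while offset + NLA_HDRLEN <= len(buf):
        length, attr_type = struct.unpack_from("=HH", buf, offset)
        if length < NLA_HDRLEN or offset + length > len(buf):
            break  # malformed attribute; stop parsing
        attrs[attr_type] = buf[offset + NLA_HDRLEN:offset + length]
        offset += nla_align(length)
    return attrs


def exact_match(obj_attrs, filter_attrs):
    """True if every filter TLV appears in the object with an equal payload."""
    return all(obj_attrs.get(t) == v for t, v in filter_attrs.items())


entry = parse_attrs(pack_attr(ATTR_IFINDEX, struct.pack("=I", 3)) +
                    pack_attr(ATTR_VLAN, struct.pack("=H", 10)))
flt = parse_attrs(pack_attr(ATTR_VLAN, struct.pack("=H", 10)))
print(exact_match(entry, flt))  # True
```

Anything beyond equality (ranges, masks, boolean combinations) is where this stops being enough and some DSL or program is needed.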
> > I think ebpf may work with some acrobatics. We did try classical ebpf and it was
> > messy. Note for scaling, this is not just about Delete and Get but
> > also about generated events, where one can send the kernel a filter
> > so they don't see every broadcast
>
> Yeah, I remember CL having scaling issues in some user-space software that was snooping
> netlink messages and that's the reason I looked into filtering at that time.
>
They are related problems. When you have 1000s of potential events it
just doesn't scale.
My idea was to specify a filter to select a subset, and then open
multiple sockets, each specifying a different filter subset.
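
A toy model of that idea, with the caveat that everything here is hypothetical: FilteredListener stands in for one socket with a filter attached, and broadcast() stands in for the kernel side dropping events before they are queued. It only shows the shape of the design, not a real netlink API.

```python
class FilteredListener:
    """Stand-in for one netlink socket with a filter attached (hypothetical)."""

    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate
        self.received = []


def broadcast(event, listeners):
    """Deliver an event only to listeners whose filter accepts it."""
    for listener in listeners:
        if listener.predicate(event):
            listener.received.append(event)


# 1000 made-up events spread evenly over four interfaces.
events = [{"kind": "fdb", "ifindex": i % 4} for i in range(1000)]

# Two sockets, each interested in a different interface.
listeners = [
    FilteredListener(f"sock{i}", lambda ev, i=i: ev["ifindex"] == i)
    for i in range(2)
]

for ev in events:
    broadcast(ev, listeners)

print([len(l.received) for l in listeners])  # [250, 250]
```

Each listener only has to process its own quarter of the events instead of parsing and discarding all 1000 in userspace.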
cheers,
jamal