[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <82ee1013ca0164053e9fb1259eaf676343c430e8.camel@redhat.com>
Date: Fri, 26 Apr 2024 19:21:31 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: netdev@...r.kernel.org, deb.chatterjee@...el.com,
anjali.singhai@...el.com, namrata.limaye@...el.com, tom@...anda.io,
mleitner@...hat.com, Mahesh.Shirshyad@....com, tomasz.osinski@...el.com,
jiri@...nulli.us, xiyou.wangcong@...il.com, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, vladbu@...dia.com, horms@...nel.org,
khalidm@...dia.com, toke@...hat.com, victor@...atatu.com,
pctammela@...atatu.com, Vipin.Jain@....com, dan.daly@...el.com,
andy.fingerhut@...il.com, chris.sommers@...sight.com, mattyk@...dia.com,
bpf@...r.kernel.org, Jamal Hadi Salim <jhs@...atatu.com>
Subject: Re: [PATCH net-next v16 00/15] Introducing P4TC (series 1)
On Fri, 2024-04-26 at 13:12 -0400, Jamal Hadi Salim wrote:
> On Fri, Apr 19, 2024 at 2:01 PM Jamal Hadi Salim <jhs@...atatu.com> wrote:
> >
> > On Fri, Apr 19, 2024 at 1:20 PM Paolo Abeni <pabeni@...hat.com> wrote:
> > >
> > > On Fri, 2024-04-19 at 08:08 -0400, Jamal Hadi Salim wrote:
> > > > On Thu, Apr 11, 2024 at 12:24 PM Jamal Hadi Salim <jhs@...atatu.com> wrote:
> > > > >
> > > > > On Thu, Apr 11, 2024 at 10:07 AM Paolo Abeni <pabeni@...hat.com> wrote:
> > > > > >
> > > > > > On Wed, 2024-04-10 at 10:01 -0400, Jamal Hadi Salim wrote:
> > > > > > > The only change that v16 makes is to add a nack to patch 14 on kfuncs
> > > > > > > from Daniel and John. We strongly disagree with the nack; unfortunately I
> > > > > > > have to rehash whats already in the cover letter and has been discussed over
> > > > > > > and over and over again:
> > > > > >
> > > > > > I feel bad asking, but I have to, since all options I have here are
> > > > > > IMHO quite sub-optimal.
> > > > > >
> > > > > > How bad would be dropping patch 14 and reworking the rest with
> > > > > > alternative s/w datapath? (I guess restoring it from oldest revision of
> > > > > > this series).
> > > > >
> > > > >
> > > > > We want to keep using ebpf for the s/w datapath if that is not clear by now.
> > > > > I do not understand the obstructionism tbh. Are users allowed to use
> > > > > kfuncs as part of infra or not? My understanding is yes.
> > > > > This community is getting too political and my worry is that we have
> > > > > corporatism creeping in like it is in standards bodies.
> > > > > We started by not using ebpf. The same people who are objecting now
> > > > > went up in arms and insisted we use ebpf. As a member of this
> > > > > community, my motivation was to meet them in the middle by
> > > > > compromising. We invested another year to move to that middle ground.
> > > > > Now they are insisting we do not use ebpf because they dont like our
> > > > > design or how we are using ebpf or maybe it's not a use case they have
> > > > > any need for or some other politics. I lost track of the moving goal
> > > > > posts. Open source is about solving your itch. This code is entirely
> > > > > on TC, zero code changed in ebpf core. The new goalpost is based on
> > > > > emotional outrage over use of functions. The whole thing is getting
> > > > > extremely toxic.
> > > > >
> > > >
> > > > Paolo,
> > > > Following up since no movement for a week now;->
> > > > I am going to give benefit of doubt that there was miscommunication or
> > > > misunderstanding for all the back and forth that has happened so far
> > > > with the nackers. I will provide a summary below on the main points
> > > > raised and then provide responses:
> > > >
> > > > 1) "Use maps"
> > > >
> > > > It doesnt make sense for our requirement. The reason we are using TC
> > > > is because a) P4 has an excellent fit with TC match action paradigm b)
> > > > we are targeting both s/w and h/w and the TC model caters well for
> > > > this. The objects belong to TC, shared between s/w, h/w and control
> > > > plane (and netlink is the API). Maybe this diagram would help:
> > > > https://github.com/p4tc-dev/docs/blob/main/images/why-p4tc/p4tc-runtime-pipeline.png
> > > >
> > > > While the s/w part stands on its own accord (as elaborated many
> > > > times), for TC which has offloads, the s/w twin is introduced before
> > > > the h/w equivalent. This is what this series is doing.
> > > >
> > > > 2) "but ... it is not performant"
> > > > This has been brought up in regards to netlink and kfuncs. Performance
> > > > is a lower priority to P4 correctness and expressibility.
> > > > Netlink provides us the abstractions we need, it works with TC for
> > > > both s/w and h/w offload and has a lot of knowledge base for
> > > > expressing control plane APIs. We dont believe reinventing all that
> > > > makes sense.
> > > > Kfuncs are a means to an end - they provide us the gluing we need to
> > > > have an ebpf s/w datapath to the TC objects. Getting an extra
> > > > 10-100Kpps is not a driving factor.
> > > >
> > > > 3) "but you did it wrong, here's how you do it..."
> > > >
> > > > I gave up on responding to this - but do note this sentiment is a big
> > > > theme in the exchanges and consumed most of the electrons. We are
> > > > _never_ going to get any consensus with statements like "tc actions
> > > > are a mistake" or "use tcx".
> > > >
> > > > 4) "... drop the kfunc patch"
> > > >
> > > > kfuncs essentially boil down to function calls. They don't require any
> > > > special handling by the eBPF verifier nor introduce new semantics to
> > > > eBPF. They are similar in nature to the already existing kfuncs
> > > > interacting with other kernel objects such as nf_conntrack.
> > > > The precedence (repeated in conferences and email threads multiple
> > > > times) is: kfuncs dont have to be sent to ebpf list or reviewed by
> > > > folks in the ebpf world. And We believe that rule applies to us as
> > > > well. Either kfuncs (and frankly ebpf) is infrastructure glue or it's
> > > > not.
> > > >
> > > > Now for a little rant:
> > > >
> > > > Open source is not a zero-sum game. Ebpf already coexists with
> > > > netfilter, tc, etc and various subsystems happily.
> > > > I hope our requirement is clear and i dont have to keep justifying why
> > > > P4 or relitigate over and over again why we need TC. Open source is
> > > > about scratching your itch and our itch is totally contained within
> > > > TC. I cant help but feel that this community is getting way too
> > > > pervasive with politics and obscure agendas. I understand agendas, I
> > > > just dont understand the zero-sum thinking.
> > > > My view is this series should still be applied with the nacks since it
> > > > sits entirely on its own silo within networking/TC (and has nothing to
> > > > do with ebpf).
> > >
> > > It's really hard for me - meaning I'll not do that - applying a series
> > > that has been so fiercely nacked, especially given that the other
> > > maintainers are not supporting it.
> > >
> > > I really understand this is very bad for you.
> > >
> > > Let me try to do an extreme attempt to find some middle ground between
> > > this series and the bpf folks.
> > >
> > > My understanding is that the most disliked item is the lifecycle for
> > > the objects allocated via the kfunc(s).
> > >
> > > If I understand correctly, the hard requirement on bpf side is that any
> > > kernel object allocated by kfunc must be released at program unload
> > > time. p4tc postpone such allocation to recycle the structure.
> > >
> > > While there are other arguments, my reading of the past few iterations
> > > is that solving the above node should lift the nack, am I correct?
> > >
> > > Could p4tc pre-allocate all the p4tc_table_entry_act_bpf_kern entries
> > > and let p4a_runt_create_bpf() fail if the pool is empty? would that
> > > satisfy the bpf requirement?
> >
> > Let me think about it and weigh the consequences.
> >
>
> Sorry, was busy evaluating. Yes, we can enforce the memory allocation
> constraints such that when the ebpf program is removed any entries
> added by said ebpf program can be removed from the datapath.
I suggested the such changes based on my interpretation of this long
and complex discussion, I can have missed some or many relevant points.
@Alexei: could you please double check the above and eventually,
hopefully, confirm that such change would lift your nacked-by?
Thanks!
Paolo
Powered by blists - more mailing lists