lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 20 May 2024 11:34:52 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Paolo Abeni <pabeni@...hat.com>, Network Development <netdev@...r.kernel.org>, deb.chatterjee@...el.com, 
	Anjali Singhai Jain <anjali.singhai@...el.com>, namrata.limaye@...el.com, tom@...anda.io, 
	Marcelo Ricardo Leitner <mleitner@...hat.com>, Mahesh.Shirshyad@....com, tomasz.osinski@...el.com, 
	Jiri Pirko <jiri@...nulli.us>, Cong Wang <xiyou.wangcong@...il.com>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Vlad Buslov <vladbu@...dia.com>, Simon Horman <horms@...nel.org>, 
	khalidm@...dia.com, Toke Høiland-Jørgensen <toke@...hat.com>, 
	victor@...atatu.com, Pedro Tammela <pctammela@...atatu.com>, Vipin.Jain@....com, 
	dan.daly@...el.com, andy.fingerhut@...il.com, chris.sommers@...sight.com, 
	mattyk@...dia.com, bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH net-next v16 00/15] Introducing P4TC (series 1)

On Fri, Apr 26, 2024 at 2:03 PM Jamal Hadi Salim <jhs@...atatu.com> wrote:
>
> On Fri, Apr 26, 2024 at 1:43 PM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
> >
> > On Fri, Apr 26, 2024 at 10:21 AM Paolo Abeni <pabeni@...hat.com> wrote:
> > >
> > > On Fri, 2024-04-26 at 13:12 -0400, Jamal Hadi Salim wrote:
> > > > On Fri, Apr 19, 2024 at 2:01 PM Jamal Hadi Salim <jhs@...atatu.com> wrote:
> > > > >
> > > > > On Fri, Apr 19, 2024 at 1:20 PM Paolo Abeni <pabeni@...hat.com> wrote:
> > > > > >
> > > > > > On Fri, 2024-04-19 at 08:08 -0400, Jamal Hadi Salim wrote:
> > > > > > > On Thu, Apr 11, 2024 at 12:24 PM Jamal Hadi Salim <jhs@...atatu.com> wrote:
> > > > > > > >
> > > > > > > > On Thu, Apr 11, 2024 at 10:07 AM Paolo Abeni <pabeni@...hat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, 2024-04-10 at 10:01 -0400, Jamal Hadi Salim wrote:
> > > > > > > > > > The only change that v16 makes is to add a nack to patch 14 on kfuncs
> > > > > > > > > > from Daniel and John. We strongly disagree with the nack; unfortunately I
> > > > > > > > > > have to rehash whats already in the cover letter and has been discussed over
> > > > > > > > > > and over and over again:
> > > > > > > > >
> > > > > > > > > I feel bad asking, but I have to, since all options I have here are
> > > > > > > > > IMHO quite sub-optimal.
> > > > > > > > >
> > > > > > > > > How bad would be dropping patch 14 and reworking the rest with
> > > > > > > > > alternative s/w datapath? (I guess restoring it from oldest revision of
> > > > > > > > > this series).
> > > > > > > >
> > > > > > > >
> > > > > > > > We want to keep using ebpf  for the s/w datapath if that is not clear by now.
> > > > > > > > I do not understand the obstructionism tbh. Are users allowed to use
> > > > > > > > kfuncs as part of infra or not? My understanding is yes.
> > > > > > > > This community is getting too political and my worry is that we have
> > > > > > > > corporatism creeping in like it is in standards bodies.
> > > > > > > > We started by not using ebpf. The same people who are objecting now
> > > > > > > > went up in arms and insisted we use ebpf. As a member of this
> > > > > > > > community, my motivation was to meet them in the middle by
> > > > > > > > compromising. We invested another year to move to that middle ground.
> > > > > > > > Now they are insisting we do not use ebpf because they dont like our
> > > > > > > > design or how we are using ebpf or maybe it's not a use case they have
> > > > > > > > any need for or some other politics. I lost track of the moving goal
> > > > > > > > posts. Open source is about solving your itch. This code is entirely
> > > > > > > > on TC, zero code changed in ebpf core. The new goalpost is based on
> > > > > > > > emotional outrage over use of functions. The whole thing is getting
> > > > > > > > extremely toxic.
> > > > > > > >
> > > > > > >
> > > > > > > Paolo,
> > > > > > > Following up since no movement for a week now;->
> > > > > > > I am going to give benefit of doubt that there was miscommunication or
> > > > > > > misunderstanding for all the back and forth that has happened so far
> > > > > > > with the nackers. I will provide a summary below on the main points
> > > > > > > raised and then provide responses:
> > > > > > >
> > > > > > > 1) "Use maps"
> > > > > > >
> > > > > > > It doesnt make sense for our requirement. The reason we are using TC
> > > > > > > is because a) P4 has an excellent fit with TC match action paradigm b)
> > > > > > > we are targeting both s/w and h/w and the TC model caters well for
> > > > > > > this. The objects belong to TC, shared between s/w, h/w and control
> > > > > > > plane (and netlink is the API). Maybe this diagram would help:
> > > > > > > https://github.com/p4tc-dev/docs/blob/main/images/why-p4tc/p4tc-runtime-pipeline.png
> > > > > > >
> > > > > > > While the s/w part stands on its own accord (as elaborated many
> > > > > > > times), for TC which has offloads, the s/w twin is introduced before
> > > > > > > the h/w equivalent. This is what this series is doing.
> > > > > > >
> > > > > > > 2) "but ... it is not performant"
> > > > > > > This has been brought up in regards to netlink and kfuncs. Performance
> > > > > > > is a lower priority to P4 correctness and expressibility.
> > > > > > > Netlink provides us the abstractions we need, it works with TC for
> > > > > > > both s/w and h/w offload and has a lot of knowledge base for
> > > > > > > expressing control plane APIs. We dont believe reinventing all that
> > > > > > > makes sense.
> > > > > > > Kfuncs are a means to an end - they provide us the gluing we need to
> > > > > > > have an ebpf s/w datapath to the TC objects. Getting an extra
> > > > > > > 10-100Kpps is not a driving factor.
> > > > > > >
> > > > > > > 3) "but you did it wrong, here's how you do it..."
> > > > > > >
> > > > > > > I gave up on responding to this - but do note this sentiment is a big
> > > > > > > theme in the exchanges and consumed most of the electrons. We are
> > > > > > > _never_ going to get any consensus with statements like "tc actions
> > > > > > > are a mistake" or "use tcx".
> > > > > > >
> > > > > > > 4) "... drop the kfunc patch"
> > > > > > >
> > > > > > > kfuncs essentially boil down to function calls. They don't require any
> > > > > > > special handling by the eBPF verifier nor introduce new semantics to
> > > > > > > eBPF. They are similar in nature to the already existing kfuncs
> > > > > > > interacting with other kernel objects such as nf_conntrack.
> > > > > > > The precedence (repeated in conferences and email threads multiple
> > > > > > > times) is: kfuncs dont have to be sent to ebpf list or reviewed by
> > > > > > > folks in the ebpf world. And We believe that rule applies to us as
> > > > > > > well. Either kfuncs (and frankly ebpf) is infrastructure glue or it's
> > > > > > > not.
> > > > > > >
> > > > > > > Now for a little rant:
> > > > > > >
> > > > > > > Open source is not a zero-sum game. Ebpf already coexists with
> > > > > > > netfilter, tc, etc and various subsystems happily.
> > > > > > > I hope our requirement is clear and i dont have to keep justifying why
> > > > > > > P4 or relitigate over and over again why we need TC. Open source is
> > > > > > > about scratching your itch and our itch is totally contained within
> > > > > > > TC. I cant help but feel that this community is getting way too
> > > > > > > pervasive with politics and obscure agendas. I understand agendas, I
> > > > > > > just dont understand the zero-sum thinking.
> > > > > > > My view is this series should still be applied with the nacks since it
> > > > > > > sits entirely on its own silo within networking/TC (and has nothing to
> > > > > > > do with ebpf).
> > > > > >
> > > > > > It's really hard for me - meaning I'll not do that - applying a series
> > > > > > that has been so fiercely nacked, especially given that the other
> > > > > > maintainers are not supporting it.
> > > > > >
> > > > > > I really understand this is very bad for you.
> > > > > >
> > > > > > Let me try to do an extreme attempt to find some middle ground between
> > > > > > this series and the bpf folks.
> > > > > >
> > > > > > My understanding is that the most disliked item is the lifecycle for
> > > > > > the objects allocated via the kfunc(s).
> > > > > >
> > > > > > If I understand correctly, the hard requirement on bpf side is that any
> > > > > > kernel object allocated by kfunc must be released at program unload
> > > > > > time. p4tc postpone such allocation to recycle the structure.
> > > > > >
> > > > > > While there are other arguments, my reading of the past few iterations
> > > > > > is that solving the above node should lift the nack, am I correct?
> > > > > >
> > > > > > Could p4tc pre-allocate all the p4tc_table_entry_act_bpf_kern entries
> > > > > > and let p4a_runt_create_bpf() fail if the pool is empty? would that
> > > > > > satisfy the bpf requirement?
> > > > >
> > > > > Let me think about it and weigh the consequences.
> > > > >
> > > >
> > > > Sorry, was busy evaluating. Yes, we can enforce the memory allocation
> > > > constraints such that when the ebpf program is removed any entries
> > > > added by said ebpf program can be removed from the datapath.
> > >
> > > I suggested the such changes based on my interpretation of this long
> > > and complex discussion, I can have missed some or many relevant points.
> > > @Alexei: could you please double check the above and eventually,
> > > hopefully, confirm that such change would lift your nacked-by?
> >
> > No. The whole design is broken.
> > Remembering what was allocated by kfunc and freeing it later
> > is not fixing the design at all.
>
> Can you be a little less vague?
> We are dealing with multiple domains here _including hw offloads_ and
> as mentioned already, a few times now, for that reason these objects
> belong to the P4TC domain. If it wasnt clear this diagram explains the
> design:
> https://github.com/p4tc-dev/docs/blob/main/images/why-p4tc/p4tc-runtime-pipeline.png
> IOW, P4 objects(to be specific table entries in this discussion) may
> be shared between s/w and/or h/w.
> Note: there is no allocation done by the kfunc - it will just pick
> from a fixed pool of pre-allocated entries. Where is the "design
> broken" considering all this?

Ok, not that i was expecting an answer and i think i have waited long enough.

Frankly my agreement to make the change and the time spent to validate
were just an attempt to make an effort for a compromise (as we have
done many many times) - but really that approach works against our
requirements to control the aging/deletion/replacement policy. I dont
believe there's any good faith from the nackers. For that reason that
offer is off the table.
It should be noted our changes that Alexei is objecting to is more
tame than for example
https://elixir.bootlin.com/linux/latest/source/net/netfilter/nf_conntrack_bpf.c#L318
We didnt see Alexeis nack on that code.

I am not asking to be given speacial treatment but it is clear we have
a hole in the process currently. All i am asking for is fair
treatment. At this point, given that Paolo says the patches cant be
applied because of 3 cross-subsystem nacks, my suggestion on how we
resolve this is to appoint a third person arbitrator. This person
cannot be part of the TC or eBPF collective and has to be agreed to by
both parties.

Hopefully this will introduce some new set of rules that will help the
maintainers resolve such issues should they surface in the future.

I will collect all the other issues raised and my responses and create
a web page so things dont get lost in the noise. I will post then and
maybe send to a wider audience.

cheers,
jamal



> cheers,
> jamal

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ