Message-ID: <CAM0EoMnMrOAZ1iGocDDhVmoeY33fxZjiUEQc4yp0KJj8nASrAA@mail.gmail.com>
Date: Wed, 28 Feb 2024 13:23:00 -0500
From: Jamal Hadi Salim <jhs@...atatu.com>
To: John Fastabend <john.fastabend@...il.com>
Cc: netdev@...r.kernel.org, deb.chatterjee@...el.com, anjali.singhai@...el.com,
namrata.limaye@...el.com, tom@...anda.io, mleitner@...hat.com,
Mahesh.Shirshyad@....com, Vipin.Jain@....com, tomasz.osinski@...el.com,
jiri@...nulli.us, xiyou.wangcong@...il.com, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com, vladbu@...dia.com,
horms@...nel.org, khalidm@...dia.com, toke@...hat.com, daniel@...earbox.net,
victor@...atatu.com, pctammela@...atatu.com, dan.daly@...el.com,
andy.fingerhut@...il.com, chris.sommers@...sight.com, mattyk@...dia.com,
bpf@...r.kernel.org
Subject: Re: [PATCH net-next v12 00/15] Introducing P4TC (series 1)
On Wed, Feb 28, 2024 at 12:11 PM John Fastabend
<john.fastabend@...il.com> wrote:
>
> Jamal Hadi Salim wrote:
> > This is the first patchset of two. In this patchset we are submitting 15 patches
> > which cover the minimal viable P4 PNA architecture.
> >
> > __Description of these Patches__
> >
> > Patch #1 adds infrastructure for per-netns P4 actions that can be created on
> > an as-needed basis for the P4 program requirement. This patch makes a small incision
> > into act_api. Patches 2-4 are minimalist enablers for P4TC and have no
> > effect on the classical tc action (for example, patch #2 just increases the size of
> > the action names from 16->64B).
> > Patch 5 adds infrastructure support for preallocation of dynamic actions.
> >
> > The core P4TC code implements several P4 objects.
> > 1) Patch #6 introduces P4 data types which are consumed by the rest of the code
> > 2) Patch #7 introduces the templating API. i.e. CRUD commands for templates
> > 3) Patch #8 introduces the concept of templating Pipelines. i.e. CRUD commands
> > for P4 pipelines.
> > 4) Patch #9 introduces the action templates and associated CRUD commands.
> > 5) Patch #10 introduces the action runtime infrastructure.
> > 6) Patch #11 introduces the concept of P4 table templates and associated
> > CRUD commands for tables.
> > 7) Patch #12 introduces runtime table entry infra and associated CU commands.
> > 8) Patch #13 introduces runtime table entry infra and associated RD commands.
> > 9) Patch #14 introduces interaction of eBPF to P4TC tables via kfunc.
> > 10) Patch #15 introduces the TC classifier P4 used at runtime.
> >
> > Daniel, please look again at patch #15.
> >
> > There are a few more patches (5) not in this patchset that deal with test
> > cases, etc.
> >
> > What is P4?
> > -----------
> >
> > The Programming Protocol-independent Packet Processors (P4) is an open source,
> > domain-specific programming language for specifying data plane behavior.
> >
> > The current P4 landscape includes an extensive range of deployments, products,
> > projects and services, etc[9][12]. Two major NIC vendors, Intel[10] and AMD[11]
> > currently offer P4-native NICs. P4 is currently curated by the Linux
> > Foundation[9].
> >
> > On why P4 - see small treatise here:[4].
> >
> > What is P4TC?
> > -------------
> >
> > P4TC is a net-namespace aware P4 implementation over TC; meaning, a P4 program
> > and its associated objects and state are attached to a kernel _netns_ structure.
> > IOW, if we had two programs across netns' or within a netns they have no
> > visibility to each other's objects (unlike for example TC actions whose kinds are
> > "global" in nature or eBPF maps vis-a-vis bpftool).
>
> [...]
>
> Although I appreciate that a good amount of work went into building the above,
> I'll add my concerns here so they are not lost. These are architecture concerns,
> not "this line of code needs some tweak" comments.
>
> - It encodes a DSL into the kernel. It's unclear how we pick which DSL gets
> pushed into the kernel and which do not. Do we take any DSL folks can code
> up?
> I would prefer a lower level intermediate language. My view is this is
> a lesson we should have learned from OVS. OVS had wider adoption and
> still struggled in some ways; my belief is this is very similar to OVS.
> (Also OVS was novel/great at a lot of things fwiw.)
>
> - We have a general purpose language in BPF that can implement the P4 DSL
> already. I don't see any need for another set of code when the end goal
> of running P4 in the Linux network stack is already doable. Typically we reject
> duplicate things when they don't have concrete benefits.
>
> - P4 as a DSL is not optimized for general purpose CPUs, but
> rather hardware pipelines. Although it can be optimized for CPUs, it's
> a harder problem. A review of some of the VPP/DPDK work here is useful.
>
> - P4 infrastructure already has a p4c backend. This is adding another P4
> backend: instead of getting the rather small group of people to work on
> a single backend, we are now creating another one.
>
> - Common reasons I think would justify a new P4 backend and implementation
> would be: speed efficiency, or expressiveness. I think this
> implementation is neither more efficient nor more expressive. Concrete
> examples on expressiveness would be interesting, but I don't see any.
> Loops were mentioned once but latest kernels have loop support.
>
> - The main talking point for many slide decks about p4tc is hardware
> offload. This seems like the main benefit of pushing the P4 DSL into the
> kernel. But, we have no hw implementation, not even a vendor stepping up
> to comment on this implementation and how it will work for them. HW
> introduces all sorts of interesting problems that I don't see how we
> solve in this framework. For example a few off the top of my head:
> syncing current state into tc, how does operator program tc inside
> constraints, who writes the p4 models for these hardware devices, do
> they fit into this 'tc' infrastructure, partial updates into hardware
> seems unlikely to work for most hardware, ...
>
> - The kfuncs are mostly duplicates of map ops we already have in BPF API.
> The motivation by my read is to use netlink instead of bpf commands. I
> don't agree with this, optimizing for some low level debug a developer
> uses is the wrong design space. Actual users should not be deploying
> this via ssh into boxes. The workflow will not scale and really we need
> tooling and infra to land P4 programs across the network. This is orders
> of magnitude more pain if it's an endpoint solution and not a middlebox/switch
> solution. As a switch solution I don't see how p4tc sw scales to even TOR
> packet rates. So you need tooling on top, and users interact with the
> tooling, not the Linux widget/debugger at the bottom.
>
> - There is no performance analysis: The comment was functionality before
> performance which I disagree with. If it was a first implementation and
> we didn't have a way to do P4 DSL already then I might agree, but here
> we have an existing solution so it should be at least as good and should
> be better than existing backend. A software datapath adoption is going
> to be critically based on performance. I don't see taking even a 5% hit
> when porting over to P4 from existing datapath.
>
> Commentary: I think it's 100% correct to debate how the P4 DSL is
> implemented in the kernel. I can't see why this is somehow off limits; this
> patch set proposes one approach, and there could be many. BPF comes up
> not because I'm some BPF zealot that needs P4 DSL in BPF, but because it
> exists today; there is even a P4 backend. Fundamentally I don't see the
> value add we get by creating two P4 pipelines; this is going to create
> duplication all the way up to the P4 tooling/infra through to the kernel.
> From your side you keep saying I'm bike shedding and demanding BPF, but
> from my perspective you're introducing another entire toolchain simply
> because you want some low level debug commands that 99% of P4 users should
> not be using or caring about.
>
> To try and be constructive some things that would change my mind would
> be a vendor showing how hardware can be used. This would be compelling.
> Or performance numbers showing it somehow yields a more performant implementation.
> Or lastly if the current p4c implementation is fundamentally broken
> somehow.
>
John,
With all due respect, we are going back again over the same points,
recycled many times over, which i have responded to many times.
It's getting tiring. This is exactly why i called it bikeshedding.
Let's just agree to disagree.
cheers,
jamal
> Thanks
> John