Message-ID: <CAM0EoMk3fAn=ULevpo9R9v9rFV3-_JrSUTbBP-X0GWLEWL4M-w@mail.gmail.com>
Date: Wed, 29 May 2024 07:10:44 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Chris Sommers <chris.sommers@...sight.com>
Cc: Tom Herbert <tom@...anda.io>, "Singhai, Anjali" <anjali.singhai@...el.com>,
John Fastabend <john.fastabend@...il.com>, "Jain, Vipin" <Vipin.Jain@....com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>, Network Development <netdev@...r.kernel.org>,
"Chatterjee, Deb" <deb.chatterjee@...el.com>, "Limaye, Namrata" <namrata.limaye@...el.com>,
Marcelo Ricardo Leitner <mleitner@...hat.com>, "Shirshyad, Mahesh" <Mahesh.Shirshyad@....com>,
"Osinski, Tomasz" <tomasz.osinski@...el.com>, Jiri Pirko <jiri@...nulli.us>,
Cong Wang <xiyou.wangcong@...il.com>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Vlad Buslov <vladbu@...dia.com>, Simon Horman <horms@...nel.org>,
Khalid Manaa <khalidm@...dia.com>, Toke Høiland-Jørgensen <toke@...hat.com>,
Victor Nogueira <victor@...atatu.com>, "Tammela, Pedro" <pctammela@...atatu.com>,
"Daly, Dan" <dan.daly@...el.com>, Andy Fingerhut <andy.fingerhut@...il.com>,
Matty Kadosh <mattyk@...dia.com>, bpf <bpf@...r.kernel.org>, "lwn@....net" <lwn@....net>
Subject: Re: On the NACKs on P4TC patches
Not sure why my email was tagged as html and blocked, but here goes again:
On Tue, May 28, 2024 at 7:43 PM Chris Sommers
<chris.sommers@...sight.com> wrote:
>
> > On Tue, May 28, 2024 at 3:17 PM Singhai, Anjali
> > <anjali.singhai@...el.com> wrote:
> > >
> > > >From: John Fastabend <john.fastabend@...il.com>
> > > >Sent: Tuesday, May 28, 2024 1:17 PM
> > >
> > > >Jain, Vipin wrote:
> > > >> [AMD Official Use Only - AMD Internal Distribution Only]
> > > >>
> > > >> My apologies, earlier email used html and was blocked by the list...
> > > >> My response at the bottom as "VJ>"
> > > >>
> > > >> ________________________________________
> > >
> > > >Anjali and Vipin is your support for HW support of P4 or a Linux SW implementation of P4. If its for HW support what drivers would we want to support? Can you describe how to program these devices?
> > >
> > > >At the moment there hasn't been any movement on Linux hardware P4 support side as far as I can tell. Yes there are some SDKs and build kits floating around for FPGAs. For example maybe start with what drivers in kernel tree run the DPUs that have this support? I think this would be a productive direction to go if we in fact have hardware support in the works.
> > >
> > > >If you want a SW implementation in Linux my opinion is still pushing a DSL into the kernel datapath via qdisc/tc is the wrong direction. Mapping P4 onto hardware blocks is fundamentally different architecture from mapping
> > > >P4 onto general purpose CPU and registers. My opinion -- to handle this you need a per architecture backend/JIT to compile the P4 to native instructions.
> > > >This will give you the most flexibility to define new constructs, best performance, and lowest overhead runtime. We have a P4 BPF backend already and JITs for most architectures I don't see the need for P4TC in this context.
> > >
> > > >If the end goal is a hardware offload control plane I'm skeptical we even need something specific just for SW datapath. I would propose a devlink or new infra to program the device directly vs overhead and complexity of abstracting through 'tc'. If you want to emulate your device use BPF or user space datapath.
> > >
> > > >.John
> > >
> > >
> > > John,
> > > Let me start by saying production hardware exists i think Jamal posted some links but i can point you to our hardware.
> > > The hardware devices under discussion are capable of being abstracted using the P4 match-action paradigm so that's why we chose TC.
> > > These devices are programmed using the TC/netlink interface i.e the standard TC control-driver ops apply. While it is clear to us that the P4TC abstraction suffices, we are currently discussing details that will cater for all vendors in our biweekly meetings.
> > > One big requirement is we want to avoid the flower trap - we dont want to be changing kernel/user/driver code every time we add new datapaths.
> > > We feel P4TC approach is the path to add Linux kernel support.
> > >
> > > The s/w path is needed as well for several reasons.
> > > We need the same P4 program to run either in software or hardware or in both using skip_sw/skip_hw. It could be either in split mode or as an exception path as it is done today in flower or u32. Also it is common now in the P4 community that people define their datapath using their program and will write a control application that works for both hardware and software datapaths. They could be using the software datapath for testing as you said but also for the split/exception path. Chris can probably add more comments on the software datapath.
>
> Anjali, thanks for asking. Agreed, I like the flexibility of accommodating a variety of platforms depending upon performance requirements and intended target system. For me, flexibility is important. Some solutions need an inline filter and P4-TC makes it so easy. The fact I will be able to get HW offload means I'm not performance bound. Some other solutions might need DPDK implementation, so P4-DPDK is a choice there as well, and there are acceleration options. Keeping much of the dataplane design in one language (P4) makes it easier for more developers to create products without having to be platform-level experts. As someone who's worked with P4 Tofino, P4-TC, bmv2, etc. I can authoritatively state that all have their proper place.
> >
> > Hi Anjali,
> >
> > Are there any use cases of P4-TC that don't involve P4 hardware? If
> > someone wanted to write one off datapath code for their deployment and
> > they didn't have P4 hardware would you suggest that they write their
> > code in P4-TC? The reason I ask is because I'm concerned about the
> > performance of P4-TC. Like John said, this is mapping code that is
> > intended to run in specialized hardware into a CPU, and it's also
> > interpreted execution in TC. The performance numbers in
> > https://github.com/p4tc-dev/docs/blob/main/p4-conference-2023/2023P4WorkshopP4TC.pdf
> > seem to show that P4-TC has about half the performance of XDP. Even
> > with a lot of work, it's going to be difficult to substantially close
> > that gap.
>
> AFAIK P4-TC can emit XDP or eBPF code depending upon the situation, someone more knowledgeable should chime in.
> However, I don't agree that comparing the speeds of XDP vs. P4-TC should even be a deciding factor.
> If P4-TC is good enough for a lot of applications, that is fine by me and over time it'll only get better.
> If we held back every innovation because it was slower than something else, progress would suffer.
Yes, XDP can be emitted based on compiler options (and was a
motivating factor in considering use of eBPF). Tom's comment above
seems to mistake the fact that XDP tends to be faster than TC with
eBPF for a fault of P4TC.
In any case this statement falls under:
https://github.com/p4tc-dev/pushback-patches?tab=readme-ov-file#2b-comment-but--it-is-not-performant
On Tom's theory that the vendors are going to push inferior s/w for
the sake of selling h/w - I would argue that we are not in the 90s
anymore and I don't believe there's any vendor conspiracy here
;-> A single port can do 100s of Gbps, and of course if you want to do
high speed you need to offload; no general purpose CPU will save you.
And really the argument that "offload=evil" holds no water anymore.
cheers,
jamal