Message-ID: <CAM0EoMnBXnWDQKu5e0z1_zE3yabb2pTnOdLHRVKsChRm+7wxmQ@mail.gmail.com>
Date: Sat, 28 Jan 2023 10:10:00 -0500
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Willem de Bruijn <willemb@...gle.com>
Cc: Stanislav Fomichev <sdf@...gle.com>,
Jamal Hadi Salim <hadi@...atatu.com>,
Jiri Pirko <jiri@...nulli.us>,
Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
kernel@...atatu.com, deb.chatterjee@...el.com,
anjali.singhai@...el.com, namrata.limaye@...el.com,
khalidm@...dia.com, tom@...anda.io, pratyush@...anda.io,
xiyou.wangcong@...il.com, davem@...emloft.net, edumazet@...gle.com,
pabeni@...hat.com, vladbu@...dia.com, simon.horman@...igine.com,
stefanc@...vell.com, seong.kim@....com, mattyk@...dia.com,
dan.daly@...el.com, john.andy.fingerhut@...el.com
Subject: Re: [PATCH net-next RFC 00/20] Introducing P4TC
On Sat, Jan 28, 2023 at 8:37 AM Willem de Bruijn <willemb@...gle.com> wrote:
>
> On Fri, Jan 27, 2023 at 7:48 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> >
> > On Fri, Jan 27, 2023 at 3:27 PM Jamal Hadi Salim <jhs@...atatu.com> wrote:
> > >
> > > On Fri, Jan 27, 2023 at 5:26 PM <sdf@...gle.com> wrote:
> > > >
> > > > On 01/27, Jamal Hadi Salim wrote:
> > > > > On Fri, Jan 27, 2023 at 1:26 PM Jiri Pirko <jiri@...nulli.us> wrote:
> > > > > >
> > > > > > Fri, Jan 27, 2023 at 12:30:22AM CET, kuba@...nel.org wrote:
> > > > > > >On Tue, 24 Jan 2023 12:03:46 -0500 Jamal Hadi Salim wrote:
> > > > > > >> There have been many discussions and meetings since about 2015 in
> > > > > regards to
> > > > > > >> P4 over TC and now that the market has chosen P4 as the datapath
> > > > > specification
> > > > > > >> lingua franca
> > > > > > >
> > > > > > >Which market?
> > > > > > >
> > > > > > >Barely anyone understands the existing TC offloads. We'd need strong,
> > > > > > >and practical reasons to merge this. Speaking with my "have suffered
> > > > > > >thru the TC offloads working for a vendor" hat on, not the "junior
> > > > > > >maintainer" hat.
> > > > > >
> > > > > > You talk about offload, yet I don't see any offload code in this RFC.
> > > > > > It's pure sw implementation.
> > > > > >
> > > > > > But speaking about offload, how exactly do you plan to offload this
> > > > > > Jamal? AFAIK there is some HW-specific compiler magic needed to generate
> > > > > > HW acceptable blob. How exactly do you plan to deliver it to the driver?
> > > > > > > If HW offload is the motivation for this RFC work and we cannot
> > > > > > pass the TC in kernel objects to drivers, I fail to see why exactly do
> > > > > > you need the SW implementation...
> > > >
> > > > > Our rule in TC is: _if you want to offload using TC you must have a
> > > > > s/w equivalent_.
> > > > > We enforced this rule multiple times (as you know).
> > > > > P4TC has a sw equivalent to whatever the hardware would do. We are
> > > > > pushing that
> > > > > first. Regardless, it has value on its own merit:
> > > > > I can run P4 equivalent in s/w in a scriptable (as in no compilation
> > > > > in the same spirit as u32 and pedit),
> > > > > by programming the kernel datapath without changing any kernel code.
> > > >
> > > > Not to derail too much, but maybe you can clarify the following for me:
> > > > In my (in)experience, P4 is usually constrained by the vendor
> > > > specific extensions. So how real is that goal where we can have a generic
> > > > P4@TC with an option to offload? In my view, the reality (at least
> > > > currently) is that there are NIC-specific P4 programs which won't have
> > > > a chance of running generically at TC (unless we implement those vendor
> > > > extensions).
> > >
> > > We are going to implement all the PSA/PNA externs. Most of these
> > > programs tend to
> > > be set or ALU operations on headers or metadata which we can handle.
> > > Do you have
> > > > any examples of NIC-vendor-specific features that can't be generalized?
> >
> > I don't think I can share more without giving away something that I
> > shouldn't give away :-)
> > But IIUC, and I might be missing something, it's totally within the
> > standard for vendors to differentiate and provide non-standard
> > 'extern' extensions.
> > I'm mostly wondering what are your thoughts on this. If I have a p4
> > program depending on one of these externs, we can't sw-emulate it
> > unless we also implement the extension. Are we gonna ask NICs that
> > have those custom extensions to provide a SW implementation as well?
> > Or are we going to prohibit vendors to differentiate that way?
> >
> > > > And regarding custom parser, someone has to ask that 'what about bpf
> > > > question': let's say we have a P4 frontend at TC, can we use bpfilter-like
> > > > usermode helper to transparently compile it to bpf (for SW path) instead
> > > > inventing yet another packet parser? Wrestling with the verifier won't be
> > > > easy here, but I trust it more than this new kParser.
> > > >
> > >
> > > > We don't compile anything, the parser (and rest of infra) is scriptable.
> >
> > As I've replied to Tom, that seems like a technicality. BPF programs
> > can also be scriptable with some maps/tables. Or it can be made to
> > look like "scriptable" by recompiling it on every configuration change
> > and updating it on the fly. Or am I missing something?
> >
> > Can we have a P4TC frontend and whenever configuration is updated, we
> > upcall into userspace to compile this whatever p4 representation into
> > whatever bpf bytecode that we then run. No new/custom/scriptable
> > parsers needed.
>
> I would also think that if we need another programmable component in
> the kernel, that this would be based on BPF, and compiled outside the
> kernel.
>
> Is the argument for an explicit TC objects API purely that this API
> can be passed through to hardware, as well as implemented in the
> kernel directly? Something that would be lost if the datapath is
> implemented as a single BPF program at the TC hook.
>
We use the skip_sw and skip_hw knobs in tc to indicate whether a
policy is targeting hw or sw. Not sure if you are familiar with them, but
they have been around (and deployed) for a few years now. So a P4 program
policy can target either.
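A minimal sketch of how those two knobs behave, written as a simplified
Python model rather than kernel code. The two flag values mirror
TCA_CLS_FLAGS_SKIP_HW and TCA_CLS_FLAGS_SKIP_SW from
include/uapi/linux/pkt_cls.h; the function names and the hw_offload_ok
parameter are hypothetical and exist only for illustration.

```python
# Flag values mirror TCA_CLS_FLAGS_SKIP_HW / TCA_CLS_FLAGS_SKIP_SW in
# include/uapi/linux/pkt_cls.h. Everything else below is an illustrative
# model (hypothetical names), not the kernel implementation.
SKIP_HW = 1 << 0  # do not offload the policy to hardware
SKIP_SW = 1 << 1  # do not install the policy in the software datapath


def flags_valid(flags: int) -> bool:
    """Reject unknown bits and the contradictory skip_sw + skip_hw pair."""
    if flags & ~(SKIP_HW | SKIP_SW):
        return False
    return flags != (SKIP_HW | SKIP_SW)


def placement(flags: int, hw_offload_ok: bool) -> set:
    """Return where a policy ends up: a subset of {"sw", "hw"}.

    hw_offload_ok is an assumed input saying whether the device accepted
    the offload request.
    """
    if not flags_valid(flags):
        raise ValueError("invalid flag combination")
    targets = set()
    if not flags & SKIP_SW:
        targets.add("sw")
    if not flags & SKIP_HW:
        if hw_offload_ok:
            targets.add("hw")
        elif flags & SKIP_SW:
            # Hardware-only policy with no working offload: nothing to fall back to.
            raise RuntimeError("skip_sw requested but hardware offload failed")
    return targets
```

With neither flag set, the policy lands in software and, where the device
accepts it, in hardware too. On the command line these correspond to the
skip_sw and skip_hw keywords accepted by classifiers such as flower.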
In regards to the parser - we need a scriptable parser, which is offered
by kparser in the kernel. P4 doesn't describe how to offload the parser,
just the matches and actions; however, as Tom alluded, there's nothing
that prevents us from offering the same tc controls to offload the parser
or pieces of it.
cheers,
jamal
> Can you elaborate some more why this needs yet another in-kernel
> parser separate from BPF? The flow dissection case is solved fine by
> the BPF flow dissector. (I also hope one day the kernel can load a BPF
> dissector by default and we avoid the majority of the unsafe C code
> entirely.)