Message-ID: <CAM0EoMn3-tpDK7jAgh97ZtA5ME1W=oFxgYwHSZ3LG_HbF93FHA@mail.gmail.com>
Date: Wed, 29 May 2024 07:21:43 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: John Fastabend <john.fastabend@...il.com>
Cc: "Singhai, Anjali" <anjali.singhai@...el.com>, "Jain, Vipin" <Vipin.Jain@....com>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	Alexei Starovoitov <alexei.starovoitov@...il.com>, Network Development <netdev@...r.kernel.org>, 
	"Chatterjee, Deb" <deb.chatterjee@...el.com>, "Limaye, Namrata" <namrata.limaye@...el.com>, 
	tom Herbert <tom@...anda.io>, Marcelo Ricardo Leitner <mleitner@...hat.com>, 
	"Shirshyad, Mahesh" <Mahesh.Shirshyad@....com>, "Osinski, Tomasz" <tomasz.osinski@...el.com>, 
	Jiri Pirko <jiri@...nulli.us>, Cong Wang <xiyou.wangcong@...il.com>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Vlad Buslov <vladbu@...dia.com>, Simon Horman <horms@...nel.org>, Khalid Manaa <khalidm@...dia.com>, 
	Toke Høiland-Jørgensen <toke@...hat.com>, 
	Victor Nogueira <victor@...atatu.com>, "Tammela, Pedro" <pctammela@...atatu.com>, 
	"Daly, Dan" <dan.daly@...el.com>, Andy Fingerhut <andy.fingerhut@...il.com>, 
	"Sommers, Chris" <chris.sommers@...sight.com>, Matty Kadosh <mattyk@...dia.com>, 
	bpf <bpf@...r.kernel.org>, "lwn@....net" <lwn@....net>
Subject: Re: On the NACKs on P4TC patches

On Tue, May 28, 2024 at 7:45 PM John Fastabend <john.fastabend@...il.com> wrote:
>
> Singhai, Anjali wrote:
> > >From: John Fastabend <john.fastabend@...il.com>
> > >Sent: Tuesday, May 28, 2024 1:17 PM
> >
> > >Jain, Vipin wrote:
> > >> [AMD Official Use Only - AMD Internal Distribution Only]
> > >>
> > >> My apologies, the earlier email used HTML and was blocked by the list...
> > >> My response is at the bottom, marked "VJ>".
> > >>
> > >> ________________________________________
> >
> > >Anjali and Vipin, is your support for HW support of P4 or for a Linux SW implementation of P4? If it's for HW support, what drivers would we want to support? Can you describe how to program these devices?
> >
> > >At the moment there hasn't been any movement on the Linux hardware P4 support side as far as I can tell. Yes, there are some SDKs and build kits floating around for FPGAs. For example, maybe start with which drivers in the kernel tree run the DPUs that have this support? I think this would be a productive direction to go if we in fact have hardware support in the works.
> >
> > >If you want a SW implementation in Linux, my opinion is still that pushing a DSL into the kernel datapath via qdisc/tc is the wrong direction. Mapping P4 onto hardware blocks is a fundamentally different architecture from mapping
> > >P4 onto a general-purpose CPU and registers. My opinion -- to handle this you need a per-architecture backend/JIT to compile the P4 to native instructions.
> > >This will give you the most flexibility to define new constructs, the best performance, and the lowest runtime overhead. We have a P4 BPF backend already and JITs for most architectures; I don't see the need for P4TC in this context.
> >
> > >If the end goal is a hardware offload control plane, I'm skeptical we even need something specific just for the SW datapath. I would propose devlink or new infra to program the device directly vs. the overhead and complexity of abstracting through 'tc'. If you want to emulate your device, use BPF or a user-space datapath.
> >
> > >.John
> >
> >
> > John,
> > Let me start by saying production hardware exists. I think Jamal posted some links, but I can point you to our hardware.
>
> Maybe more directly: what Linux drivers support this? That would be
> a good first place to start IMO. Similarly, what AMD hardware
> driver supports this? If I have two drivers from two vendors
> with P4 support, that is great.
>
> For Intel I assume this is idpf?
>
> To be concrete, can we start with Linux driver A and P4 program
> P? Modprobe driver A and push P4 program P so that it does
> something very simple, and drop a CIDR/port range into a table.
> Perhaps this is so obvious in your community; the trouble is, in
> the context of a Linux driver it's not immediately obvious to me,
> and I would suspect it's not obvious to many others.
>
> I really think walking through the key steps here would
> help.
>
>  1. $ p4IntelCompiler p4-dos.p4 -o myp4
>  2. $ modprobe idpf
>  3. $ ping -I eth0 10.0.0.1 // good
>  4. $ p4Load p4-dos.p4
>  5. -- load the CIDR into the hardware somehow -- p4rt-ctl?
>  6. $ ping -I eth0 10.0.0.1 // dropped
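
For step 5, here is a minimal sketch of what inserting the CIDR drop rule
could look like through the IPDK runtime tooling. The table and action
names are hypothetical (they would come from the P4 program), and the
exact p4rt-ctl argument syntax is an assumption modeled on IPDK-style
examples rather than a verified invocation:

   # Hypothetical table/action names; p4rt-ctl syntax assumed from IPDK examples
   p4rt-ctl add-entry br0 ingress.ipv4_acl \
       "hdr.ipv4.dst_addr=10.0.0.0/24,action=ingress.drop"

The sketch only shows where the runtime call would sit in the sequence
above; whether the entry lands in firmware, a kernel table, or a
user-space datapath is exactly the question being asked.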
>
> This is an honest attempt to help, FWIW. Questions would be:
>
> For compilation, do we need an artifact from Intel? It seems
> so from the docs, but maybe that's a typo, not sure. I'm not overly
> stuck on it, but it's worth mentioning if folks try to follow your docs.
>
> For 2, I assume this is just a normal everyday module load, nothing
> to see. Does it pop something up in /proc or in firmware or...?
> How do I know it's P4-ready?
>
> For 4, how does this actually work? Is it a file in a directory
> that the driver pushes into firmware? How does the firmware know
> I've done this? Does the Linux driver already support this?
>
> For 5 (most interesting), how does this work today? How are
> you currently talking to the driver/firmware to insert rules
> and discover the tables? And does the idpf driver do this
> already? Some side channel, I guess? Is this p4rt-ctl?
>
> I've seen docs for the above in IPDK, but they are a bit hard
> to follow, if I'm honest.
>
> I assume IPDK is the source folks point to when we mention there
> is hardware somewhere. Also, it seems there is IPDK BPF support
> as well, which is interesting.
>
> And do you know how the DPDK implementation works? Can we
> learn from them? Is it just on top of the flow API, which we
> could easily use in devlink or some other *link, I suspect?
>
> > The hardware devices under discussion are capable of being abstracted using the P4 match-action paradigm, which is why we chose TC.
> > These devices are programmed using the TC/netlink interface, i.e. the standard TC control-driver ops apply. While it is clear to us that the P4TC abstraction suffices, we are currently discussing details that will cater for all vendors in our biweekly meetings.
> > One big requirement is that we want to avoid the flower trap - we don't want to be changing kernel/user/driver code every time we add new datapaths.
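
For context on the "flower trap": flower's match fields are fixed in the
kernel UAPI, so a rule like the one below only works for fields that the
kernel, iproute2 and the driver already understand, and matching on a new
header means patching all three. This is standard flower syntax:

   # Drop traffic to a CIDR with flower; every matchable field is kernel-defined
   tc qdisc add dev eth0 clsact
   tc filter add dev eth0 ingress protocol ip flower dst_ip 10.0.0.0/24 action drop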
>
> I think many first-order and important points have been skipped. How do you
> program the device: is it a firmware blob, a set of firmware commands,
> something that comes with the device so only the vendor sees it? Maybe
> I can infer this from some docs and examples (by the way, I ran
> through some of your DPU docs and such), but it's unclear how these
> map onto Linux networking. Jiri started into this earlier and was
> cut off because P4TC was not for hardware offload. Now it is, apparently.
>
> P4 is a good DSL for this, sure, and it has a runtime already
> specified, which is great.
>
> This is not a qdisc/tc, it's an entire hardware pipeline; I don't see
> the reason to put it in TC at all.
>
> > We feel the P4TC approach is the path to adding Linux kernel support.
>
> I disagree with your implementation, not your goals of supporting
> flexible hardware.
>
> >
> > The s/w path is needed as well for several reasons.
> > We need the same P4 program to run in software, in hardware, or in both using skip_sw/skip_hw. It could be either in split mode or as an exception path, as is done today in flower or u32. Also, it is now common in the P4 community that people define their datapath in their program and write a control application that works for both hardware and software datapaths. They could be using the software datapath for testing, as you said, but also for the split/exception path. Chris can probably add more comments on the software datapath.
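
To make the skip_sw/skip_hw reference concrete, this is how the split is
expressed for an offloadable classifier such as flower today (standard tc
syntax; whether the hardware rule is accepted depends on the driver):

   # Hardware only: fail if the NIC cannot offload the rule
   tc filter add dev eth0 ingress protocol ip flower skip_sw dst_ip 10.0.0.0/24 action drop
   # Software only: never offload, handle the rule in the kernel datapath
   tc filter add dev eth0 ingress protocol ip flower skip_hw dst_ip 10.0.0.0/24 action drop
   # With neither flag the rule is installed in both, giving the split/exception path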
>
> None of the above requires P4TC. For different architectures you
> build optimal backend compilers. You have a Xilinx backend,
> an Intel backend, and a Linux CPU-based backend. I see no
> reason to constrain the software case to map to a pipeline
> model, for example. Software running on a CPU has very different
> characteristics from something running on a TOR or an FPGA.
> Trying to push all of these into one backend "model" will result
> in a suboptimal result for every target. At the end of the
> day, my $.02: P4 is a DSL, and it needs a target-dependent compiler
> in front of it. I want to optimize my software pipeline; the
> compiler should compress tables as much as possible and
> search for an O(1) lookup even if getting that key is somewhat
> expensive. Conversely, a TCAM changes the game. An FPGA is
> going to be flexible and make lots of tradeoffs here, of which
> I'm not an expert. Also, by avoiding loading the DSL into the kernel
> you leave room for others to build new/better/worse DSLs as they
> please.
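
A rough sketch of the CPU-target path described above, using the existing
p4c eBPF backend plus the in-kernel JIT. The exact compiler flags, runtime
header location and generated ELF section name vary by p4c version, so
treat those details as assumptions:

   # Compile the P4 program to C with p4c's eBPF backend, then to BPF bytecode
   p4c-ebpf prog.p4 -o prog.c
   clang -O2 -target bpf -I"$P4C_EBPF_RUNTIME" -c prog.c -o prog.o  # header path assumed
   # Attach at the tc hook; the per-architecture BPF JIT emits native instructions
   tc qdisc add dev eth0 clsact
   tc filter add dev eth0 ingress bpf da obj prog.o sec classifier  # section name assumed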
>
> The P4 community writes control applications on top of the
> runtime spec, right? p4rt-ctl being the thing I found. This
> should abstract the endpoint away to work with hardware or
> software or an FPGA or anything else.
>

For the record, _every single patchset we have posted_ specified our
requirements as being s/w + h/w. A simpler version of the requirements
is listed here:
https://github.com/p4tc-dev/pushback-patches?tab=readme-ov-file#summary-of-our-requirements

John's variant of the argument above is described in:
https://github.com/p4tc-dev/pushback-patches?tab=readme-ov-file#summary-of-our-requirements
According to him we should not bother with the kernel at all. It's
what is commonly referred to as Monday-morning quarterbacking or
armchair lawyering: "let's just do it my way and it will all be
great". It's 90% of these discussions and one of the reasons I put up
that page.

cheers,
jamal
