netdev - Re: [PATCH net-next RFC 00/20] Introducing P4TC

Message-ID: <b7dafeb9-4535-9fa6-fb42-d6538b7ecf10@gmail.com>
Date:   Tue, 14 Feb 2023 17:07:47 +0000
From:   Edward Cree <ecree.xilinx@...il.com>
To:     Jamal Hadi Salim <jhs@...atatu.com>,
        Toke Høiland-Jørgensen <toke@...hat.com>
Cc:     Jiri Pirko <jiri@...nulli.us>,
        John Fastabend <john.fastabend@...il.com>,
        Willem de Bruijn <willemb@...gle.com>,
        Stanislav Fomichev <sdf@...gle.com>,
        Jamal Hadi Salim <hadi@...atatu.com>,
        Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
        kernel@...atatu.com, deb.chatterjee@...el.com,
        anjali.singhai@...el.com, namrata.limaye@...el.com,
        khalidm@...dia.com, tom@...anda.io, pratyush@...anda.io,
        xiyou.wangcong@...il.com, davem@...emloft.net, edumazet@...gle.com,
        pabeni@...hat.com, vladbu@...dia.com, simon.horman@...igine.com,
        stefanc@...vell.com, seong.kim@....com, mattyk@...dia.com,
        dan.daly@...el.com, john.andy.fingerhut@...el.com
Subject: Re: [PATCH net-next RFC 00/20] Introducing P4TC

On 30/01/2023 14:06, Jamal Hadi Salim wrote:
> So what are we trying to achieve with P4TC? John, I could have done a
> better job in describing the goals in the cover letter:
> We are going for MAT sw equivalence to what is in hardware. A two-fer
> that is already provided by the existing TC infrastructure.
...
> This hammer already meets our goals.

I'd like to give a perspective from the AMD/Xilinx/Solarflare SmartNIC
 project.  Though I must stress I'm not speaking for that organisation,
 and I wasn't the one writing the P4 code; these are just my personal
 observations based on the view I had from within the project team.
We used P4 in the SN1022's datapath, but encountered a number of
 limitations that prevented a wholly P4-based implementation, in spite
 of the hardware being MAT/CAM flavoured.  Overall I would say that P4
 was not a great fit for the problem space; it was usually possible to
 get it to do what we wanted but only by bending it in unnatural ways.
 (The advantage was, of course, the strong toolchain for compiling it
 into optimised logic on the FPGA; writing the whole thing by hand in
 RTL would have taken far more effort.)
Developing a worthwhile P4-based datapath proved to be something of an
 engineer-time sink; compilation and verification weren't quick, and
 just because your P4 works in a software model doesn't necessarily
 mean it will perform well in hardware.
Thus P4 is, in my personal opinion, a poor choice for end-user/runtime
 behaviour specification, at least for FPGA-flavoured devices.  It
 works okay for a multi-month product development project, and is just
 about viable for implementing something like a pipeline plugin, but
 treating it as a fully flexible software-defined datapath is not
 something that will fly.

> I would argue further that in
> the near future a lot of the stuff including transport will eventually
> have to partially or fully move to hardware (see the HOMA keynote for
> a sample space[0]).

I think HOMA is very interesting and I agree hardware doing something
 like it will eventually be needed.  But as you admit, P4TC doesn't
 address that — unsurprising, since the kind of dynamic imperative
 behaviour involved is totally outside P4's wheelhouse.  So maybe I'm
 missing your point here but I don't see why you bring it up.

Ultimately I think trying to expose the underlying hardware as a P4
 platform is the wrong abstraction layer to provide to userspace.
It's trying too hard to avoid protocol ossification, by requiring the
 entire pipeline to be user-definable at a bit level, but in the real
 world if someone wants to deploy a new low-level protocol they'll be
 better off upgrading their kernel and drivers to offload the new
 protocol-specific *feature* onto protocol-agnostic *hardware* than
 trying to develop and validate a P4 pipeline.
It is only protocol ossification in *hardware* that is a problem for
 this kind of thing (not to be confused with the ossification problem
 on a network where you can't use a new protocol because a middlebox
 somewhere in the path barfs on it); protocol-specific SW APIs are
 only a problem if they result in vendors designing ossified hardware
 (to implement exactly those APIs and nothing else), which hopefully
 we've all learned not to do by now.

On 30/01/2023 03:09, Singhai, Anjali wrote:
> There is also argument that is being made about using ebpf for
> implementing the SW path, may be I am missing the part as to how do
> you offload if not to another general purpose core even if it is not
> as evolved as the current day Xeon's.

I have to be a little circumspect here as I don't know how much we've
 made public, but there are good prospects for FPGA offloads of eBPF
 with high performance.  The instructions can be transformed into a
 pipeline of logic blocks which look nothing like a Von Neumann
 architecture, so can get much better perf/area and perf/power than an
 array of general-purpose cores.
My personal belief (which I don't, alas, have hard data to back up) is
 that this approach will also outperform the 'array of specialised
 packet-processor cores' that many NPU/DPU products are using.

In the situations where you do need a custom datapath (which often
 involve the kind of dynamic behaviour that's not P4-friendly), eBPF
 is, I would say, far superior to P4 as an IR.

-ed