Message-ID: <ZeY7TqCGFR3h36k-@google.com>
Date: Mon, 4 Mar 2024 13:23:17 -0800
From: Stanislav Fomichev <sdf@...gle.com>
To: Jamal Hadi Salim <jhs@...atatu.com>
Cc: Tom Herbert <tom@...anda.io>, Jakub Kicinski <kuba@...nel.org>, 
	John Fastabend <john.fastabend@...il.com>, anjali.singhai@...el.com, 
	Paolo Abeni <pabeni@...hat.com>, 
	Linux Kernel Network Developers <netdev@...r.kernel.org>, deb.chatterjee@...el.com, namrata.limaye@...el.com, 
	Marcelo Ricardo Leitner <mleitner@...hat.com>, Mahesh.Shirshyad@....com, Vipin.Jain@....com, 
	tomasz.osinski@...el.com, Jiri Pirko <jiri@...nulli.us>, 
	Cong Wang <xiyou.wangcong@...il.com>, davem@...emloft.net, 
	Eric Dumazet <edumazet@...gle.com>, Vlad Buslov <vladbu@...dia.com>, Simon Horman <horms@...nel.org>, 
	Khalid Manaa <khalidm@...dia.com>, 
	"Toke Høiland-Jørgensen" <toke@...hat.com>, Daniel Borkmann <daniel@...earbox.net>, 
	Victor Nogueira <victor@...atatu.com>, pctammela@...atatu.com, dan.daly@...el.com, 
	Andy Fingerhut <andy.fingerhut@...il.com>, chris.sommers@...sight.com, 
	Matty Kadosh <mattyk@...dia.com>, bpf <bpf@...r.kernel.org>
Subject: Re: Hardware Offload discussion WAS(Re: [PATCH net-next v12 00/15]
 Introducing P4TC (series 1)

On 03/03, Jamal Hadi Salim wrote:
> On Sun, Mar 3, 2024 at 1:11 PM Tom Herbert <tom@...anda.io> wrote:
> >
> > On Sun, Mar 3, 2024 at 9:00 AM Jamal Hadi Salim <jhs@...atatu.com> wrote:
> > >
> > > On Sat, Mar 2, 2024 at 10:27 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > > >
> > > > On Sat, 2 Mar 2024 09:36:53 -0500 Jamal Hadi Salim wrote:
> > > > > 2) Your point on:  "integrate later", or at least "fill in the gaps"
> > > > > This part i am probably going to mumble on. I am going to consider
> > > > > more than just doing ACLs/MAT via flower/u32 for the sake of
> > > > > discussion.
> > > > > True, "fill the gaps" has been our model so far. It requires kernel
> > > > > changes, user space code changes etc justifiably so because most of
> > > > > the time such datapaths are subject to standardization via IETF, IEEE,
> > > > > etc and new extensions come in on a regular basis.  And sometimes we
> > > > > do add features that one or two users or a single vendor has need for
> > > > > at the cost of kernel and user/control extension. Given our work
> > > > > process, any features added this way take a long time to make it to
> > > > > the end user.
> > > >
> > > > What I had in mind was more of a DDP model. The device loads its binary
> > > > blob FW in whatever way it does, then it tells the kernel its parser
> > > > graph, and tables. The kernel exposes those tables to user space.
> > > > All dynamic, no need to change the kernel for each new protocol.
> > > >
> > > > But that's different in two ways:
> > > >  1. the device tells kernel the tables, no "dynamic reprogramming"
> > > >  2. you don't need the SW side, the only use of the API is to interact
> > > >     with the device
> > > >
> > > > User can still do BPF kfuncs to look up in the tables (like in FIB),
> > > > but call them from cls_bpf.
> > > >
> > >
> > > This is not far off from what is envisioned today in the discussions.
> > > The main issue is who loads the binary? We went from devlink to the
> > > filter doing the loading. DDP is ethtool. We still need to tie a PCI
> > > device/tc block to the "program" so we can do skip_sw and it works.
> > > Meaning a device that is capable of handling multiple programs can
> > > have multiple blobs loaded. A "program" is mapped to a tc filter and
> > > MAT control works the same way as it does today (netlink/tc ndo).
> > >
> > > A program in P4 has a name, ID and people have been suggesting a sha1
> > > identity (or a signature of some kind should be generated by the
> > > compiler). So the upward propagation could be tied to discovering
> > > this 3-tuple from the driver. Then the control plane targets a
> > > program by that tuple via netlink (as we do currently).
> > >
> > > I do note, using the DDP sample space, currently whatever gets loaded
> > > is "trusted" and really you need to have human knowledge of what the
> > > NIC's parsing + MAT is to send the control. With P4 that is all
> > > visible/programmable by the end user (i am not a proponent of vendors
> > > "shipping" things or calling them for support) - so should be
> > > sufficient to just discover what is in the binary and send the correct
> > > control messages down.
> > >
> > > > I think in P4 terms that may be something more akin to only providing
> > > > the runtime API? I seem to recall they had some distinction...
> > >
> > > There are several solutions out there (ex: TDI, P4runtime) - our API
> > > is netlink and those could be written on top of netlink, there's no
> > > controversy there.
> > > So the starting point is defining the datapath using P4, generating
> > > the binary blob and whatever constraints needed using the vendor
> > > backend and for s/w equivalent generating the eBPF datapath.
> > >
> > > > > At the cost of this sounding controversial, i am going
> > > > > to call things like fdb, fib, etc which have fixed datapaths in the
> > > > > kernel "legacy". These "legacy" datapaths almost all the time have
> > > >
> > > > The cynic in me sometimes thinks that the biggest problem with "legacy"
> > > > protocols is that it's hard to make money on them :)
> > >
> > > That's a big motivation without a doubt, but also there are people
> > > that want to experiment with things. One of the craziest examples we
> > > have is someone who created a P4 program for "in network calculator",
> > > essentially a calculator in the datapath. You send it two operands and
> > > an operator using custom headers, it does the math and responds with a
> > > result in a new header. By itself this program is a toy but it
> > > demonstrates that if one wanted to, they could have something custom
> > > in hardware and/or kernel datapath.
> >
> > Jamal,
> >
> > Given how long P4 has been around it's surprising that the best
> > publicly available code example is "the network calculator" toy.
> 
> Come on Tom ;-> That was just an example of something "crazy" to
> demonstrate freedom. I can run that in any of the P4 friendly NICs
> today. You are probably being facetious - There are some serious
> publicly available projects out there, some of which I quote on the
> cover letter (like DASH).

Shameless plug. I have an even crazier example with bpf:

https://github.com/fomichev/xdp-btc-miner

A good way to ensure all those smartnic cycles are not wasted :-D
I wish we had more nics with xdp bpf offloads :-(
